High Availability

Hardware failure is the main cause of outages in Internet security systems. The ability of a system to continue providing services after a failure is called failover. Sophos UTM on AWS provides high availability (HA) failover, allowing you to set up a hot standby system that takes over if the primary system fails (active-passive). Alternatively, you can use Sophos UTM on AWS to set up a cluster, which distributes dedicated network traffic to a collection of nodes (active-active), similar to conventional load-balancing approaches, to optimize resource utilization and decrease computing time.

Note – If your Sophos UTM on AWS runs on Amazon Web Services (AWS), see Management > Conversion Utility.

The concepts of high availability and cluster, as implemented in Sophos UTM on AWS, are closely related: a high availability system can be considered a two-node cluster, which is the minimum requirement to provide redundancy.

Each node within the cluster can assume one of the following roles:

All nodes monitor one another by means of a so-called heartbeat signal, a periodically sent multicast UDP packet used to check whether the other nodes are still alive. If a node fails to send this packet due to a technical error, it is declared dead. Depending on the role the failed node had assumed, the configuration of the setup changes as follows:

Note – HA settings are part of the hardware configuration and cannot be saved in a backup. This also means that a backup restore does not overwrite HA settings.
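
The heartbeat monitoring described above can be illustrated with a minimal Python sketch. It models only the bookkeeping (when was each node last heard from, and which nodes have been silent too long), not the multicast UDP transport itself. The class name, node names, and the three-second timeout are hypothetical values chosen for illustration; they are not taken from Sophos UTM.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each peer node (illustrative only)."""

    # Seconds without a heartbeat before a node is declared dead (assumed value).
    timeout: float
    last_seen: dict[str, float] = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        """Note that a heartbeat packet from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def dead_nodes(self, now: float) -> list[str]:
        """Return the nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.record("node1", now=0.0)
monitor.record("node2", now=0.0)
monitor.record("node1", now=2.0)    # node1 keeps sending; node2 goes silent
print(monitor.dead_nodes(now=4.0))  # ['node2']
```

In a real deployment the reaction to a dead node depends on the role that node held; the sketch stops at detection.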

Reporting

All reporting data is consolidated on the master node and is synchronized to the other cluster nodes every five minutes. In case of a takeover, you will therefore lose no more than five minutes of reporting data. However, data is collected differently depending on the type of report. The graphs displayed on the Logging & Reporting > Hardware tabs represent only the data of the node that is currently master. In contrast, accounting information such as that shown on the Logging & Reporting > Network Usage page represents data collected by all nodes involved. For example, today's CPU usage histogram shows the current processor utilization of the master node. After a takeover, it would instead show the data of the former slave node, which is now master. Information about top accounting services, however, is a collection of data from all nodes that were involved in the distributed processing of traffic that has passed the unit.
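
The difference between the two kinds of reporting data can be sketched in Python as follows. This is an illustration of the distinction only, not the product's implementation; the function names, node names, and sample figures are all hypothetical.

```python
from collections import Counter


def hardware_graph(samples_by_node, master):
    """Hardware graphs show only the samples of the current master node."""
    return samples_by_node[master]


def accounting_totals(bytes_by_node):
    """Accounting data is summed over every node that processed traffic."""
    total = Counter()
    for per_service in bytes_by_node.values():
        total.update(per_service)  # adds the per-service byte counts
    return dict(total)


# Hypothetical CPU utilization samples (percent) per node.
cpu = {"node1": [12, 15, 11], "node2": [40, 42, 39]}
# Hypothetical per-service traffic (bytes) handled by each node.
traffic = {
    "node1": {"HTTP": 100, "DNS": 50},
    "node2": {"HTTP": 200},
}

print(hardware_graph(cpu, master="node1"))  # [12, 15, 11]
print(hardware_graph(cpu, master="node2"))  # after a takeover: [40, 42, 39]
print(accounting_totals(traffic))           # {'HTTP': 300, 'DNS': 50}
```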

