The McAfee Email Gateway (MEG) appliance supports clustering.  A cluster consists of two or more appliances.  This article covers a few best practices for creating and managing MEG clusters.

 

Cluster Creation

 

When setting up a cluster of MEG appliances, first determine your performance needs.  If you will be handling a lot of mail, you will want more cluster members.  If you will be handling less mail but want the redundancy a cluster provides, two cluster members may be enough.

 

When configuring a cluster for the first time, choose a cluster ID other than the default.  If you leave the default ID in place, new devices may add themselves to the existing cluster even though you did not intend them to.  Once the cluster ID is set to something other than the default, changing it again requires reimaging the appliance.  We use VRRP for clustering, so note any other VRRP clusters on the network the MEG will join before you set the ID.
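
As a rough pre-deployment check, the advice above can be sketched in Python.  This is an illustration only, not a McAfee tool: the function name, the default ID value, and the set of IDs already in use are all hypothetical and would come from your own network inventory.  It also assumes that reusing an ID already taken by another VRRP cluster on the same network is something to avoid.

```python
def check_cluster_id(candidate, default_id, ids_in_use):
    """Return a list of problems with a proposed cluster ID (empty if none).

    candidate  -- the ID you plan to assign (hypothetical example value)
    default_id -- the appliance's factory default ID (look this up; assumed input)
    ids_in_use -- IDs of other VRRP clusters already on the same network
    """
    problems = []
    if candidate == default_id:
        problems.append("cluster ID is still the default")
    if candidate in ids_in_use:
        problems.append("cluster ID is already used by another VRRP cluster")
    return problems


# Example with made-up numbers: 10 stands in for the default, 20 is already taken.
print(check_cluster_id(30, default_id=10, ids_in_use={20}))
```

Because the ID cannot be changed later without a reimage, a check like this is best run before the first appliance is configured.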

 

Cluster members must be on the same local network in order to work.  Because we use VRRP, appliances that sit in different physical networks and are separated by a router will be unable to talk to each other.  If they are separated by a WAN link (even on the same VLAN), they may be unable to talk to each other within a reasonable time, which prevents the boxes from connecting properly.  We do not support clusters that incorporate a WAN link.
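
A quick sanity check on the planned addresses can catch the routed case early.  The sketch below uses Python's standard ipaddress module; the function name and the sample addresses are hypothetical.  Note that it only tests whether the addresses share one IP subnet.  It cannot detect a WAN link on the same VLAN, so passing this check is necessary but not sufficient.

```python
import ipaddress


def on_same_subnet(addresses, prefix_len):
    """Return True if every address falls in the same IP network.

    addresses  -- iterable of IP address strings for the planned cluster members
    prefix_len -- the subnet prefix length configured on the appliances
    """
    networks = {
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in addresses
    }
    return len(networks) == 1


# Made-up example addresses for three planned cluster members.
print(on_same_subnet(["192.168.1.10", "192.168.1.11", "192.168.1.12"], 24))
```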

 

When creating clusters of virtual machines, ensure that either the VMs have direct access to the network to which the host machine is attached, *OR* all the cluster members are on the same host.  If not, cluster members may be unable to talk to each other.

 

Clusters may have three types of devices in them:

1.  Cluster Master - This device is the main host in the cluster.  It acts as the primary traffic cop for inbound and outbound traffic, and handles all communications with the outside world.  It may or may not also host a scanning device. 

2.  Cluster Failover - This device is the backup host in the cluster.  Should the Master fail and go offline, the Failover appliance will take up the traffic cop duties until the Master comes back online.  If the Master hosts a scanner, this device will also host a scanner.

3.  Cluster Scanner - This is a standalone scanning device.  It receives its configuration, updates, and traffic to scan from the device currently handling all traffic for the cluster.

 

If a cluster has five or more appliances, the Master (and by extension, the Failover) should not be scanning traffic.  If a cluster has more than six devices, consider purchasing one of our MEG Blade servers instead.  If a cluster has three or fewer members, the Master and Failover devices should be scanners.  Clusters with exactly four members can go either way, as desired.
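
The sizing guidance above reduces to a small decision rule.  The following Python sketch encodes it; the function name and the shape of the returned dictionary are invented for illustration.

```python
def recommend_roles(member_count):
    """Apply the sizing guidance to a cluster of member_count appliances.

    Returns a dict with:
      master_and_failover_scan -- "yes", "no", or "either"
      consider_blade_server    -- True when the cluster has more than six devices
    """
    if member_count < 2:
        raise ValueError("a cluster needs at least two appliances")

    if member_count <= 3:
        scan = "yes"      # three or fewer: Master and Failover should scan
    elif member_count == 4:
        scan = "either"   # exactly four: can go either way
    else:
        scan = "no"       # five or more: Master and Failover should not scan

    return {
        "master_and_failover_scan": scan,
        "consider_blade_server": member_count > 6,
    }


for size in (2, 4, 5, 7):
    print(size, recommend_roles(size))
```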

 

Cluster Administration

 

DO NOT use the configuration push feature built into the MEG appliances to push configuration from the Master to other devices in the same cluster.  KB82172 has additional details about the results of doing so.  Additionally, if you use Configuration Push between clusters, push from the Master of one cluster to the Master of the other.  Never push configuration to any other device in the destination cluster.

 

When booting your cluster, make sure that the Failover appliance boots first, then the Master.  Any scanners may be brought up any time after the Failover has come up.  Failure to boot in this order may result in communication issues between the Master and Failover appliances.


When performing software updates, ALWAYS install the update on the Failover first.  After updating the Failover, allow it to come back online, then take down the Master.  Dedicated scanning devices may be updated any time after the Failover update commences.  Note that if mail flow must be maintained and your Master and Failover are not scanners, you need to update the Failover and at least one scanner, THEN update the Master and the rest of the scanners.
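
To make the ordering concrete, here is a small Python sketch that produces one acceptable update sequence.  The function name, its parameters, and the appliance names are hypothetical.  It returns only one valid order, since dedicated scanners may in fact be updated at any point after the Failover update has started.

```python
def update_order(master, failover, scanners, master_scans):
    """Return one update sequence that follows the guidance above.

    master, failover -- appliance names
    scanners         -- names of dedicated scanning devices
    master_scans     -- True if the Master and Failover also scan traffic
    """
    order = [failover]               # the Failover always goes first
    remaining = list(scanners)
    if not master_scans and remaining:
        # Master and Failover do not scan, so update one scanner before
        # the Master to keep mail flowing.
        order.append(remaining.pop(0))
    order.append(master)
    order.extend(remaining)          # the rest of the scanners
    return order


# Made-up appliance names.
print(update_order("meg-master", "meg-failover", ["scan1", "scan2"], master_scans=False))
```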

 

All cluster members must be running the same version of the software.  A device on a different version may receive traffic for scanning from the Master for a short time.  Once its configuration gets too far out of date (since the Master can no longer update it), that device will stop being used to scan traffic.  Note that if the Failover appliance is the one on a different version, this may result in mail flow problems if the Master becomes unavailable.
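
If you keep an inventory of appliance versions, spotting a mismatch is straightforward.  The Python sketch below assumes you have already collected the version strings yourself; it does not query the appliances, and the function name, appliance names, and version strings are placeholders.

```python
def find_version_mismatches(master_version, member_versions):
    """Return the members whose version differs from the Master's.

    master_version  -- version string reported by the Master
    member_versions -- dict mapping appliance name to its version string
    """
    return {
        name: version
        for name, version in member_versions.items()
        if version != master_version
    }


# Placeholder names and version strings, for illustration only.
inventory = {"meg-failover": "1.0.2", "scan1": "1.0.2", "scan2": "1.0.1"}
print(find_version_mismatches("1.0.2", inventory))
```

An empty result means every listed member matches the Master.  A mismatch on the Failover deserves the most attention, for the reason given above.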

 

Cluster Reporting

 

When a cluster is properly formed, all reporting data is passed to the Master appliance.  Should the Master fail, the Failover will not have the reporting data held on the Master, because that data is not replicated.  Additionally, when the Master comes back online, the Failover's data will not be passed back to the Master.  This is due to a limitation in the way the cluster setup is performed.

 

External Device Integration

 

When integrating clustered MEG appliances with ePO, only the Master should be connected.  The Master and Failover are the traffic cops for the cluster, providing logging data and accepting configuration changes.  Note, however, that because of the way ePO currently handles MEG data, connecting the Failover appliance to ePO will result in some dashboard data duplication on the ePO server.

 

When integrating with the MQM, make sure that the Master and Failover are using the default device ID.  Failure to do so will result in the Master's configuration being pushed to the Failover, and mail may not be quarantined properly (and thus may be unavailable for release).

 

For additional information, see the following KB articles, which cover some of the topics above.

https://kc.mcafee.com/corporate/index?page=content&id=KB76144&actp=null&viewlocale=en_US&showDraft=false&platinum_status=false&locale=en_US#Clustering

https://kc.mcafee.com/corporate/index?page=content&id=KB76204&actp=null&viewlocale=en_US&showDraft=false&platinum_status=false&locale=en_US