2 Replies Latest reply on Aug 7, 2015 9:22 AM by thyvarin

    McAfee NGFW Cluster Load Balance:

    vduartebr

      Hello all,

       

      One of our customers is having trouble enabling the McAfee NGFW Cluster Load Balance feature. As far as I can tell, it is caused by problems within their network, but to be sure that this is the real cause I would like to check with you whether anyone has seen this before.

       

      When both firewall nodes are online, their switches apparently cannot cope with the cluster MAC address exchange, and the whole network goes down because of it.

       

      According to the switch manufacturer, the old firmware that the customer was using had some problems with High Availability and routing protocols, which an update was supposed to solve. We updated the firmware, but nothing changed at all, even after resetting the switches' ARP table.

       

      So I ran some tests in my lab using the same switches that our customer is using, and I could confirm that even after the firmware update the switch shows the following in its logs:

       

      ############################ ARP_DUPLICATE_IPADDR_DETECT: Detected an IP address conflict. ############################

       

      The device with MAC address XXXXXXXXXX  connected to GigabitEthernet1/0/40 in VLAN XX and the device

      with MAC address YYYYYYYYYYY connected to GigabitEthernet1/0/52 in VLAN XX are using the same IP address.

       

      ############################-############################-############################-######################
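For anyone who wants to compare many of these messages, here is a minimal Python sketch that pulls the MAC address, port and VLAN of both devices out of one message. It assumes the wording shown above; the MAC addresses and the VLAN number in the sample are made-up placeholders, and the names `CLAUSE` and `parse_conflict` are only illustrative.

```python
import re

# One "MAC address <mac> connected to <port> in VLAN <id>" clause of the message.
CLAUSE = re.compile(
    r"MAC address\s+(?P<mac>\S+)\s+connected to\s+(?P<port>\S+)\s+in VLAN\s+(?P<vlan>\S+)"
)

def parse_conflict(message):
    """Return one dict (mac, port, vlan) per device named in the message."""
    flat = " ".join(message.split())  # the message may be wrapped across lines
    return [m.groupdict() for m in CLAUSE.finditer(flat)]

# Sample in the format quoted above; MACs and VLAN are hypothetical values.
sample = """The device with MAC address 0000-5e00-5301 connected to GigabitEthernet1/0/40 in VLAN 10 and the device
with MAC address 0000-5e00-5302 connected to GigabitEthernet1/0/52 in VLAN 10 are using the same IP address."""

for device in parse_conflict(sample):
    print(device["mac"], device["port"], device["vlan"])
```

Collecting these per message makes it easier to see whether the conflict always involves the same two ports.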

       

      With the default configuration method for the firewall heartbeat interface, exactly as described in the Configuration Guide, one of the firewall nodes automatically goes to Standby mode or loses connectivity when the policy is applied.

       

      So I changed the way the firewall cluster was configured.

       

      I set the primary heartbeat interface to Unicast MAC Address and the second interface to Packet Dispatcher. With this configuration I was able to apply the policy, and the two firewall nodes started to work in Load Balance mode with no problems.

       

      The same method seemed to work fine at our customer's site, but the next day the network showed signs of instability and had trouble with the MAC address exchange again. The behavior looked similar to STP blocking the path to node 02, and after that everything went down again.

       

      If you have ever seen something like this, please let me know.

       

      Thanks and Best Regards,

        • 1. Re: McAfee NGFW Cluster Load Balance:
          mhenttu

          Hi

           

          Heartbeat and state sync traffic are multicast, so no IP addresses change for that traffic. Also, the CVI mode affects how user traffic through the cluster is handled, not how the heartbeat works. The source of the heartbeat packets is the interface's NDI address, and the destination is a multicast IP and MAC address. So the problem might be related to how the switches handle multicast. If it is physically possible, an easy way to test this is to connect the NGFW heartbeat interfaces directly, bypassing the switches. This is also the recommended practice in two-node cluster setups.

           

          The error message that you pasted seems a bit odd. If the heartbeat (HB) had failed, all nodes in the cluster would change their physical MAC address to the CVI MAC address and claim to own the CVI IP. In that case the MAC addresses in the error message should be the same. In normal operation, clustering does not cause duplicate IPs.
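As a rough illustration of that check, the Python sketch below compares the two MAC addresses from the switch message after normalizing their notation, and also tests the group (multicast) bit of an address. The function names and the hint texts are hypothetical and only paraphrase the reasoning in this reply; they are not taken from any product documentation.

```python
import re

def normalize_mac(mac):
    """Drop separators (':', '-', '.') and lowercase, so notations compare equal."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits

def is_multicast_mac(mac):
    """True if the least significant bit of the first octet (the group bit) is set."""
    return int(normalize_mac(mac)[:2], 16) & 1 == 1

def conflict_hint(mac_a, mac_b):
    """Interpret a duplicate-IP report along the lines of this reply (an assumption)."""
    if normalize_mac(mac_a) == normalize_mac(mac_b):
        return "same MAC on both ports: consistent with both nodes claiming the CVI"
    return "different MACs: does not match the failed-heartbeat pattern"

# Hypothetical addresses, written in two different notations.
print(conflict_hint("00:00:5e:00:53:01", "0000-5e00-5301"))
print(is_multicast_mac("01:00:5e:00:00:01"))
```

Normalizing first matters because switches and firewalls often print the same address with different separators.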

          • 2. Re: McAfee NGFW Cluster Load Balance:
            thyvarin

            Hi,

             

            As additional information, here's a link to the KB article that describes the requirements for the HB/state sync network:

            McAfee KnowledgeBase - HeartBeat/Sync Network Requirements for Next Generation Firewall

             

            As for the duplicate IP detection, my guess would be that the dispatcher node changed, so a different node in the cluster took ownership of the CVI IP and MAC and started using them. The switch then saw this as duplicate IP use. My recommendation would be to disable duplicate IP detection on the ports connected to the NGFW cluster, as it is normal for the CVI IP to move to another node (e.g. during failover).

             

            BR,

            Tero