4 Replies Latest reply: Nov 12, 2013 3:55 PM by flitcraft33

    cluster heartbeat ip cannot communicate

    mmendy

      Hi,

       

      I'm having a problem with a cluster configuration. Before the firewalls joined the cluster, both could communicate through the heartbeat interface, but once both firewalls joined the cluster, the heartbeat addresses could no longer communicate.

      The configuration and debug output are below.

       

      What is missing in my configuration?

       

      ##########TSP1-CLUSTER CONFIGURATION##########
      tsp1:Admn {25} % cf cluster q
      cluster create type=peer_to_peer hb_zone=HeartBeat backup_hb_zone='' \
          password=812a841442 firewall_id=3 multicast_group=239.255.0.1 \
          default_l2_mode=multicast ping_wait=1 autoreset=on
      cluster add name=tsp2.abcd.com address=192.168.168.2 \
          failover_time=13 password=abcd1234
      cluster add name=tsp1.abcd.com address=192.168.168.1 \
          failover_time=13 password=abcd1234
      tsp1:Admn {26} %

      ##########TSP2-CLUSTER CONFIGURATION##########
      tsp2:Admn {103} % cf cluster q
      cluster create type=peer_to_peer hb_zone=HeartBeat backup_hb_zone='' \
          password=812a841442 firewall_id=3 multicast_group=239.255.0.1 \
          default_l2_mode=multicast ping_wait=1 autoreset=on
      cluster add name=tsp2.abcd.com address=192.168.168.2 \
          failover_time=13 password=abcd1234
      cluster add name=tsp1.abcd.com address=192.168.168.1 \
          failover_time=13 password=abcd1234
      tsp2:Admn {104} %

      ##########TSP1-CLUSTER STATUS##########
      tsp1:Admn {26} % cf cluster status

                              HA Cluster Status Information
                              =============================

      Primary Host:        tsp1.abcd.com
      Primary IP Address:  192.168.168.1
      Cluster Zone:        HeartBeat
      Cluster Cert:        Default_Enterprise_Certificate
      Cluster CA:          Default_Enterprise_CA

      Member Name          State         IP Address
      -------------------- ------------- ---------------
      tsp2.abcd.com        registered    192.168.168.2
      tsp1.abcd.com        registered    192.168.168.1


                            Policy and Peer Connection Status
                            =================================

      tsp2.abcd.com (peer)
      -------------------------------------
          Connection State  :  Not Connected
          Last Dispatch     :  2013-10-10 11:56:11.026035
          Policy Version    :  533-1381394605.92-1381395194
          FW Version        :  8.3.1
          Status            :  Lost Connection

      tsp1.abcd.com (primary)
      -------------------------------------
          Connection State  :  Localhost
          Policy Version    :  533-1381394605.92-1381395360
          FW Version        :  8.3.1
          Status            :  Up to date - Current

      ##########TSP2-CLUSTER STATUS##########
      tsp2:Admn {104} % cf cluster status

                              HA Cluster Status Information
                              =============================

      Primary Host:        tsp2.abcd.com
      Primary IP Address:  192.168.168.2
      Cluster Zone:        HeartBeat
      Cluster Cert:        Default_Enterprise_Certificate
      Cluster CA:          Default_Enterprise_CA

      Member Name          State         IP Address
      -------------------- ------------- ---------------
      tsp2.abcd.com        registered    192.168.168.2
      tsp1.abcd.com        registered    192.168.168.1


                            Policy and Peer Connection Status
                            =================================

      tsp2.abcd.com (primary)
      -------------------------------------
          Connection State  :  Localhost
          Policy Version    :  533-1381394605.92-1381395564
          FW Version        :  8.3.1
          Status            :  Up to date - Current

      tsp1.abcd.com (peer)
      -------------------------------------
          Connection State  :  Not Connected
          Last Dispatch     :  2013-10-10 11:56:11.535830
          Policy Version    :  533-1381394605.92-1381395360
          FW Version        :  8.3.1
          Status            :  Lost Connection
      tsp2:Admn {105} %

      ##########PING FROM TSP1 TCPDUMP BY TSP2##########
      tsp1:Admn {27} % ping 192.168.168.2
      PING 192.168.168.2 (192.168.168.2): 56 data bytes
      ^C
      --- 192.168.168.2 ping statistics ---
      12 packets transmitted, 0 packets received, 100.0% packet loss
      tsp1:Admn {28} %

      tsp2:Admn {107} % tcpdump -npi 2-7
      tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
      listening on 2-7, link-type EN10MB (Ethernet), capture size 96 bytes
      12:16:35.842511 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x4bd):  failover 44
      12:16:36.179516 IP 192.168.168.3.45147 > 192.168.168.1.9004: Flags [S], seq 1143917768, win 32768, options [mss 1460,nop,wscale 3,sackOK,TS val 756176654 ecr 0], length 0
      12:16:36.310477 IP 192.168.168.3.36739 > 192.168.168.2.9004: Flags [S], seq 32952147, win 32768, options [mss 1460,nop,wscale 3,sackOK,TS val 1266566 ecr 0], length 0
      12:16:36.457487 IP 192.168.168.3 > 192.168.168.2: ICMP echo request, id 52745, seq 3, length 64
      12:16:36.842511 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x4be):  failover 44
      12:16:37.273477 IP 192.168.168.3.48740 > 192.168.168.2.9004: Flags [S], seq 3946403056, win 32768, options [mss 1460,sackOK,eol], length 0
      12:16:37.458489 IP 192.168.168.3 > 192.168.168.2: ICMP echo request, id 52745, seq 4, length 64
      12:16:37.842512 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x4bf):  failover 44
      12:16:38.459489 IP 192.168.168.3 > 192.168.168.2: ICMP echo request, id 52745, seq 5, length 64
      12:16:38.842515 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x4c0):  failover 44
      12:16:39.379490 IP 192.168.168.3.45147 > 192.168.168.1.9004: Flags [S], seq 1143917768, win 32768, options [mss 1460,nop,wscale 3,sackOK,TS val 756179854 ecr 0], length 0
      12:16:39.460492 IP 192.168.168.3 > 192.168.168.2: ICMP echo request, id 52745, seq 6, length 64
      12:16:39.510485 IP 192.168.168.3.36739 > 192.168.168.2.9004: Flags [S], seq 32952147, win 32768, options [mss 1460,sackOK,eol], length 0
      ^C
      13 packets captured
      13 packets received by filter
      0 packets dropped by kernel
      tsp2:Admn {108} %

      ##########PING FROM TSP2 TCPDUMP BY TSP1##########
      tsp2:Admn {108} % ping 192.168.168.1
      PING 192.168.168.1 (192.168.168.1): 56 data bytes
      ^C
      --- 192.168.168.1 ping statistics ---
      16 packets transmitted, 0 packets received, 100.0% packet loss
      tsp2:Admn {109} %

      tsp1:Admn {29} % tcpdump -npi 2-7
      tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
      listening on 2-7, link-type EN10MB (Ethernet), capture size 96 bytes
      12:17:58.335814 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x510):  failover 44
      12:17:58.722067 IP 192.168.168.3 > 192.168.168.1: ICMP echo request, id 39991, seq 10, length 64
      12:17:59.335826 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x511):  failover 44
      12:17:59.723060 IP 192.168.168.3 > 192.168.168.1: ICMP echo request, id 39991, seq 11, length 64
      12:18:00.275046 IP 192.168.168.3.29783 > 192.168.168.1.9004: Flags [S], seq 646101425, win 32768, options [mss 1460,sackOK,eol], length 0
      12:18:00.335830 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x512):  failover 44
      12:18:00.724063 IP 192.168.168.3 > 192.168.168.1: ICMP echo request, id 39991, seq 12, length 64
      12:18:01.335862 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x513):  failover 44
      12:18:01.725014 IP 192.168.168.3 > 192.168.168.1: ICMP echo request, id 39991, seq 13, length 64
      ^C
      9 packets captured
      9 packets received by filter
      0 packets dropped by kernel
      tsp1:Admn {30} %

       

      Thanks

        • 1. Re: cluster heartbeat ip cannot communicate
          PhilM

          If you are unable to ping the HA (HeartBeat) zone IP address from one appliance to the other, have you enabled the "Respond to ICMP echo and timestamp" setting in the Network -> Zone Configuration screen for each firewall?

           

          If you have, and they are still not responding to each other, then the problem could be a physical one. I have always been told that the HA connection should be point-to-point and not via a switch.

           

          Next, I would make sure that the interface in question (which, based on your tcpdump output, appears to be 2-7) is actually reporting an active connection. Run "ifconfig 2-7" to see what the reported media speed is (making sure it is the same on both appliances) and that the status is "active".

           

          With the HA configuration being driven by a wizard these days, as long as you have satisfied the basic requirements and you are not dealing with a faulty interface on one unit or the other, it should just work.

           

          The only other hurdle is the zone names themselves. They must be identical on both appliances and must be created in the same order. If you run the "region" command, you will see the zones listed along with their numeric identifiers. If HeartBeat is zone 4 on tsp1 but zone 3 on tsp2, then the HA cluster process will fail.
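
          As a rough illustration of the zone-order check, here is a minimal Python sketch that compares zone name to numeric identifier mappings from the two appliances. The dictionaries are hypothetical values typed in by hand after reading the "region" output on each box; the sketch does not parse that output, and the zone names other than HeartBeat are invented for the example.

```python
def zone_index_mismatches(zones_a, zones_b):
    """Return {zone_name: (index_a, index_b)} for every zone whose
    numeric identifier differs, or that exists on only one appliance."""
    mismatches = {}
    for name in sorted(set(zones_a) | set(zones_b)):
        idx_a = zones_a.get(name)  # None if the zone is missing on A
        idx_b = zones_b.get(name)  # None if the zone is missing on B
        if idx_a != idx_b:
            mismatches[name] = (idx_a, idx_b)
    return mismatches


# Hypothetical values, transcribed by hand from each appliance.
tsp1_zones = {"internal": 1, "external": 2, "dmz": 3, "HeartBeat": 4}
tsp2_zones = {"internal": 1, "external": 2, "HeartBeat": 3, "dmz": 4}

for name, (a, b) in zone_index_mismatches(tsp1_zones, tsp2_zones).items():
    print(f"{name}: tsp1={a} tsp2={b}")
```

          An empty result would suggest the zones were created with the same names and in the same order on both appliances; any entry it prints is a zone worth checking.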

           

          -Phil.

          • 2. Re: cluster heartbeat ip cannot communicate
            sliedl

            The first tcpdump switches back and forth between two different destination addresses:

            12:16:36.179516 IP 192.168.168.3.45147 > 192.168.168.1.9004: Flags [S], seq 1143917768, win 32768, options [mss 1460,nop,wscale 3,sackOK,TS val 756176654 ecr 0], length 0

            12:16:36.310477 IP 192.168.168.3.36739 > 192.168.168.2.9004: Flags [S], seq 32952147, win 32768, options [mss 1460,nop,wscale 3,sackOK,TS val 1266566 ecr 0], length 0

             

            You must have something configured incorrectly somewhere.
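
            To see at a glance which source and destination addresses appear in a capture like this, a small Python sketch can tally the flows. The sample lines are abbreviated from the capture in the original post; the regular expression is an assumption that only covers the "IP src > dst:" form seen there, not tcpdump output in general.

```python
import re
from collections import Counter

# Matches the "IP src > dst:" part of a tcpdump -n line. An optional
# trailing ".port" on either address is matched but not captured.
FLOW_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)(?:\.\d+)? > (\d+\.\d+\.\d+\.\d+)(?:\.\d+)?:"
)


def tally_flows(lines):
    """Count (source, destination) address pairs in tcpdump text lines."""
    flows = Counter()
    for line in lines:
        match = FLOW_RE.search(line)
        if match:
            flows[(match.group(1), match.group(2))] += 1
    return flows


# Abbreviated lines from the first capture in the thread.
sample = [
    "12:16:35.842511 IP 192.168.168.3 > 239.255.0.1: AH(spi=0x03010000,seq=0x4bd):  failover 44",
    "12:16:36.179516 IP 192.168.168.3.45147 > 192.168.168.1.9004: Flags [S]",
    "12:16:36.310477 IP 192.168.168.3.36739 > 192.168.168.2.9004: Flags [S]",
    "12:16:36.457487 IP 192.168.168.3 > 192.168.168.2: ICMP echo request, id 52745, seq 3, length 64",
]

flows = tally_flows(sample)
for (src, dst), count in sorted(flows.items()):
    print(f"{src} -> {dst}: {count}")
print("sources seen:", sorted({src for src, _ in flows}))
```

            For these sample lines, every packet has the same source address, 192.168.168.3, while the destinations alternate between 192.168.168.1, 192.168.168.2 and the multicast group.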

            • 3. Re: cluster heartbeat ip cannot communicate
              mmendy

              I recreated the configuration from scratch, formatting and reinstalling FW1 and FW2.

              The cluster forms, but it behaves strangely:

              if FW1 is shut down, FW2 does not become active and the cluster IP is not pingable.

              • 4. Re: cluster heartbeat ip cannot communicate
                flitcraft33

                Make sure that, if you are connecting the heartbeat burb interfaces, you have cabled the correct ports together, and that you are using a crossover Ethernet cable, not a regular one, if they are directly connected without a switch in between. Also check the speed and duplex settings on the ports. Some of these boxes have an unusual arrangement of ports. If you are unsure, connect them to your switch one by one and test the state of the port in the heartbeat burb. You can find which ports are the heartbeat burb ports that way. Also check the subnet masks on each port and make sure they are the same.
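
                The subnet mask check can be sketched with Python's standard ipaddress module. The heartbeat addresses are the ones from the original post, but the thread never shows the masks, so the /24 and /25 prefixes below are assumptions for illustration only.

```python
import ipaddress


def same_subnet(addr_a, addr_b):
    """Return True if both interface addresses (given as 'ip/prefix' or
    'ip/netmask') sit in the same network with the same mask."""
    a = ipaddress.ip_interface(addr_a)
    b = ipaddress.ip_interface(addr_b)
    return a.network == b.network


# Masks are assumed; substitute the values configured on each port.
print(same_subnet("192.168.168.1/24", "192.168.168.2/24"))
print(same_subnet("192.168.168.1/24", "192.168.168.2/25"))
print(same_subnet("192.168.168.1/255.255.255.0", "192.168.168.2/24"))
```

                The first and third calls compare matching masks and return True; the second returns False because the masks differ, even though both addresses fall in the same range.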