4 Replies Latest reply: Jul 15, 2013 4:10 PM by cnewman

    PAC Client-IP Load Balancing, myipaddress() inconsistencies, IPv6 and more

    phlrnnr

      As time has allowed over the past few months I've been thinking about ways to more evenly spread the load across our proxies using our PAC file.  For now, I'd like to stay away from using a hardware load-balancer to do this, but want to focus this discussion on ways to do this using a standard proxy.pac file.

       

      Historically, we've tried to balance traffic based on the source IP of a machine using myIpAddress() and isInNet() function calls, sending traffic from certain subnets to specific proxies.  However, we've had inconsistent results.  myIpAddress() has historically had problems returning the proper IP address that the browser is actually using - sometimes returning the IP of the wrong adapter (if multiple adapters are installed) or returning the IPv6 address of an adapter.  I've even seen the localhost IP returned.  myIpAddress() also seems to be implemented differently across browsers, as different browsers on the same machine sometimes return different results.  See these links for some examples:

       

       

      I've tried playing with FindProxyForURLEx(url, host) and its companion functions as well - these are supposed to return ALL IP addresses, including IPv6, for the user to parse.  But I had limited success; interoperability across browsers didn't seem to be there.
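
A rough sketch of parsing such a list, for anyone who wants to experiment. It assumes the list arrives as a semicolon-delimited string, which is how I understand Microsoft's myIpAddressEx() extension to behave; verify that against your own browsers. The helper name pickAddress is made up for illustration.

```javascript
// Assumption: myIpAddressEx() returns something like "fe80::1c2f%11;10.1.2.57".
// pickAddress() is a hypothetical helper, not part of any PAC API.
// It returns the first plain IPv4 entry, or the first entry of any kind
// if no IPv4 address is present, or "" if the list is empty.
function pickAddress(addressList) {
    var entries = String(addressList || "").split(";");
    var first = "";
    for (var i = 0; i < entries.length; i++) {
        var addr = entries[i].replace(/^\s+|\s+$/g, ""); // trim whitespace
        if (addr === "") { continue; }
        if (first === "") { first = addr; }
        if (/^\d{1,3}(\.\d{1,3}){3}$/.test(addr)) { return addr; }
    }
    return first;
}

// Inside FindProxyForURLEx(url, host) this would be used roughly as:
//     var myIP = pickAddress(myIpAddressEx());
```

Preferring IPv4 here is only a design choice that keeps the later subnet or modulus logic simple; it does nothing about the wrong-adapter problem.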

       

      All these things seemed to nullify our "old" way of associating a certain subnet to a specific proxy.  Depending on how many adapters one has (most people have at least 2 on a laptop - wired and wireless), and if IPv6 is enabled (whether it is being used or not), there is a chance you will end up hitting the "default" proxy (the proxy you would hit if the source IP doesn't match any of the subnets associated with a proxy) even if your IP should have matched one of the subnets.
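
To make the failure mode concrete, here is a minimal sketch of the subnet-to-proxy idea. The subnets and the proxyForAddress/inSubnet helpers are hypothetical; a real PAC file would call the built-in isInNet(myIpAddress(), net, mask) instead. The address is passed in as an argument so the effect of a "wrong" answer is easy to see.

```javascript
// Hypothetical subnet table; the networks and proxies are placeholders.
var SUBNET_PROXIES = [
    { net: "10.1.0.0", mask: "255.255.0.0", proxy: "PROXY x.x.x.1:8080" },
    { net: "10.2.0.0", mask: "255.255.0.0", proxy: "PROXY x.x.x.2:8080" }
];
var DEFAULT_PROXY = "PROXY x.x.x.4:8080";

// Convert a dotted quad to an unsigned 32-bit number, or null if the
// string is not a plain IPv4 address (for example an IPv6 address).
function ipv4ToNumber(ip) {
    var parts = String(ip).split(".");
    if (parts.length !== 4) { return null; }
    var value = 0;
    for (var i = 0; i < 4; i++) {
        if (!/^\d{1,3}$/.test(parts[i])) { return null; }
        var octet = parseInt(parts[i], 10);
        if (octet > 255) { return null; }
        value = value * 256 + octet;
    }
    return value;
}

// Stand-in for the PAC built-in isInNet(), applied to a known address.
function inSubnet(ip, net, mask) {
    var a = ipv4ToNumber(ip), n = ipv4ToNumber(net), m = ipv4ToNumber(mask);
    if (a === null || n === null || m === null) { return false; }
    return ((a & m) >>> 0) === ((n & m) >>> 0);
}

// Return the proxy for the first matching subnet, else the default.
function proxyForAddress(ip) {
    for (var i = 0; i < SUBNET_PROXIES.length; i++) {
        var entry = SUBNET_PROXIES[i];
        if (inSubnet(ip, entry.net, entry.mask)) { return entry.proxy; }
    }
    return DEFAULT_PROXY;
}
```

With this table, "10.1.44.7" maps to the first proxy, but the same laptop reporting "fe80::1c2f%11", "127.0.0.1" or "169.254.10.20" lands on the default proxy.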

       

      So, how to LB across multiple proxies, taking into account the inconsistencies with myIpAddress()?  I started my journey implementing something similar to what is found here: https://community.mcafee.com/message/187327#187327 and https://community.mcafee.com/message/177244#177244.  That worked great until the myIpAddress() function returned the IPv6 address of my adapter and I hit the default proxy instead of the one I was hoping to hit.  If all clients behaved similarly, we wouldn't have any load balancing, but one very tired proxy, and several very fast ones!  As that wouldn't be any good, I tweaked my code to handle either an IPv4 or an IPv6 address.  Here is what I ended up with:

       

      ___________________________________________________

      function FindProxyForURL(url, host) {
          var myIP = myIpAddress();

          function chooseProxy() {
              // Change this to determine which proxies are in service.
              // Only the first numProxiesInService will be used.
              var numProxiesInService = 4;
              var proxy0 = "PROXY x.x.x.1:8080; PROXY x.x.x.2:8080";
              var proxy1 = "PROXY x.x.x.2:8080; PROXY x.x.x.3:8080";
              var proxy2 = "PROXY x.x.x.3:8080; PROXY x.x.x.4:8080";
              var proxy3 = "PROXY x.x.x.4:8080; PROXY x.x.x.1:8080";
              var modValue;

              // Load balance across proxies by getting the remainder / mod (%)
              // to determine which proxy to send traffic to.
              if (myIP.indexOf(".") != -1) { // If an IPv4 address
                  var octets = myIP.split(".");
                  var fourthOctet = parseInt(octets[3], 10);
                  modValue = fourthOctet % numProxiesInService;
              } else if (myIP.indexOf(":") != -1) { // If an IPv6 address
                  var ip6parts = myIP.split("%"); // split off the zone index
                  var ip6addr = ip6parts[0];
                  var lastChar = parseInt(ip6addr.charAt(ip6addr.length - 1), 16);
                  modValue = lastChar % numProxiesInService;
              } else { // all other cases, return proxy0
                  return proxy0;
              }

              // Choose a proxy based on the modValue
              switch (modValue) {
                  case 0:
                      return proxy0;
                  case 1:
                      return proxy1;
                  case 2:
                      return proxy2;
                  case 3:
                      return proxy3;
                  default:
                      // modValue is NaN (e.g. an IPv6 address ending in ":")
                      // or otherwise unexpected, so fall back to proxy0
                      return proxy0;
              }
          }

          // <<All code to send appropriate internal stuff DIRECT>>

          // No other exceptions, so LB across the proxies
          return chooseProxy();
      }

      ___________________________________________________

       

      It seems to work fairly well, and in my limited testing seems to handle both IPv4 and IPv6 addresses.  I hope this will help other people who may have struggled with the same issues I've been running into.
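
For testing outside a browser, the selection rule can be pulled into a function that takes the address as an argument. This is only a sketch: proxyIndexFor is a made-up name, the sample addresses are invented, and falling back to index 0 when no number can be parsed is my assumption rather than something the PAC file above spells out.

```javascript
// Same rule as chooseProxy(): fourth octet for IPv4, last hex digit for IPv6.
// proxyIndexFor() is a hypothetical helper for offline testing (e.g. in Node).
function proxyIndexFor(ip, numProxiesInService) {
    var value;
    if (ip.indexOf(".") != -1) {             // IPv4: use the fourth octet
        value = parseInt(ip.split(".")[3], 10);
    } else if (ip.indexOf(":") != -1) {      // IPv6: use the last hex digit
        var addr = ip.split("%")[0];         // drop the zone index
        value = parseInt(addr.charAt(addr.length - 1), 16);
    } else {
        return 0;                            // unrecognized: first proxy
    }
    if (isNaN(value)) { return 0; }          // assumption: fall back to first proxy
    return value % numProxiesInService;
}

// Invented sample addresses, including the awkward cases.
var samples = ["10.1.2.57", "192.168.0.8", "fe80::1c2f%11", "2001:db8::", "127.0.0.1"];
for (var i = 0; i < samples.length; i++) {
    console.log(samples[i] + " -> proxy" + proxyIndexFor(samples[i], 4));
}
```

With four proxies, 57 % 4 gives proxy1 and the trailing "f" (15) gives proxy3. Note that every client reporting 127.0.0.1 would pile onto proxy1, and that with a proxy count that does not divide 16 the IPv6 branch is slightly uneven.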

       

      There are a lot of smart people in this community.  Have you run into these issues before?  If so, what have you done to solve them?  What tricks do you have up your sleeve that could help address these issues?  Do you see any problems with my config that might cause me pain down the road?

        • 1. Re: PAC Client-IP Load Balancing, myipaddress() inconsistencies, IPv6 and more
          cnewman

          Hi Phlrnnr,

           

          I think you have already figured out the problems with using myIpAddress(). I do think that if you have 4 devices, a physical load balancer would be the way to go, but let's talk about a way to do what you do currently without using myIpAddress().

           

          There are two commonly used options:

           

          1) Use intelligent DNS (netmask assignment, F5 GTM) or equivalent, always return proxy.company.com, but it equals different things depending on who does the lookup.

           

          2) Script the PAC file returned on the web server side. Over the years I've seen admins use CGI, Perl, PHP and the MWG itself to dynamically return the contents of the PAC file. In the case of MWG you would create a PAC file with a variable like [var1] and then, when the file is requested through the MWG, you could replace the variable with client.ip or whatever other property you want. You can do the same on the web server using some sort of server-interpreted language.

          https://community.mcafee.com/docs/DOC-4916
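
As a rough illustration of option 2 on a plain web server (not the MWG rule-based approach in the linked document), here is a sketch for Node.js. The function names and the proxy list are assumptions made for the example; the point is only that the server, not the browser, decides which address to balance on.

```javascript
// Hypothetical proxy list; replace with real addresses.
var PROXIES = [
    "PROXY x.x.x.1:8080; PROXY x.x.x.2:8080",
    "PROXY x.x.x.2:8080; PROXY x.x.x.3:8080",
    "PROXY x.x.x.3:8080; PROXY x.x.x.4:8080",
    "PROXY x.x.x.4:8080; PROXY x.x.x.1:8080"
];

// Node reports IPv4 clients on a dual-stack socket as "::ffff:a.b.c.d".
function normalizeClientIp(remoteAddress) {
    return String(remoteAddress || "").replace(/^::ffff:/i, "");
}

// Pick a proxy index from the address the server saw (fourth octet mod N).
// Anything that is not a plain IPv4 address gets index 0 in this sketch.
function indexForClient(clientIp) {
    var match = /^\d{1,3}\.\d{1,3}\.\d{1,3}\.(\d{1,3})$/.exec(clientIp);
    return match ? parseInt(match[1], 10) % PROXIES.length : 0;
}

// Build the PAC text for one client; the choice is baked in server-side.
function renderPac(clientIp) {
    var proxy = PROXIES[indexForClient(clientIp)];
    return "function FindProxyForURL(url, host) {\n" +
           "    return " + JSON.stringify(proxy) + ";\n" +
           "}\n";
}

// Request handler, usable as require("http").createServer(handlePacRequest).
function handlePacRequest(req, res) {
    var clientIp = normalizeClientIp(req.socket.remoteAddress);
    res.writeHead(200, { "Content-Type": "application/x-ns-proxy-autoconfig" });
    res.end(renderPac(clientIp));
}
```

The internal DIRECT exceptions would still go into the generated FindProxyForURL body. If clients reach the web server through NAT or another proxy, the address the server sees is that device's, so the split would need a different input.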

           

          --Christopher

          • 2. Re: PAC Client-IP Load Balancing, myipaddress() inconsistencies, IPv6 and more
            Jon Scholten

            I'll take a small crack at this. I'm not that well versed with PAC files in the field, there are much more experienced people than I with them.

             

            All of the scenarios you listed require the client to figure out its own IP.

             

            Why not have the server that is hosting the PAC file write the requesting client's IP into the PAC file that it serves? In other words, a dynamic PAC file.

             

            Just a thought, perhaps it hasn't been considered.

             

            Best,

            Jon

             

            Message was edited by: jscholte NEWMAN!!! on 7/11/13 3:49:28 PM CDT
            • 3. Re: PAC Client-IP Load Balancing, myipaddress() inconsistencies, IPv6 and more
              phlrnnr

              Jon and Christopher, thank you very much for your replies!  I've never thought about a dynamically generated PAC file where the server determines the client IP instead of making the client figure it out.  I'll have to look more into that and see if that could be feasible in our environment.  I could see that being helpful in making subnet based LB much more accurate and useful.

               

              With myIpAddress() being such an inconsistent function (you can never rely on it for an accurate answer), it can't be used to positively identify a user's IP.  However, it could still be used fairly accurately for load balancing within the PAC file.  Even if the IP it returns is wrong, with the exception of 127.0.0.1, it should be reliable enough for using mod to choose a proxy and balance things fairly.  It should also work well for keeping a client sticky to a particular proxy.  That being said, to avoid the IPv4 vs. IPv6 handling in the PAC file (I could see different clients returning IPv6 addresses differently - especially in how they represent a zone index if one is included, or if an address ended in :: or something similarly odd), do you think it would be possible to hash whatever is returned by myIpAddress() with MD5 or some other hashing function and then do a mod on that value instead?  If it were possible, do you think it would even be smart?
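
To show what I mean, here is a minimal sketch. PAC files have no built-in MD5, so this swaps in a simple djb2-style string hash; that substitution and the function names are my own assumptions, and the hash is not cryptographic (it doesn't need to be for picking a bucket).

```javascript
// Simple 32-bit string hash (djb2-style, XOR variant). This is NOT MD5.
function hashString(text) {
    var hash = 5381;
    for (var i = 0; i < text.length; i++) {
        // Multiply by 33, mix in the next character, keep it unsigned 32-bit.
        hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
    }
    return hash;
}

// Map whatever string myIpAddress() returned to a proxy index.
// Works the same for IPv4, IPv6, zone indexes, or anything else.
function proxyIndexForString(text, numProxiesInService) {
    return hashString(String(text)) % numProxiesInService;
}

// In the PAC file this would be used roughly as:
//     var index = proxyIndexForString(myIpAddress(), numProxiesInService);
```

The same input always gives the same index, so a client stays sticky for as long as its browser keeps reporting the same string.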

               

              I'm just trying to think outside the box.  What do the rest of you think?  Have any of you ever done something like this before?

              • 4. Re: PAC Client-IP Load Balancing, myipaddress() inconsistencies, IPv6 and more
                cnewman

                Well, I have seen people encode to try to deal with that, although in my mind it seems simpler to just use different logic for IPv4 vs. IPv6.

                I would caution that myIpAddress() can return 127.0.0.1 and 169.254.x.x addresses, so the split will not be pure. But it's up to you. The biggest problem with hashing/encoding and then doing a modulus on it is that it's not as easy to know which proxy a user is using. With odds and evens, you can get the IP and know where to go for logs.
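
A small sketch of both points, with hypothetical names and placeholder proxies: screen out loopback and link-local answers first, then split on odd/even so the proxy can be worked out from the IP by eye.

```javascript
// Placeholder proxies for the sketch.
var EVEN_PROXY = "PROXY x.x.x.1:8080; PROXY x.x.x.2:8080";
var ODD_PROXY = "PROXY x.x.x.2:8080; PROXY x.x.x.1:8080";
var FALLBACK_PROXY = EVEN_PROXY; // assumption: where unusable answers go

// True for addresses that say nothing useful about the client.
function isUnhelpfulAddress(ip) {
    var addr = String(ip).split("%")[0].toLowerCase(); // drop zone index
    return /^127\./.test(addr) ||            // IPv4 loopback
           /^169\.254\./.test(addr) ||       // IPv4 link-local
           addr === "::1" ||                 // IPv6 loopback
           /^fe[89ab][0-9a-f]:/.test(addr);  // IPv6 link-local (fe80::/10)
}

// Odd/even split on the last IPv4 octet; everything else goes to the fallback.
function proxyForOddEven(ip) {
    if (isUnhelpfulAddress(ip)) { return FALLBACK_PROXY; }
    var match = /\.(\d{1,3})$/.exec(String(ip));
    if (!match) { return FALLBACK_PROXY; }
    return parseInt(match[1], 10) % 2 === 0 ? EVEN_PROXY : ODD_PROXY;
}
```

Given a user's IP from a ticket, an even last octet means the even proxy's logs and an odd one the other, with no hash to recompute.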

                 

                --Christopher