13 Replies Latest reply: Feb 28, 2011 10:43 AM by jont717

    How to load balance in .PAC file

    jont717

      We have two MWGs running 7.0.2.2.

       

      In our current transparent WCCP setup, the Cisco ASA load balances automatically.

       

      How do we load balance in the .PAC file for direct proxy?  I know there is a way.  Anyone doing this? 

        • 1. How to load balance in .PAC file
          sroering

          This example shows how to do load balancing with fail-over by prioritizing the proxy based on the client IP address.

           

           

           

           

          function FindProxyForURL(url,host)

          {   

              if (isInNet(myIpAddress(), "10.1.0.0", "255.255.0.0"))   

                  return "PROXY ww-2.example.com:8080; " + "PROXY ww-1.example.com:8080";   

              if (isInNet(myIpAddress(), "10.2.0.0", "255.255.0.0"))   

                  return "PROXY ww-2.example.com:8080; " + "PROXY ww-1.example.com:8080";   

              if (isInNet(myIpAddress(), "10.3.0.0", "255.255.0.0"))   

                  return "PROXY ww-1.example.com:8080; " + "PROXY ww-2.example.com:8080";   

              if (isInNet(myIpAddress(), "10.4.0.0", "255.255.0.0"))   

                  return "PROXY ww-1.example.com:8080; " + "PROXY ww-2.example.com:8080";   

              else   

                  return "DIRECT";

          }

          • 2. How to load balance in .PAC file

            One thing to note when writing proxy.pac files is that you should avoid calling functions or doing DNS lookups more than once. Always store the result in a variable and reuse that variable. The browser never caches the results of functions or DNS lookups (fortunately or unfortunately, depending on circumstances). I have seen PAC files that took 30 seconds to determine a result (each time).

             

            So, in Shawn's example (a very common scenario)

            you would call

            var myip = myIpAddress();

            and then reference myip when calling isInNet:

            if (isInNet(myip, ...)) ...
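To make that concrete, here is a rough sketch of Shawn's subnet example reworked so the client address is looked up only once. The stand-in versions of myIpAddress() and isInNet() (and the test address 10.3.4.5 and the lookups counter) are my own assumptions so the sketch can run outside a browser; a real PAC file would delete them and rely on the browser's built-ins.

```javascript
// Stand-ins for the PAC built-ins (assumptions, for running outside a browser).
// Remove these in a real PAC file; the browser supplies its own versions.
var lookups = 0;
function myIpAddress() { lookups++; return "10.3.4.5"; } // assumed test address
function toInt(ip) {
  var p = ip.split(".");
  return ((+p[0] << 24) | (+p[1] << 16) | (+p[2] << 8) | +p[3]) >>> 0;
}
function isInNet(ip, net, mask) {
  var m = toInt(mask);
  return ((toInt(ip) & m) >>> 0) === ((toInt(net) & m) >>> 0);
}

function FindProxyForURL(url, host) {
  // Look the address up once and reuse it for every subnet test.
  var myip = myIpAddress();
  var ww1First = "PROXY ww-1.example.com:8080; PROXY ww-2.example.com:8080";
  var ww2First = "PROXY ww-2.example.com:8080; PROXY ww-1.example.com:8080";

  if (isInNet(myip, "10.1.0.0", "255.255.0.0") ||
      isInNet(myip, "10.2.0.0", "255.255.0.0")) {
    return ww2First;
  }
  if (isInNet(myip, "10.3.0.0", "255.255.0.0") ||
      isInNet(myip, "10.4.0.0", "255.255.0.0")) {
    return ww1First;
  }
  return "DIRECT";
}
```

The subnet-to-proxy mapping is the same as in the original example; only the number of myIpAddress() calls changes.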

             

            While Shawn's example is the most common, a few other methods are also used.

             

            This will load-balance/fail over without having to know the subnets in advance, by splitting clients on the last octet of their IP address:

            // Inside FindProxyForURL(url, host):

            // Find the 4th octet

            var myip = myIpAddress();

            var ipbits = myip.split(".");

            var myseg = parseInt(ipbits[3], 10);

            var proxy;

            // Check whether the 4th octet is even or odd

            if (myseg % 2 === 0) {

                 // Even

                 proxy = "PROXY 172.18.0.160:9090; PROXY 172.18.0.159:9090; DIRECT";

            }

            else {

                 // Odd

                 proxy = "PROXY 172.18.0.159:9090; PROXY 172.18.0.160:9090; DIRECT";

            }

            return proxy;
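The even/odd split is really just "last octet modulo 2", so the same idea can be stretched to more than two proxies. The following is only a sketch under that assumption: the proxy host names are hypothetical placeholders, and proxyListFor is a helper name made up for illustration.

```javascript
// Hypothetical proxy list; replace with your own host:port values.
var PROXIES = [
  "proxy-a.example.com:9090",
  "proxy-b.example.com:9090",
  "proxy-c.example.com:9090"
];

// Build an ordered fail-over list that starts at (last octet mod N)
// and then wraps around through the remaining proxies.
function proxyListFor(clientIp, proxies) {
  var octet = parseInt(clientIp.split(".")[3], 10);
  if (isNaN(octet)) {
    octet = 0; // unexpected format (e.g. IPv6): start at the first proxy
  }
  var start = octet % proxies.length;
  var parts = [];
  for (var i = 0; i < proxies.length; i++) {
    parts.push("PROXY " + proxies[(start + i) % proxies.length]);
  }
  parts.push("DIRECT"); // last resort, as in the even/odd example
  return parts.join("; ");
}

function FindProxyForURL(url, host) {
  // myIpAddress() is supplied by the browser's PAC environment.
  return proxyListFor(myIpAddress(), PROXIES);
}
```

A client keeps the same first-choice proxy for as long as its address stays the same, and every proxy in the list still appears as a fail-over target.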

             

            The other common option is to simply define a virtual name in DNS and configure the DNS server to round robin the responses.

            • 3. How to load balance in .PAC file
              jont717

              @cnewman

               

              Would it be easier to just use:

               

              // Return a randomly selected proxy list

               

                    if(Math.random() < 0.5)

                    {

                       return "PROXY proxy1:1080; PROXY proxy2:1080";

                    }

                    else

                    {

                       return "PROXY proxy2:1080; PROXY proxy1:1080";

                    }

               

              Math.random() returns a number from 0 up to, but not including, 1.

              • 4. How to load balance in .PAC file
                eelsasser

                ...or...

                Use DNS round robin: give one name multiple A records, one for each server.

                So you will have two DNS entries for "proxy", one for each gateway:

                 

                return "PROXY proxy:8080; PROXY proxy1:8080; PROXY proxy2:8080";

                • 5. How to load balance in .PAC file

                  @jont717

                  Ha! I enjoyed that, and esoteric discussions of JavaScript random functions aside, I agree that it would work under specific circumstances.

                   

                  However, how would you know which proxy the user went through from request to request? Keep in mind that on every request (if you disable proxy result caching in IE, as you should, or use FF/Chrome/Safari/Opera/etc.), or at minimum once per domain, the browser will walk through the PAC file logic and possibly pick a new proxy. That could get confusing or problematic quickly.

                  1. You will have extra authentications as the user switches back and forth between the proxies on successive requests. I would even expect popups, as the NTLM session token supplied by the browser could be invalid for that particular proxy (although it's probably supposed to renegotiate on a new connection).
                  2. You will have no idea which proxy to look at for logs/traces.

                   

                  In general, whether you are using WCCP, proxy.pac load balancing or better yet a physical load balancer, if you are doing authentication or troubleshooting, you want the clients to be relatively sticky.
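To illustrate the stickiness point, here is a small sketch that counts how many different proxies one client ends up on over repeated requests, once with a random pick and once with a pick derived from the client IP. The proxy names are the ones from the random example above; pickRandom, pickByIp and distinctProxies are helper names invented for this illustration.

```javascript
var PROXIES = ["proxy1:1080", "proxy2:1080"];

// Random pick: each evaluation may land on a different proxy.
// The random source is passed in so the behaviour can be tested.
function pickRandom(rand) {
  return rand() < 0.5 ? PROXIES[0] : PROXIES[1];
}

// IP-based pick: the same client address always gives the same proxy.
function pickByIp(clientIp) {
  var octet = parseInt(clientIp.split(".")[3], 10) || 0;
  return PROXIES[octet % PROXIES.length];
}

// Count how many distinct proxies are chosen over a number of requests.
function distinctProxies(pick, requests) {
  var seen = {};
  for (var i = 0; i < requests; i++) {
    seen[pick()] = true;
  }
  return Object.keys(seen).length;
}

// Sticky: always exactly one proxy for this client.
var sticky = distinctProxies(function () { return pickByIp("10.1.2.7"); }, 50);

// Random: over 50 requests this will almost certainly touch both proxies.
var random = distinctProxies(function () { return pickRandom(Math.random); }, 50);
```

With the IP-based pick there is only one proxy's log to check for a given client, which is the whole argument for keeping clients sticky.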

                   

                  Still, cool, I didn't think of that

                   

                  @E-Squared

                  I did mention DNS RR (last), and it works fine, but can take time to fail over/through. Each failing proxy takes time to get through and I'm an impatient person. But having both the virtual name and the actuals is a very good idea as it gives automatic failover.

                   

                  Also, every time I've seen DNS round robin, we ended up putting the proxy name in the block page just to figure out which proxy you're using when troubleshooting.

                   

                  --CN

                  • 6. How to load balance in .PAC file
                    jont717

                    I am testing the .pac file with the Math.random function.  Seems to work great.  I see myself hitting both proxies very evenly.

                     

                    I understand your concerns, but this is how our WCCP traffic is load balanced anyway.  It hits both proxies at the same time, sending some traffic to one and some to the other for the same user.  I have no control over what gets sent here or there.  You are right, it is a pain with the log files, but I got used to checking both logs.  And when they get sent to our Web Reporter they are combined and act as one.

                     

                    Our proxies share the exact same configuration.  They act as one device because they are connected in a central management configuration. 

                     

                    Thanks for your help and useful tips.   Here is my simple .pac file.  Trying to keep it as easy as possible.

                     

                     

                    function FindProxyForURL(aFullURL, aHostname)

                       {

                          // Plain host names (no dots) are local to the client's domain: go direct

                          if(isPlainHostName(aHostname))

                          {

                             return "DIRECT";

                          } 

                          // Check for hosts in the same IP sub-net

                          if(isInNet(aHostname, "172.16.0.0", "255.255.0.0"))

                          {

                             return "DIRECT";

                          }

                          // Return a randomly selected proxy list

                          if(Math.random() < 0.5)

                          {

                             return "PROXY ed-proxy1:9090; PROXY ed-proxy2:9090";

                          }

                          else

                          {

                             return "PROXY ed-proxy2:9090; PROXY ed-proxy1:9090";

                          }

                       }

                    • 7. How to load balance in .PAC file

                      Hmm, well, that's not best practice in authenticated environments. Normally MFE advises that you distribute load with WCCP based on just Source IP (not [Source IP + Source port] or Destination IP) if you are doing authentication. The only reason I would ever use anything other than Source IP is in cases where multiple users are coming from one IP (Citrix, Terminal Services, NATs, etc.).

                       

                      It's great that it works, and it is certainly up to you, jont717, but you are more than doubling your authentications (due to NTLM session identifiers), and if you ever open a ticket with support regarding connection or authentication issues you will probably need to ensure that clients are sticky.

                       

                      I can tell you for a fact that we have seen issues with authentication when the clients were not sticky. The last time I saw it was with a physical load balancer but the same theory applies to WCCP and proxy.pac equally.

                       

                      Regards,

                       

                      --CN

                      • 8. How to load balance in .PAC file
                        jont717

                        You have piqued my interest.  I did check our WCCP load balancing options and it is using Source IP + Destination IP. 

                         

                        What is the difference if I just put it on Source IP only?

                        • 9. How to load balance in .PAC file

                          Not all equipment supports all modes, but here goes.

                           

                          Every time a new request comes through, the router looks in its hash table to determine which cache to send the traffic to. At the moment, for you, that determination is made from both Source IP and Destination IP, which is almost the worst of all worlds: you can't guarantee which cache a specific user will use, and you can't guarantee which cache all users will use for a specific destination.

                           

                          User A + Site A may go to cache 1

                          User A + Site B may go to cache 2

                           

                          User B + Site A may go to cache 2

                          User B + Site B may go to cache 1 

                           

                          Not ideal. It's not quite the worst option as at least you can keep track of the User/Destination pairs if you wanted to, but still a pain.

                           

                          If you weren't authenticating and you were attempting to save bandwidth (not generally a concern in this day and age), you would probably want to hash on Destination IP only. That way, regardless of the user, requests for a site would go to the same cache, which may have already cached that content. This used to be useful but is of limited use with web 2.0 stuff/dynamic sites, and it certainly looks odd if you are doing auth or troubleshooting.

                           

                          Source IP + Source port is what you want to use if everyone is coming from the same IP address as each new connection would use some arbitrary high port. In this case it will be completely random as to which cache a user will end up using from request to request.

                           

                          Source IP will just split the incoming requests across the caches based on the client's IP. That way, assuming clients don't change addresses and the cache pool is static, each client always uses the same proxy.

                          It's a good thing for authentication, it's a good thing for troubleshooting and you should consider it.
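As a rough illustration of the difference between the modes, here is a toy model. To be clear, this is not the hash WCCP actually uses; toyHash (a sum of octets) and cacheFor are made-up names, and the addresses are documentation examples. It only shows which fields feed the cache decision in each mode.

```javascript
// Toy hash: add up every octet of the given addresses.
// NOT the real WCCP hash; for illustration only.
function toyHash(addresses) {
  var sum = 0;
  for (var i = 0; i < addresses.length; i++) {
    var octets = addresses[i].split(".");
    for (var j = 0; j < octets.length; j++) {
      sum += parseInt(octets[j], 10);
    }
  }
  return sum;
}

// Pick a cache index from the fields selected by the assignment mode.
function cacheFor(srcIp, dstIp, mode, cacheCount) {
  var key;
  if (mode === "src") {
    key = [srcIp];            // Source IP only
  } else if (mode === "dst") {
    key = [dstIp];            // Destination IP only
  } else {
    key = [srcIp, dstIp];     // Source IP + Destination IP
  }
  return toyHash(key) % cacheCount;
}

var userA = "10.0.0.1";
var siteA = "192.0.2.10";
var siteB = "192.0.2.11";

// Source IP only: User A lands on the same cache for both sites.
var srcOnly = [cacheFor(userA, siteA, "src", 2), cacheFor(userA, siteB, "src", 2)];

// Source + Destination IP: User A is split across caches by site.
var srcDst = [cacheFor(userA, siteA, "src+dst", 2), cacheFor(userA, siteB, "src+dst", 2)];
```

In this toy model the Source IP mode keeps User A on one cache regardless of destination, while Source IP + Destination IP sends the two sites to different caches, which matches the User A/User B table above.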

                           

                          On a side note, you should be able to switch it with almost no interruption of service basically by changing the options for WCCP on the MWGs. However, if you can, I would probably disable WCCP on all MWGs, change the settings and then rejoin with the router(s).
