I've been trying to find recommendations for using a load balancer in front of MWGs (not using the internal MWG LB mechanism).
Particularly around these settings:
Any feedback appreciated.
Unfortunately I am not very good with load balancers, so I probably miss some points, but from my experience the most important factor is persistence. For some MWG features (Progress Pages, Quotas, Coaching, PD Storage) it is important that a client keeps talking to the same MWG node. PD Storage (which also holds the quota information) is not synchronized live in a cluster but only every few minutes, so if a client hops between MWGs, Node A may accept a quota session that Node B does not (yet) know about.
Something similar happens with progress pages. If you download an ISO, only the MWG where you started the download knows about it. Progress pages do not use a persistent connection between client and MWG (as Data Trickling does); instead, every 3 seconds the progress page in the browser contacts MWG and asks for an update.
If your client is talking to a virtual IP and the poll reaches a different MWG, that node does not know about the download, so it will cancel it.
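That failure mode can be sketched in a few lines of Python. This is not MWG's actual implementation, just the idea: each node keeps its download state locally, so a progress poll that lands on the wrong node finds nothing and cancels.

```python
import itertools

# Illustrative sketch only: node names, state, and methods are made up.
class MwgNode:
    def __init__(self, name):
        self.name = name
        self.downloads = {}          # download_id -> percent complete

    def start_download(self, download_id):
        self.downloads[download_id] = 0

    def poll_progress(self, download_id):
        if download_id not in self.downloads:
            return "cancel"          # this node has never heard of the download
        self.downloads[download_id] = min(100, self.downloads[download_id] + 10)
        return f"{self.downloads[download_id]}%"

node_a, node_b = MwgNode("A"), MwgNode("B")
node_a.start_download("iso-42")

# Round robin on a VIP without persistence: every other poll hits Node B,
# which knows nothing about the download and cancels it.
vip = itertools.cycle([node_a, node_b])
results = [next(vip).poll_progress("iso-42") for _ in range(4)]
print(results)   # ['10%', 'cancel', '20%', 'cancel']
```

With source-IP persistence, every poll would reach Node A and the progress page would simply count up.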
From my point of view Round Robin is fine, but persistence should be configured. If the load balancer is smart enough, it can probably collect some performance metrics and ensure all traffic is balanced equally across all MWG nodes; otherwise "stupid" round robin should also give acceptable results.
Persistence is important, as mentioned. Source IP is OK. Some load balancers can even insert a cookie into the traffic so that persistence also works behind a NAT or on terminal servers, but generally Source IP should be fine. If the load balancer has the capability, it could add an X-Forwarded-For header to the HTTP requests and the HTTPS CONNECT request, so MWG could log the original client IP. As for "if a client is not seen for X minutes" (I assume this is the persistence timeout): by default in our HA implementation we use 5 minutes, I think, which should be an acceptable value.
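What source-IP persistence and XFF injection boil down to can be sketched roughly like this (backend names and function signatures are illustrative, not any real load-balancer configuration): hashing the client IP always selects the same backend, and the balancer can tag each request with the original client address.

```python
import hashlib

# Hypothetical backend pool; in the thread's example these would be the
# "real" MWG addresses behind the virtual IP.
BACKENDS = ["mwg-1", "mwg-2", "mwg-3", "mwg-4"]

def pick_backend(client_ip: str) -> str:
    """Source-IP persistence: the same client IP always maps to the same node."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return BACKENDS[digest[0] % len(BACKENDS)]

def forward(client_ip: str, headers: dict) -> dict:
    """Add X-Forwarded-For so the proxy can log the original client IP."""
    headers = dict(headers)
    headers["X-Forwarded-For"] = client_ip
    return headers

# The same client always lands on the same node:
assert pick_backend("10.0.0.7") == pick_backend("10.0.0.7")
print(forward("10.0.0.7", {"Host": "example.com"}))
```

A real balancer adds a persistence timeout on top of this (the "not seen for X minutes" setting), after which the client may be re-balanced to a different node.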
Regarding potential failure of boxes, I would recommend using a fixed number of boxes for production traffic (let's say 4 MWG appliances). I would always use the same 4 machines and keep one or two as spares, so when one machine of the pool dies, it gets replaced by a spare and only the users of the failed machine switch to the new one. This way you do not impact the other users at all, since they keep talking to their previous machine, and the load stays the same.
I hope this gives a start for considerations 🙂
Thanks for the feedback. I also opened a service call to get this information and was told to basically follow the LB vendor's best practices for web proxies. This is what we are going to do; we'll see how it goes.
One more question around this: We have 4 MWGs in transparent router mode (we are currently in the process of migrating users to use the MWG as an explicit proxy). I noticed that the only way for them to transparently route traffic was to configure them in director mode (one with higher priority).
Once we have completely migrated all our users to use the MWGs as an explicit proxy, should I leave all nodes as directors, or should I set them to scanning mode to get rid of the load-balancing layer on the MWG? (Load balancing is done in front of the MWGs, so we don't need the MWGs to do it.)
Actually it is quite hard to get a good recommendation from the support folks, or even from people like me. Most of us deal only with MWG and MWG-related issues; planning and deployment are usually done by other people at McAfee, especially the SE or PS guys who visit customers and plan/perform the integration. For the rest of us (including myself) the load balancer is no more than a black box which can do some magic to feed our product with requests 😉 Anyway, over the years you learn about the features and products that exist and how they can integrate with MWG.
However, it seems you have a plan, which is good. And you decided to move on to explicit proxy... good idea 🙂
In transparent router mode you need the director to provide the virtual IP address. So if you have 4 nodes the setup may look like this:
Node-A, director with an additional VIP of 192.168.0.200... the routing path usually points to this virtual IP address to provide the transparent router functionality.
If I understood correctly, you are now more or less migrating away from transparent router to explicit proxy with a load balancer. So initially I would set up the load balancer to point only to 192.168.0.100-103, i.e. the "real" IP addresses of your MWGs. The load balancer will most likely introduce a new virtual IP address which is then used by the end users, maybe 192.168.0.250.
Until all users are migrated, I would simply leave the transparent router up and running, as long as that is possible within your network. If you are going to change the routing, you can ignore this.
Once the users point to the load balancer and no one uses the transparent environment any longer, I would switch all 4 MWGs from "Transparent Router" to "Proxy" in the Configuration -> Proxies tab. This eliminates the whole director node, scanning node, virtual IP, routing, etc. setup, which is obsolete when you use an explicit proxy.
You CAN do both, e.g. use MWG as an explicit proxy and leave the transparent environment up and running, BUT I would only leave both in place as long as needed. For troubleshooting it is much easier if you use only one deployment rather than having both up and working. If someone reports a problem, you will start wondering which path the request took into MWG: is it a problem with the routing, or the load balancer, etc.? Keep it simple 🙂
Hi, here in my company we use BIG-IP (F5), with cookie-based load balancing and fallback to source-IP persistence (a feature of this load balancer).
It's working very well. Internet throughput is better than with source-IP persistence alone, and even better than without a proxy... really...
We don't use the progress page or quota features, and all proxies are NATed to the same IP address.
To improve performance, we use Kerberos for authentication with fallback to NTLM (Kerberos uses a ticket to authenticate and is faster and more secure than NTLM).
Version used is 220.127.116.11.
We have 20,000 users in this environment and it is working fine.
Be aware that load balancers can be configured to actually alter the HTTP headers, which can result in (very) hard-to-find problems. For example, the F5 BIG-IP strips the contents of the "Accept-Encoding" header by default. F5s have several other options that will inject, delete, or otherwise modify header information.
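One way to make this kind of silent rewriting visible is to capture the headers on both sides of the load balancer and diff them. A minimal sketch (the header values below are invented for illustration, not from a real capture):

```python
def header_diff(sent: dict, received: dict) -> dict:
    """Return headers that were removed or altered in transit,
    mapped to (value_sent, value_received)."""
    changed = {}
    for name, value in sent.items():
        if received.get(name) != value:
            changed[name] = (value, received.get(name))
    return changed

# What the client sent vs. what arrived behind the load balancer:
client_side = {"Host": "example.com", "Accept-Encoding": "gzip, deflate"}
behind_lb   = {"Host": "example.com", "Accept-Encoding": ""}  # stripped in transit

print(header_diff(client_side, behind_lb))
# {'Accept-Encoding': ('gzip, deflate', '')}
```

In practice the two sides would come from packet traces taken in front of and behind the balancer, which is essentially what support did below.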
Not a big deal, usually. What you'll see is a web site that doesn't quite work right, while MWG isn't blocking anything or giving any indication of a problem. Support tracked it down for me after examining both client-side and server-side packet traces, rule set traces, and feedbacks.
The F5 is injecting the client's IP into the HTTP X-Forwarded-For (XFF) header... well, most of the time. While resolving an issue, we found that MWG Connection Tracing keys on the client's source IP from the IP header and doesn't look at XFF. We're unable to filter a single IP address for connection tracing, because all of the connections appear to come from the F5. We're having a really tough time figuring out how to configure the F5 to pass the true source IP to the MWG.
There are some connections where XFF isn't injected and the connection has no client IP at all. We can see this in both the access log and packet traces. Coincidentally, there are issues with the sites in these connections.