Feb 21, 2010 7:31 AM
I have a huge bandwidth problem on my network which seems to be caused by ePO 4.5. We used ePO 3.6 and did a fresh installation (no upgrade) of 4.5. We also ran into several problems upgrading the agents on the machines from 3.6 to 4.5 and VSE from 8.0 to 8.7i (P2). First they wouldn't upgrade, and now we have many McTray errors. While the support desk said it was an unknown problem, there is now KB67118 (only solution 3 is applicable...nice). But as if that wasn't enough, for some reason on the 5th of February around 10:00 AM the bandwidth between two locations became saturated with traffic between the clients on one side and the ePO 4.5 server on the other side. Before I go on...here's how our ePO environment is set up.
I have one ePO 4.5 server (Master Repository) at Location A and a Distributed Repository at Location B (both in Holland). The Global Update function is enabled, and a task is scheduled once a day to replicate the Master Repository (incrementally) to the Distributed Repository. The Distributed Repository resides on a UNC path on an XP machine, which also functions as a SuperAgent and has a Rogue Detection Sensor for that subnet. There is a dedicated 1 Mbit up/down (Business) DSL line between the two locations, and every location has its own subnet. Location A also hosts the mail servers and a few databases which are also used by clients at Location B. Location A has dedicated lines to other offices in Europe and Canada, and some remote administration of Location B is done from other locations. The DSL line has an average utilization of 65%. Because the company works 24/7, the load is divided almost equally between day and night, and it never reached 100% for a long period of time.
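To put rough numbers on that line, here is a back-of-envelope sketch in Python. The 1 Mbit/s capacity and the 65% average load are the figures given above; the 10 MB update size and the client counts are purely hypothetical values chosen for illustration, not measurements from this network.

```python
# Back-of-envelope link arithmetic. The 1 Mbit/s line and 65% average load come
# from the post; the update size and client count below are made-up examples.

def headroom_kbit(link_kbit: float, used_percent: float) -> float:
    """Capacity left on the link, in kbit/s."""
    return link_kbit * (100 - used_percent) / 100


def transfer_seconds(size_mb: float, rate_kbit: float, clients: int = 1) -> float:
    """Seconds to move `size_mb` megabytes to each of `clients` machines
    at `rate_kbit` kbit/s in total (1 MB taken as 8000 kbit)."""
    return size_mb * 8000 * clients / rate_kbit


free = headroom_kbit(1000, 65)            # 350.0 kbit/s left on average
one_client = transfer_seconds(10, free)   # hypothetical 10 MB update
fifty = transfer_seconds(10, free, clients=50)

print(f"headroom: {free:.0f} kbit/s")
print(f"1 client: {one_client / 60:.1f} min, 50 clients: {fifty / 3600:.1f} h")
```

Under these assumed numbers a single client pulling an update across the line is hardly noticeable, while a few dozen clients doing the same would occupy all spare capacity for hours. That would be consistent with a line pinned at 100%, but it is only an estimate.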
So in December 2009 we upgraded from 3.6/8.0 to 4.5/8.7i. We recreated the old settings in ePO 4.5 (policies, tasks, etc.). ePO does a repository pull every hour. The McAfee Agent 4.5 policy tells agents to get updates from the repository with the lowest ping time, with the Master Repository and HTTP as fallback. The Distributed Repository is online, updated, accessible and has the lowest ping time at Location B. Then on the 5th of February we got complaints from users at Location B that e-mail was extremely slow and that some applications which use databases at Location A had stopped working. Our ICT desk was closed, and because this wasn't a business-critical issue, they decided to wait until Monday (8th of February) for on-site support. On Monday we looked at the log files and saw that, for some reason, many clients were contacting the ePO server. When we shut down the ePO server, the traffic stabilized at 45%. When we turned it back on it seemed fine at first, but after 5 to 10 minutes the line was at 100% again. We suspected that clients were getting updates from the Master Repository instead of the Distributed Repository. So we decided to change the default McAfee Agent rule and set the policy for each site. Agents at Location B would now get updates only from the Distributed Repository and use HTTP as fallback (Master Repository disabled), and vice versa for Location A.
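The repository policy described above (lowest ping time first, then a fallback) can be modelled with a small Python sketch. This is only an illustration of the policy as worded in this post, not McAfee's actual selection code; the `Repo` class, the `pick_repository` function, the repository names and the ping values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Repo:
    name: str
    enabled: bool
    ping_ms: Optional[float]  # None means the repository did not answer


def pick_repository(repos: List[Repo], fallback: Optional[Repo] = None) -> Optional[Repo]:
    """Return the enabled, reachable repository with the lowest ping time.
    If none qualifies, return the fallback (when it is enabled), else None."""
    candidates = [r for r in repos if r.enabled and r.ping_ms is not None]
    if candidates:
        return min(candidates, key=lambda r: r.ping_ms)
    if fallback is not None and fallback.enabled:
        return fallback
    return None


# Hypothetical view from a client at Location B.
distributed = Repo("distributed-B", enabled=True, ping_ms=2.0)
master = Repo("master-A", enabled=True, ping_ms=40.0)
http = Repo("http-fallback", enabled=True, ping_ms=None)

print(pick_repository([distributed, master], http).name)

# If the distributed repository stops answering, the same policy sends the
# client to the master repository, i.e. across the DSL line.
distributed.ping_ms = None
print(pick_repository([distributed, master], http).name)
```

In this model, a Location B client only crosses the DSL line when the local repository fails to answer, which is the scenario we suspected. With the Master Repository disabled, as in the per-site policy, the same situation would fall through to the HTTP fallback instead.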
Unfortunately this didn't help much. After 10 minutes the line was saturated again. After one week, McAfee support escalated the case to Tier 2. The Tier 2 guy took over the server, looked at the settings and couldn't find anything wrong with the tasks or policies. After a while he came back and said that our setup is wrong as of 4.5. We should no longer use Distributed Repositories on the network the way we described, but should use Agent Handlers instead. This is the new option if you have several locations with a low-bandwidth connection between them. I'm not against new features, but when I asked him why all of a sudden Distributed Repositories are not the way to go, and why it has functioned for several years (and even at least a full month in the new setup), I did not get a real answer. When I asked him if he had ever installed an Agent Handler, he admitted he had never done it... Now this feels more like McAfee doesn't have an answer for my problem and is letting me try something to see if it might help instead of offering a real solution, because I still do not know why this worked fine before.
Now from what I read on this forum, the Agent Handler functions like a trimmed-down ePO server with no repository, only a cache that is filled after an Agent request. The Agent Handler also needs a constant connection to the SQL server at the other location. When an Agent at the Agent Handler's location requests something, the handler downloads it and caches it. To make this work we need a 2003/2008 server, and we have to remove the Distributed Repository and SuperAgent from that subnet. So now I wonder how many resources this takes up on a server, because we only have one File and Print server at Location B, which also functions as DC. I also wonder what the clients do when they move from Location B to Location A. Do the Agents automatically know that there is an Agent Handler in their subnet? I would really like to know if this is the solution or if I also need to make the Agent Handler a distributed repository. And of course the big question: will this solve the problem?
If this is really the new way to go to prevent problems, the other offices in Canada and Europe also need to change their strategy, because they are all configured like ours. So this has a huge impact. I feel like I'm not getting the required answers from Tier 2, so maybe someone in the community knows how to do this or has experience with Agent Handlers vs Distributed Repositories.
Please advise...thank you in advance for your efforts!