
    Global Update Replication across a busy T1

      Does anybody else have problems with a distributed repository taking forever to complete Global Update Replication (or failing after a couple of hours)?  I've got EPO 4.5 w/ 6 distributed repositories (super agent) at the ends of T1s, but one in particular seems to really have a hard time completing replication.  It's definitely tied to the saturation of the T1, because when it replicates on a weekend (nobody there using the T1) it completes just fine, but when people are here trying to do their jobs, I end up getting the call "your stupid McAfee is using up all the bandwidth".

       

      I haven't been able to figure out if this replication uses any sort of intelligent throttling, like BITS, but since it takes much longer when folks are present, I'm inclined to think it does.  I've killed the replication task a couple of times, at the request of one of our network folks, to see if that helps the T1 utilization.  But the problem I see with that is that then, each and every workstation at that remote site will individually seek out the latest DAT file from my main EPO server.  Even if I configure those PCs to go straight to McAfee, that site's internet traffic comes across the T1 to our gateway, so they aren't really saving anything.

       

      Any ideas?

       

      Thanks,

       

      Brett

        • 1. Re: Global Update Replication across a busy T1
          jstanley

           EPO operates at the application layer of the OSI model, so no throttling is applied. It will replicate to the distributed repository as fast as your network will allow. My guess is that your standard network traffic plus the replication is saturating your WAN.

           

           The solution to this problem will be to disable global updating, chain a repository replication task off your repository pull task, and schedule that task to occur after hours. It is important (due to the behavior you noted earlier) to always replicate immediately after you complete a repository pull. Otherwise your client machines will view the distributed repositories as out-of-date and update directly from the EPO server (thus negating the purpose of distributed repositories).
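           As a rough illustration of that ordering (this is not the ePO console or its API; the helper functions and repository names below are made up for the sketch), the after-hours sequence amounts to a pull followed immediately by replication:

               # Sketch only, assuming hypothetical helpers in place of the real
               # ePO server tasks: pull after hours, then replicate right away so
               # the distributed repositories never lag behind the master.

               def pull_master_repository():
                   print("Pulling DAT/engine content into the master repository")

               def replicate_to(repo_name):
                   print(f"Replicating the master repository to {repo_name}")

               def nightly_update(distributed_repos):
                   pull_master_repository()        # e.g. a pull task scheduled at 01:00
                   for repo in distributed_repos:  # chained replication runs immediately after
                       replicate_to(repo)

               nightly_update(["site1-superagent", "site2-superagent"])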

           

           Also, if you are not using any legacy products (such as VSE 8.0i or earlier), you can switch to our V2 DAT site, which will cut in half the amount of data you are replicating for a DAT update. Simply edit your source repository and add the number "2" to the end of the URL. You should also make sure your repository pull is set to only pull data that you use. For example, if you are not using SpamKiller then you have no need to pull SpamKiller DATs. If you're not using HIPS, then don't pull HIPS content.

           

          I hope that helps!

          • 2. Re: Global Update Replication across a busy T1

             Thanks for the quick response. I have an "Update Master Repository" task which runs @ 0100, and a "DAT pull" which checks for DAT updates every 2 hours. I suspect the latter is where I'm getting into trouble. However well intentioned, trying to stay on the very latest DAT file seems almost not worth the trouble. If I disable the "DAT pull" and just let the DAT get updated with the 0100 task, then could I not leave global updating enabled? The only time the master repository would then get updated is in the middle of the night, and replications complete fine at that time.

             

            I will definitely switch to the V2 DAT site for my source, and see if I can shave off some of the extraneous updates - you're right, we're not using spam killer, so why bother.

            • 3. Re: Global Update Replication across a busy T1
              jstanley

               You are correct: as long as your pull does not occur until the middle of the night, a global update should not be triggered in the middle of the day. I just prefer chained replication tasks to ones triggered by a global update, but really either one should work fine.

               

              Unless you are in the middle of a virus outbreak (in which case having the most up-to-date DAT file becomes essential) I would not worry about being a few hours behind on the DAT file. As long as you are within a day or two of the current DAT release you should be fine.

              • 4. Re: Global Update Replication across a busy T1

                 Thanks again, I think this will make a real difference. Just to clarify the V2 source sites: I now have "ftp.nai.com/CommonUpdater2" and "update.nai.com/Products/CommonUpdater2". Are those correct?

                 

                 

                • 5. Re: Global Update Replication across a busy T1
                  jstanley

                  Yes sir that is correct.

                  • 6. Re: Global Update Replication across a busy T1
                    runcmd

                    Jeremy Stanley >> The solution to this problem will be to disable global updating, chain a repository replication task off your repository pull task and schedule this task to occur after hours. It is important (due to the behavior you noted earlier) to always replicate immediately after you complete a repository pull. Otherwise your client machines will view the distributed repositories as out-of-date and update directly from the EPO server (thus negating the purpose of distributed repositories).

                     

                     I don't recall seeing that piece of information in the manual. We have several SuperAgent Distributed Repositories, and I have Global Updating disabled because of concerns about network saturation at slower sites. Do clients check in with the ePO at every update attempt to see if the ePO has newer DATs than the distributed repositories?

                     I have the master repository check for updates from the source site(s) hourly; however, replication to the distributed repositories is scheduled to occur once a day during off hours. I had a case open with support a while ago to determine why my clients would only update from the master repository, rather than from distributed repositories, and this never came up. After I recreated the distributed repositories and replicated (at the recommendation of the support representative), it worked fine. I just spot-checked a handful of clients and they, again, appear to be updating only from the master.

                     Based upon the information provided by Jeremy, I suspect the root cause here is that my master repository has a newer DAT than my distributed repositories. It would appear that my only feasible solution is to change the DAT pull from the source site to also occur only once a day, and only right before the replication to the distributed repositories. Thanks for sharing that piece of information, Jeremy!

                    • 7. Re: Global Update Replication across a busy T1
                      jstanley

                      So here is how it works.

                       

                       Inside your master repository you have a file called "sitestat.xml". This file contains exactly two pieces of information: a date/time stamp for the last time the repository was updated (the catalog version), and whether the repository is "enabled" or "disabled". The repository is "disabled" during a repository pull or replication so that clients do not attempt to update from a repository while that repository is being updated.

                       

                       During a standard hourly ASCI (so not an update, but rather an agent-to-server communication), one of the pieces of information the client gets from the EPO server is the catalog version of the master repository. When a client machine performs an update, the first thing it downloads from whichever repository it happens to connect to is sitestat.xml. It first checks that the repository is enabled, and then it compares the catalog version (remember, the catalog version is simply a date/time stamp) with the catalog version it obtained from the EPO server during its standard ASCI. If the catalog version on the distributed repository is older than the catalog version of the master repository, the client will view the repository as not up-to-date and move on to the next repository in the list.

                       

                       So because of this you should ALWAYS replicate immediately after you update your master repository. If you cannot replicate in the middle of the day due to bandwidth concerns, then you should not pull until after hours (when you can replicate).
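                       A rough Python sketch of that client-side decision may help. It is purely illustrative: the XML attribute names, the sitestat-fetching helper, and the repository list are assumptions for clarity, not the actual agent implementation.

                           # Illustrative sketch of the repository-selection logic described
                           # above; attribute names and helpers are assumed, not McAfee's code.
                           import xml.etree.ElementTree as ET

                           def pick_repository(repos, master_version, fetch_sitestat):
                               """Return the first repository that is enabled and at least as
                               new as the catalog version learned from ePO at the last ASCI."""
                               for repo in repos:
                                   root = ET.fromstring(fetch_sitestat(repo))  # download sitestat.xml
                                   enabled = root.get("Status") == "Enabled"   # assumed attribute name
                                   catalog = root.get("CatalogVersion") or ""  # assumed sortable date/time stamp
                                   if not enabled:
                                       continue        # repository is mid-pull/replication
                                   if catalog < master_version:
                                       continue        # looks out of date; try the next repository
                                   return repo         # up to date: update from here
                               return None              # nothing suitable; client falls back to the master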

                      • 8. Re: Global Update Replication across a busy T1 and local network saturation

                        Here is what I want, and it could solve both issues.

                         

                        The update from the central server should be fragmented and doled out in increments so the WAN has a chance to gulp for air.

                         

                        The local repository should do a coordinated multicast so that only one copy of the file is travelling around the LAN at one time.

                         

                        On a two hour basis, this single stream could be repeated three times to give everyone a chance to gather the entire file.

                         

                        Requests from stragglers could be satisfied at half hour increments, but there would never be more than one copy of the file active on the network at one time.

                         

                         

                        • 9. Re: Global Update Replication across a busy T1 and local network saturation
                          jstanley

                           If I understand the issue correctly, essentially you don't want more than one repository to be replicated to at the same time. The solution is to chain multiple replication tasks, each one replicating to only one repository. By design, when you chain tasks, ePO will not continue on to the next task in the chain until the previous task in the chain completes. Be careful with this, though, because also by design, if one task in the chain fails, ePO will abort the chain (and not continue on to the next task). So this means that if a replication to repositoryA fails, the chain will not continue, and ePO will not even attempt to replicate to repositoryB or C...etc.
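                           A minimal sketch of that chained behavior, assuming a hypothetical replicate() helper in place of the real ePO tasks (this is not the ePO API; it just mirrors the one-at-a-time, abort-on-failure semantics described above):

                               class ReplicationError(Exception):
                                   """Stands in for a failed replication task (assumed)."""

                               def replicate(repo_name):
                                   """Hypothetical single-repository replication task."""
                                   print(f"Replicating to {repo_name} ...")

                               # Chained tasks: one repository at a time, in order.
                               for repo in ["repositoryA", "repositoryB", "repositoryC"]:
                                   try:
                                       replicate(repo)
                                   except ReplicationError:
                                       # A failure aborts the chain: the remaining
                                       # repositories are never attempted.
                                       break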