3 Replies Latest reply on Oct 25, 2011 9:09 AM by JoeBidgood

    Lazy Cache expiration.

      I saw references somewhere in the ePO 4.6 documentation to the local cache expiring after a set period.  I am trying to get my head around the implications of this.

       

      I have a cache at a remote location.  It has both DATs and product deployment files.  Say the cache is set to expire after 4 hours.

       

      The first client needs just an incremental DAT, so the local repository pulls it from the master and forwards it.

      The second client needs a full DAT update; it has been offline for 40 days or is newly built.  The repository reaches out, pulls down the full 100 MB+ DAT file, and forwards it.

       

      Six hours later, client three is also newly built.  Will the repository just forward the full DAT it downloaded six hours ago, or will it pull a fresh full DAT file of 100 MB?

       

      Now take clients 4 and 5, and assume they are newly built machines on separate days.  The imaging process only installs the McAfee Agent; all other products are installed via ePO configuration.  Assume they both need VSE.  Client 4 causes the local repository to pull the whole VSE install package local and forward it to client 4.  A day later, client 5 comes along.  Are we going to pull the entire VSE install down again?

       

      The local cache needs to expire if and when the master repository changes FOR THAT ITEM.  Not before and not much later.

      VSE content expires daily

      HIPS content expires monthly

      Product content expires on a random release cycle.

       

      Help me understand.

       

      Thanks

       

      Herb Smith

        • 1. Re: Lazy Cache expiration.
          JoeBidgood
          Will the repository just forward the full DAT it downloaded six hours ago, or will it pull a fresh full DAT file of 100 MB?

           

          Assuming there is no new DAT in the master repo, it will forward the DAT it already has.

           

           

          Are we going to pull the entire VSE install down again?

           

          Again, assuming there is no new VSE package in the master, it will forward the copy it has locally. The important thing to remember here is that flushing the cache doesn't flush the local filesystem.

           

          Let's take your first case, where after 6 hours another machine asks for the full DAT package. When the process starts the SA repo doesn't know that it's going to request that file: the first thing it asks for is the sitestat.xml.

          The SA repo checks the current time against the last time it downloaded sitestat.xml from the master, sees that they are more than 4 hours apart - i.e. the cache flush interval has expired - and requests sitestat.xml from the server again.

          It now checks the version info in the new sitestat against its current version: they are the same - so the content of the master repository has not changed. It therefore knows that the local files it has are still valid.

          The client machine now requests the full DAT. The SA checks the cache, finds that it has the requested file, and forwards it to the client.
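          The flow Joe walks through can be sketched roughly like this - a minimal model for illustration only, not the actual SuperAgent implementation, with all class and method names invented:

```python
class MasterStub:
    """Stand-in for the master repository (hypothetical interface)."""
    def __init__(self, version):
        self.version = version

    def get_sitestat_version(self):
        return self.version


class SuperAgentCache:
    """Rough model of the lazy-cache check: flush interval plus version compare."""
    def __init__(self, master, flush_interval=4 * 3600):
        self.master = master
        self.flush_interval = flush_interval   # seconds (the policy setting)
        self.last_fetch = float("-inf")        # when sitestat.xml was last pulled
        self.version = None                    # version info from that sitestat
        self.valid_files = set()               # cache entries believed current

    def on_sitestat_request(self, now):
        """Runs every time a client asks the SA repo for sitestat.xml."""
        if now - self.last_fetch > self.flush_interval:
            new_version = self.master.get_sitestat_version()
            self.last_fetch = now
            if new_version != self.version:
                # The master has changed in some way: flush the cache index.
                # Files on disk are NOT deleted, only marked unverified.
                self.valid_files.clear()
                self.version = new_version
        return self.version
```

          If the master's version is unchanged when the interval expires, nothing is flushed, and the full DAT downloaded earlier is simply forwarded again.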

           

          Now let's take the second case. Client 5 has been built and connects to the SA repo, and requests sitestat.xml.

          As before, the cache flush interval has expired, so the SA repo requests sitestat.xml from the master. This time however the version information has changed - a day has passed and a new DAT has been checked into the master - so the SA repo flushes the cache. At this point the SA doesn't know what has changed in the master, only that something has.

          Now the client machine asks for the VSE install package. The SA repo asks the master repo for a hash of the requested file, and then checks to see if it has a local copy of the same file. It does: so it now calculates the hash of the local copy, and compares this with the hash from the master. They are the same, so the SA knows that this file has not changed: it therefore forwards the local file to the client.
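          The post-flush hash check can be sketched the same way - again a hypothetical model, not the real repository code; the hash algorithm and the dictionary-based interfaces are assumptions:

```python
import hashlib

def serve_file(filename, valid_files, local_files, master_files):
    """Decide whether a file still on disk can be reused after a cache flush.

    valid_files:  set of filenames the SA currently trusts (the cache index)
    local_files:  dict modelling the SA repo's local filesystem
    master_files: dict standing in for the master repository
    """
    if filename not in valid_files and filename in local_files:
        # The cache was flushed, but the file may still be on disk: ask the
        # master only for the file's hash and compare it with the local copy.
        master_hash = hashlib.sha1(master_files[filename]).hexdigest()
        local_hash = hashlib.sha1(local_files[filename]).hexdigest()
        if local_hash == master_hash:
            valid_files.add(filename)      # re-validate without re-downloading
    if filename not in valid_files:
        # Changed or missing: pull the full file down from the master.
        local_files[filename] = master_files[filename]
        valid_files.add(filename)
    return local_files[filename]           # forward the local copy to the client
```

          On this model, client 5 costs only a hash exchange, not another full download of the VSE package.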

           

          Hopefully that hasn't made things worse

           

          Regards -

           

          Joe

          • 2. Re: Lazy Cache expiration.

            This helps considerably.  

             

            To recap:   Expired cache means for the SA to check the hash of the local SA copy and the master repository copy.    It does the checking by pulling the sitestat.xml file from the master and using the hashes in it to compare.   No new files are copied to the local SA repo unless they are different than the master.   Once it pulls the sitestat.xml file it does no further checking for the cache flush interval.

             

            Question:  What is the recommended cache flush interval?   Set it too long and daily DATs are slow to get out there; set it too short and you end up with excess traffic.

             

            Another question:  Are there any practical limits on the number of SAs?   We have about 1,000 store locations with about 15 to 20 devices per location.   It seems that putting an SA in each location could be a win for us.

             

            Thanks

             

            Herb

            • 3. Re: Lazy Cache expiration.
              JoeBidgood

              HerbSmith wrote:

               

              This helps considerably.  

               

              To recap:   Expired cache means for the SA to check the hash of the local SA copy and the master repository copy.  

               

              Not quite.  To take this a step at a time:

              When a client requests sitestat.xml from the SA, if the period between the 'time of request' and the 'last time sitestat was requested by the SA from the master' is greater than the limit specified in the policy, then the SA will request sitestat.xml from the master again. (It doesn't compare hashes for sitestat.xml at any point.) 

               

              It does the checking by pulling the sitestat.xml file from the master and using the hashes in it to compare.  

               

              Sitestat doesn't contain hashes - all it contains is the version information of the repository, and a flag to indicate whether that repo is enabled or disabled. In this situation it compares the version information in the copy of sitestat that it already has with the version information in the newly-received sitestat: if they are different, by definition the master repository contents have changed since the last time the SA requested sitestat from the master.

               

              No new files are copied to the local SA repo unless they are different than the master.  

               

              Correct - either that, or unless they have been requested by a client and are not present in the local filesystem.

               

              Once it pulls the sitestat.xml file it does no further checking for the cache flush interval.

               

              Nope - every time a client requests sitestat.xml from the SA, it will perform the comparison between 'time of request' and the 'last time sitestat was requested by the SA from the master'. When this period exceeds the limit in the policy the cache is flushed and the cycle repeats.

               

              Question:  What is the recommended cache flush interval?   Set it too long and daily DATs are slow to get out there; set it too short and you end up with excess traffic.

               

              The default interval is 30 minutes, which seems reasonable to me. Assuming nothing has changed in the master, the additional traffic generated after a cache flush is minimal - a copy of sitestat, plus hash information for each file requested after the flush.

               

              Another question:  Are there any practical limits on the number of SAs?   We have about 1,000 store locations with about 15 to 20 devices per location.   It seems that putting an SA in each location could be a win for us.

               

              There are definitely some operational limits - regardless of how many clients are using a repo, that repo has to be included in the sitelist.xml and also included in any calculations the clients perform to work out which repo to use. With only 15 or 20 machines per location, in my opinion putting an SA on each site would be a bad move. With that few machines you're unlikely to be bringing new machines online each day, or have machines requesting the full DAT on a regular basis, so the very great majority of day-to-day traffic is going to be incremental update files. You'd be better served by having a smaller number of SA repos at regional centres, with the store machines pulling their updates over the WAN. You might have slightly more regional-centre-to-store WAN traffic, but in return you'd get a much more manageable repository structure.

               

              HTH -

               

              Joe