3 Replies Latest reply on Sep 13, 2011 6:37 PM by NetTas

    Cache and Performance problems

    NetTas

      We have 3 Web Gateway appliances (7.1.0.3), each using the default cache on the partition /opt/mwg/cache. The partition size is 750 Gigs, and the cache grew to some 520 Gigs with 20 million objects. Web Cache Usage has plateaued at 70% for some weeks now.

      Recently we have observed that when the MWG services are restarted (either hard or soft), the general response time for Internet users is extremely slow for a period of 5-10 minutes. If I disable the Web Cache rule before restarting the MWG services, performance is good, but when I enable the Web Cache rule at any time afterwards, performance goes through a slow period.

      During this 5-10 minute period I observe that the Load Average (via the top command on the console) exceeds 80 (the appliance has 16 CPUs). Normally, during busy times with 50,000 requests per minute, the Load Average hovers around 10-12.
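
      To put those load figures in proportion: a load average above 80 on 16 CPUs works out to about 5 queued tasks per CPU, against roughly 0.6 to 0.75 per CPU at busy times. Below is a minimal Python sketch of that arithmetic. The helper name per_cpu_load is made up for illustration and is not an MWG tool; the live reading at the end assumes a Unix-like system where Python is available, which may not be the case on the appliance itself.

```python
import os


def per_cpu_load(load_average: float, cpu_count: int) -> float:
    """Return the load average divided by the number of CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


# Figures quoted in the post, for a 16-CPU appliance.
print(per_cpu_load(80, 16))  # slow period after a restart
print(per_cpu_load(12, 16))  # upper end of the normal busy-time range

# On a live Unix-like system the 1-minute figure can be read directly.
try:
    one_minute, _, _ = os.getloadavg()
    print(per_cpu_load(one_minute, os.cpu_count() or 1))
except (AttributeError, OSError):
    # os.getloadavg() is not available on every platform.
    pass
```

      A value well above 1.0 per CPU means more work is queued than the CPUs can service, which fits the slow response times seen during the 5-10 minute window.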

       

      As a test I have cleared the cache on 1 appliance and resized the /opt/mwg/cache partition to 70 Gigs. I am assuming that usage will grow to 70% (thus approx. 50 Gigs), and I will then test the MWG service restart.

       

      Has anyone else had similar experiences?

       

      Does anyone have any cache management tools that can be applied to Web Gateway 7 appliance caches (apart from Flush Cache)?

       

      Message was edited by: NetTas on 9/6/11 8:16:55 PM CDT
        • 1. Re: Cache and Performance problems
          asabban

          Hello,

           

          We don't have many features to control what is within the cache, apart from flushing it completely.

           

          I think that when you restart the process, some work is done on the cached files, such as building an index, which is pretty CPU intensive. I assume this is working as expected, but it would certainly make sense not to eat up all resources while working with the cache. The cache-related tasks could instead take longer with limited resource utilization, and the cache could probably be bypassed until it is ready (or something similar to this approach).

           

          Just for our information, did you file an SR with support on this topic? It would be interesting to know whether something within the cache is causing this behavior, and support may be the right people to investigate.

           

          I have not heard similar complaints, but to me it sounds logical in some way. I would be interested as well to hear whether others have seen this behavior.

           

          Thanks for sharing.

          Andre

          • 2. Re: Cache and Performance problems
            tabrams

            I have a customer with a similar problem. When a user initially opens their browser and goes to Google, it takes 15-20 seconds to load. After the initial wait performance is fine, unless they stay off the Internet for a while; then the delay reappears once they open their browser. They do have a support ticket open.

            • 3. Re: Cache and Performance problems
              NetTas

               

              After re-flushing the cache, we had no performance issues after restarting MWG services.

              We have re-sized the cache partition to 70 Gigs and now the cache tops out at 50 Gigs (I am assuming 70% of available partition space is configured) - I have restarted the MWG services when the cache is full and there are no performance issues observed.
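
              As a quick check on that 70% assumption, the same fraction applied to both partition sizes lands close to the figures reported in this thread. The sketch below is plain arithmetic in Python; the function name expected_cache_size is made up for illustration, and the 70% value is the assumption stated above, not a documented MWG setting.

```python
def expected_cache_size(partition_gigs: float, usage_percent: float) -> float:
    """Return the cache size in Gigs at which usage reaches usage_percent."""
    return partition_gigs * usage_percent / 100


# Original 750 Gig partition: the cache was observed at some 520 Gigs.
print(expected_cache_size(750, 70))  # 525.0

# Resized 70 Gig partition: the cache was observed topping out at 50 Gigs.
print(expected_cache_size(70, 70))   # 49.0
```

              Both results are within a few Gigs of what was observed, which is consistent with a usage limit of about 70% of the partition.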

              Additionally the Cache Efficiency is 21% Hit - which is acceptable and comparable to when the cache size was larger.

              We have concluded that the original partition size allocated to /opt/mwg/cache was excessive, and considering I am not able to change the cache size in WG7, the only option was to reduce the partition size. Additionally it would be advantageous if we were able to apply additional cache configuration settings such as cache min/max object size etc.

              Please note the problem we were initially experiencing was that after the MWG services were re-started, it appears that the cache start-up process (indexing etc) only commences immediately after the cache is queried with internet requests.

              I have documented the process to resize the cache partition for our appliances - if you need assistance - please ask

              Geevesie