4 Replies Latest reply on Feb 6, 2014 5:54 PM by msiemens

    Webgateway v6.9.6 filling /var partition

    msiemens

      My apologies if this post is rather long, but the background is critical to anyone who is still using v6.9.6. We're working on the v7 migration but are still on v6.9.6.

       

       

      Because v6 was EOL on 31Dec2013, I performed the upgrade to v6.9.6 on 27Dec2013. Here are the messages from the update log:

       

      [27/Dec/2013:14:24:33 PST] =============================================================================

      [27/Dec/2013:14:24:33 PST] Install      1 Package(s)        

      [27/Dec/2013:14:24:33 PST] Update       5 Package(s)        

      [27/Dec/2013:14:24:33 PST] Remove       0 Package(s)        

      [27/Dec/2013:14:24:33 PST] Total download size: 47 M

      [27/Dec/2013:14:24:33 PST]

      [27/Dec/2013:14:24:33 PST] Installed: rsyslog.i586 0:2.0.6-6.cg4

      [27/Dec/2013:14:24:33 PST] Updated: cglinux-release.noarch 0:5-2.8 curl.i586 0:7.17.1-cg2 libcurl.i586 0:7.17.1-cg2 webwasher-csm.i386 0:6.9.6-15512 wwapp-release.noarch 0:5.2.7-cg3

      [27/Dec/2013:14:24:33 PST] Replaced: sysklogd.i586 0:1.4.1-cg2

      [27/Dec/2013:14:24:33 PST] exit=0

       

      Recently, I started encountering an issue where the v6 gateways weren't servicing connections. This was often seen by users as an error "No ICAP Server Available". Investigation seemed to show nothing wrong except that the /var partition was 100% used and the size of /var/log/messages was 0.

       

      [root@<servername> ~]# df -h

      Filesystem            Size  Used Avail Use% Mounted on

      /dev/sda1             2.0G  456M  1.4G  25% /

      /dev/sda2             2.0G  2.0G     0 100% /var

      /dev/sda6             2.0G   40M  1.8G   3% /tmp

      /dev/sda7             267G  133G  122G  53% /opt

      none                  2.0G  312K  2.0G   1% /dev/shm

      tmpfs                 2.0G   80K  2.0G   1% /opt/webwasher-csm/mcache

       

      [root@<servername> ~]# ll /var/log/messages*

      -rw--w---- 1 root    syslogd        0 Jan 31 04:02 messages

      -rw--w---- 1 root    syslogd 15275727 Jan 31 04:01 messages.1.gz

      -rw--w---- 1 root    syslogd 50107380 Dec 28 04:01 messages.2.gz

      -rw--w---- 1 root    syslogd 57738957 Dec 27 04:02 messages.3.gz

      -rw--w---- 1 root    syslogd 11319549 Dec 25 04:02 messages.4.gz

      -rw--w---- 1 root    syslogd 52641796 Dec 24 04:02 messages.5.gz

      -rw--w---- 1 root    syslogd  9592905 Dec 23 04:02 messages.6.gz

      -rw--w---- 1 root    syslogd 11743991 Dec 22 04:02 messages.7.gz


      (note that I had previously modified the logrotate configuration to compress some of the log files after rotation)

       

      I was finally able to get this under control by restarting the rsyslog service but a couple of days later it happened again. I began to suspect that something was wrong with the log rotation.

       

      After more investigation, I found the following:

       

      [root@<servername> logrotate.d]# logrotate -d /etc/logrotate.conf

      reading config file /etc/logrotate.conf

      including /etc/logrotate.d

      Ignoring syslog.rpmnew, because of .rpmnew ending

       

      This meant that /var/log/messages was still being rotated using the /etc/logrotate.d/messages script:

       

      /var/log/messages {

          create 620 root syslogd

          size 50M

          sharedscripts

          rotate 8

          compress

          postrotate

              test -f /var/run/syslogd.pid && kill -HUP `cat /var/run/syslogd.pid`

          endscript

      }

       

       

      Here's the problem: after the update, /var/run/syslogd.pid no longer exists! Each time this script rotates the file, the test -f check fails, the kill -HUP is skipped, and the rsyslog service is never restarted.
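The pid file mismatch described above can be checked directly. The following is a minimal sketch, not something taken from the appliance: the function name check_syslog_pidfiles is hypothetical, and the only paths assumed are the two pid files named in this post. It reports whether each pid file exists and whether the process it records is still alive.

```shell
#!/bin/sh
# Sketch only: check_syslog_pidfiles is a hypothetical helper name.
# Reports, for the old (sysklogd) and new (rsyslog) pid file names,
# whether the file exists and whether the recorded process is alive.
# Run as root, as in the thread, so the liveness check is not blocked
# by permissions.
check_syslog_pidfiles() {
    run_dir="${1:-/var/run}"   # directory holding the pid files
    for name in syslogd.pid rsyslogd.pid; do
        pid_file="$run_dir/$name"
        if [ ! -f "$pid_file" ]; then
            echo "$name: missing"
            continue
        fi
        pid=$(cat "$pid_file")
        # kill -0 delivers no signal; it only tests that the process exists.
        if kill -0 "$pid" 2>/dev/null; then
            echo "$name: present, pid $pid is running"
        else
            echo "$name: present, but pid $pid is not running"
        fi
    done
}

# Example call against the default directory:
#   check_syslog_pidfiles /var/run
```

On a gateway in the state described here, the expectation would be that syslogd.pid is reported missing while rsyslogd.pid is present, which is why the original postrotate line never sends the HUP.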

       

      The correct pid file is /var/run/rsyslogd.pid. Not wanting to spend time sorting through all of the log rotation scripts, I simply modified the messages script:

       

      /var/log/messages {

          create 620 root syslogd

          size 50M

          sharedscripts

          rotate 8

          compress

          postrotate

              test -f /var/run/syslogd.pid && kill -HUP `cat /var/run/syslogd.pid`

              test -f /var/run/rsyslogd.pid && kill -HUP `cat /var/run/rsyslogd.pid`

          endscript

      }

       

       

      This will result in the logs being successfully rotated and the rsyslog service being restarted.
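Since only the messages script was changed, other files under /etc/logrotate.d may still point at the old pid file. The sketch below is one way to list them. It rests on some assumptions: the function name find_stale_pid_refs is hypothetical, the match is a plain text search for the two pid paths mentioned above, and files ending in .rpmnew are skipped only because the logrotate -d output above shows logrotate ignoring them.

```shell
#!/bin/sh
# Sketch only: find_stale_pid_refs is a hypothetical helper name.
# Prints logrotate snippets that still reference the old sysklogd pid
# file and do not also reference the rsyslogd pid file.
find_stale_pid_refs() {
    conf_dir="${1:-/etc/logrotate.d}"
    for conf in "$conf_dir"/*; do
        [ -f "$conf" ] || continue
        # logrotate ignores *.rpmnew (see the debug output above), so skip them.
        case "$conf" in
            *.rpmnew) continue ;;
        esac
        # "/var/run/syslogd.pid" does not match "/var/run/rsyslogd.pid",
        # because the character after "run/" differs.
        if grep -q '/var/run/syslogd\.pid' "$conf" &&
           ! grep -q '/var/run/rsyslogd\.pid' "$conf"; then
            echo "$conf"
        fi
    done
}

# Example call:
#   find_stale_pid_refs /etc/logrotate.d
```

Any file it prints would need the same extra postrotate line that was added to the messages script.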


       

      Message was edited by: msiemens on 2/6/14 2:19:29 PM CST
        • 1. Re: Webgateway v6.9.6 filling /var partition
          Jon Scholten

          Hi Msiemens!

           

          Why are the logs filling the drive in the first place though? Isn't that the real question?

           

          Are you writing access log data to syslog? It wouldn't be a good idea to keep that data on MWG (in /var/log/messages).

           

          Best,

          Jon

          • 2. Re: Webgateway v6.9.6 filling /var partition
            msiemens

            Access log data isn't written to syslog. /var/log/messages has LOTS of reputation score information. The logs are set to roll over at 50M and compress, but when the rotation occurs and rsyslogd isn't restarted, the messages file size is 0 and (as far as I can tell) the log entries go to a temp file on /var that simply keeps growing. The temp file appears to consume the disk space.
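A likely explanation for the growing "temp file", which this thread does not confirm, is that rsyslogd keeps its file descriptor on the rotated log. Once logrotate compresses that file and deletes the original, the data has no visible name but still occupies space on /var until the daemon is restarted. The sketch below is one way to check for that on Linux. The function name list_deleted_open_files is hypothetical, and the approach relies on /proc marking unlinked files with a " (deleted)" suffix.

```shell
#!/bin/sh
# Sketch only: list_deleted_open_files is a hypothetical helper name.
# Linux-only: walks /proc/<pid>/fd and prints the targets of file
# descriptors whose file has been deleted but is still held open.
list_deleted_open_files() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # readlink fails if the descriptor vanished or is not readable.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)") echo "$target" ;;
        esac
    done
}

# Example call, assuming the pid file named earlier in the thread:
#   list_deleted_open_files "$(cat /var/run/rsyslogd.pid)"
```

If the call prints a path under /var/log ending in " (deleted)", that would be consistent with /var being full while /var/log/messages itself shows a size of 0.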

             

            BTW, our reputation settings are:

            • Bad: Block & log
            • 50: Neutral/Bad threshold
            • Neutral: Log
            • 15: Good/Neutral threshold
            • Good: Allow

             

            We don't use the reputation score data very often. I'm going to stop logging "Neutral" reputation and see if I can slow down the messages log.

             

            Mike

            • 3. Re: Webgateway v6.9.6 filling /var partition
              Jon Scholten

              I would just stop it from writing to messages altogether if it's filling the disk. You probably aren't doing anything with the messages file anyway. You have the data in the access log if you need it.

               

              Best,

              Jon

              • 4. Re: Webgateway v6.9.6 filling /var partition
                msiemens

                We have been logging heavily, but the log rotation and compression should have taken care of that. This happened to us rather quickly because we are logging so many reputation scores. For those not logging so heavily, it will simply take longer.

                 

                Mike