
Incident Response: Support Recommendations

Introduction

This document is complementary to McAfee Web Gateway support's Incident Notifications ruleset.

ID: 5

Description:

This is a test incident used for Monitoring. This incident occurs every minute.

Message:

Monitor Incident (7): Monitor Incident

Action:

Ignore this email; it is only used for testing. To stop receiving it, disable "Test Notifications" in the Incident Notification ruleset.

ID: 20

Description:

RAID monitoring reported critical status or failure of one or more hard disks.

Message:

health monitor (3): RAID reports 0 critical disks, 1 failed disks, and 1 degraded virtual drives. Disk Info: Physical Slot: 4, Device Id : 3

Action:

    1. Determine whether the drive has truly failed or needs to be rebuilt.
    2. If the drive has failed, gather the necessary hardware logs and open a Support ticket. Please include RMA information (contact and shipping details) and the alert message you received.
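When preparing the Support ticket, it can help to pull the disk counts and slot out of the alert text. The following is a minimal Python sketch, assuming the alert always follows the wording of the single sample message above; the pattern and the function name `parse_raid_alert` are illustrative and not part of Web Gateway.

```python
import re

# Pattern written against the one sample message shown above; other
# product or firmware versions may word the alert differently.
RAID_PATTERN = re.compile(
    r"RAID reports (?P<critical>\d+) critical disks, "
    r"(?P<failed>\d+) failed disks, and "
    r"(?P<degraded>\d+) degraded virtual drives\."
    r"(?: Disk Info: Physical Slot: (?P<slot>\d+), "
    r"Device Id\s*: (?P<device_id>\d+))?"
)


def parse_raid_alert(message):
    """Return counts and disk location from a RAID alert, or None if no match."""
    match = RAID_PATTERN.search(message)
    if match is None:
        return None
    # Optional groups that did not match (no "Disk Info" part) stay None.
    return {
        key: int(value) if value is not None else None
        for key, value in match.groupdict().items()
    }


if __name__ == "__main__":
    sample = (
        "health monitor (3): RAID reports 0 critical disks, 1 failed disks, "
        "and 1 degraded virtual drives. Disk Info: Physical Slot: 4, Device Id : 3"
    )
    print(parse_raid_alert(sample))
```

The extracted slot and device ID can then be quoted in the ticket alongside the full alert message.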

ID: 22

Description:

File system usage has exceeded a configured limit.

Message:

health monitor (4): Filesystem usage on /opt exceeds selected limit (91% / 90%).

Action:

Identify the files that are filling the disk, disable any debug logging that was enabled by mistake, and delete any unnecessary files.

    1. Use the steps outlined in article.
    2. Once the problem is identified and solved, you may consider resizing the disk.
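To see which files are filling the disk, one option is to list the largest files under the affected mount point. The sketch below is a generic Python illustration and not a Web Gateway tool; it assumes a Python 3 interpreter is available on the host where you run it, and the helper name `largest_files` is made up for this example.

```python
import heapq
import os


def largest_files(root, top_n=10):
    """Return up to top_n (size_bytes, path) tuples under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so link targets are not
                # counted a second time.
                sizes.append((os.lstat(path).st_size, path))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return heapq.nlargest(top_n, sizes)


if __name__ == "__main__":
    # /opt is the file system named in the sample alert above.
    for size, path in largest_files("/opt", top_n=10):
        print(f"{size / 1024 / 1024:10.1f} MiB  {path}")
```

Review the list before deleting anything; large files may be legitimate logs that are still waiting to be pushed.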

ID: 24

Description:

System load has exceeded a configured limit.

Message:

health monitor (4): 5 minute load average exceeds selected limit (3.35 / 3.00).

Action:

System load alerts alone are not always indicative of a problem. Take other performance metrics into account as well.

    1. Identify whether there is user impact. An easy check is to test web browsing performance through the proxy. Was there slowness at the time of the alerts?
    2. Is this an appliance or a VM? Appliances have dedicated CPUs, whereas a VM-based deployment may share resources with other virtual hosts.
    3. Identify which process is using the most CPU with the top command. Usually this will be the mwg-antimalware or mwg-core process.
      • mwg-antimalware - run the command below to see what is currently being scanned:

/opt/mwg/bin/mwg-antimalware -S threads

      • mwg-core - there are many possible causes. If you are seeing user impact as a result of this alert, it is best to consult your McAfee Technical Support professional. Please include a feedback file taken during the problem state.
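One way to put a load alert in context is to relate the reported load average to the number of CPUs, since on Linux a load average roughly counts tasks that are running or waiting to run. The Python sketch below parses the sample message format shown above and computes load per CPU; the regular expression and the function names are assumptions made for illustration.

```python
import os
import re

# Matches the sample wording "5 minute load average exceeds selected
# limit (3.35 / 3.00)"; other versions may phrase the alert differently.
LOAD_PATTERN = re.compile(
    r"(?P<minutes>\d+) minute load average exceeds selected limit "
    r"\((?P<load>\d+(?:\.\d+)?) / (?P<limit>\d+(?:\.\d+)?)\)"
)


def parse_load_alert(message):
    """Return (minutes, load, limit) from a load alert, or None if no match."""
    match = LOAD_PATTERN.search(message)
    if match is None:
        return None
    return int(match["minutes"]), float(match["load"]), float(match["limit"])


def load_per_cpu(load, cpu_count):
    """Normalize a load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load / cpu_count


if __name__ == "__main__":
    sample = "health monitor (4): 5 minute load average exceeds selected limit (3.35 / 3.00)."
    minutes, load, limit = parse_load_alert(sample)
    cpus = os.cpu_count() or 1  # cpu_count() can return None
    print(f"{minutes} min load {load} (limit {limit}), "
          f"{load_per_cpu(load, cpus):.2f} per CPU on {cpus} CPUs")
```

A value well below 1.0 per CPU suggests spare capacity, but this is only a rule of thumb; user impact remains the deciding factor.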

ID: 26

Description:

A check has been executed to detect a BBU RAID error (Intel-based appliances only).

Message:

health monitor (4): RAID BBU check reports remaining capacity of battery as low.

health monitor (4): RAID BBU check reports the requirement of battery replacement.

Action:

If the battery is low and the appliance was recently powered on, monitor it for 24 hours. If the low battery message persists, or the message indicates that the battery needs replacement, gather the necessary hardware logs and open a Support ticket. Please also include RMA information (contact and shipping details) and the alert message you received.

ID: 501

Description:

Log File Manager failed to push log files.

Message:

Log File Manager (3): "Cannot push '/opt/mwg/log/user-defined-logs/access.log/access1505182209.log' to 'ftp://myFtpServer:9121/access1505182209-10.10.69.72.log'"

Action:

Log files that fail to push cannot be deleted. If no action is taken, the disk could fill.

Most log push failures occur due to network or log server issues:

    1. Identify which log is failing to push (see notification).
    2. Investigate why the log push is failing. Consult the mwg-logmanager.errors.log (Troubleshooting > Log Files > mwg-errors).
      • Search the log for the file that the alert reports as failing to push, and look for the error message. For example:

# DNS problem
Error output is 'curl: (6) Couldn't resolve host 'myFtpServer''

      • If using Web Reporter or Content Security Reporter, verify it is running and reachable from Web Gateway.
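The number in parentheses after `curl:` in the sample error is curl's exit code, which narrows down the cause. The Python sketch below maps a few well-known curl exit codes to a likely cause; the hint texts and the function name `classify_push_error` are illustrative, and it assumes the log line keeps the `curl: (N) ...` form shown in the sample.

```python
import re

# A small subset of documented curl exit codes; the hint wording is ours.
CURL_HINTS = {
    6: "DNS: the log server host name could not be resolved",
    7: "Network: could not connect to the log server",
    28: "Timeout: the operation did not finish in time",
    67: "Credentials: the log server rejected the login",
}

CURL_ERROR = re.compile(r"curl: \((\d+)\)")


def classify_push_error(line):
    """Return (exit_code, hint) for a curl error line, or None if none found."""
    match = CURL_ERROR.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, CURL_HINTS.get(code, "Unlisted code; see the curl documentation")


if __name__ == "__main__":
    sample = "Error output is 'curl: (6) Couldn't resolve host 'myFtpServer''"
    print(classify_push_error(sample))
```

For the sample line this points to name resolution, so the next check would be the DNS configuration used by Web Gateway.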

ID: 700

Description:

Web Gateway entered an overload state.

Message:

Proxy (2): Overload: Connection limit of 25000 simultaneous connections has been exceeded. Delaying accepts

Action:

Web Gateway is overloaded.

ID: 701

Description:

Web Gateway entered an overload state and remains overloaded even after delaying the acceptance of new connections.

Message:

Proxy (2): Overload: The Webgateway is still overloaded and delays accepts

Action:

Web Gateway is overloaded, and the issue should be addressed immediately.

ID: 702

Description:

Web Gateway left the overload state.

Message:

Proxy (4): Overload: Left overload handling. Accepts will be done immediately again

Action:

No action should be required unless Web Gateway is repeatedly switching between the overloaded and normal states.
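Whether Web Gateway is switching back and forth can be judged by counting how often the overload state was entered (ID 700) within a recent period. The Python sketch below assumes you have collected incidents as (timestamp, incident ID) pairs, for example from the notification emails; the one-hour window and the threshold of three are arbitrary illustrative values, not product settings.

```python
from datetime import datetime, timedelta

OVERLOAD_ENTERED = 700  # "Web Gateway entered overload state"


def is_flapping(events, now, window=timedelta(hours=1), threshold=3):
    """Return True if overload was entered at least `threshold` times in `window`.

    events: iterable of (timestamp, incident_id) tuples, in any order.
    """
    start = now - window
    entries = sum(
        1
        for timestamp, incident_id in events
        if incident_id == OVERLOAD_ENTERED and start <= timestamp <= now
    )
    return entries >= threshold


if __name__ == "__main__":
    now = datetime(2015, 4, 26, 9, 0)
    # Hypothetical incident history: three enter/leave cycles in an hour.
    history = [
        (now - timedelta(minutes=50), 700),
        (now - timedelta(minutes=45), 702),
        (now - timedelta(minutes=30), 700),
        (now - timedelta(minutes=28), 702),
        (now - timedelta(minutes=5), 700),
        (now - timedelta(minutes=2), 702),
    ]
    print(is_flapping(history, now))
```

If the check reports flapping, treat the situation like a persistent overload rather than a resolved one.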

ID: 901

Description:

The appliance is connected to n servers for NTLM authentication in Windows domain x.

Message:

Authentication (6): Connected to 1 server(s) in domain x

Action:

No action is required for this incident. Web Gateway successfully connected to the domain. If you recently received an alert that Web Gateway was disconnected from the domain, this incident confirms that it was able to reconnect.

Related IDs:

902, 903

ID: 902

Description:

The appliance could not connect to n servers for NTLM authentication in Windows domain x.

Message:

Authentication (3): Failed to connect to DC dc1.mcafee.com in domain mcafee.com

Action:

Depending on your available domain controllers, this may or may not be an issue.

    1. Verify how many DCs the Web Gateway is using (Configuration > Windows Domain Membership). If more than one DC is configured, Web Gateway should communicate with the remaining active DCs, so users should remain unaffected.
    2. Verify with your Domain Admins that the DC was not taken offline for maintenance.
    3. If there is continued user impact, replicate the problem with a test client while the authentication debug log and a tcpdump are running (see the troubleshooting steps outlined in the NTLM Best Practices). This data can be provided to McAfee Technical Support for further analysis. Along with the data described above, please include the client IP, your observations, and a feedback file (which includes the authentication debug log).
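The reasoning in step 1 can be written down as a small decision: a failed DC matters less while at least one other configured DC is still reachable, and no reachable DC corresponds to the situation described under ID 903. The Python sketch below is only an illustration of that logic; the status labels, the function name, and the host `dc2.mcafee.com` are hypothetical.

```python
def domain_status(configured_dcs, unreachable_dcs):
    """Classify a domain as 'ok', 'degraded', or 'down' from its DC lists."""
    configured = set(configured_dcs)
    if not configured:
        raise ValueError("at least one DC must be configured")
    reachable = configured - set(unreachable_dcs)
    if not reachable:
        # No configured DC answers: users of this domain cannot authenticate.
        return "down"
    if len(reachable) < len(configured):
        # Some DCs failed, but authentication can use the remaining ones.
        return "degraded"
    return "ok"


if __name__ == "__main__":
    configured = ["dc1.mcafee.com", "dc2.mcafee.com"]  # dc2 is hypothetical
    print(domain_status(configured, ["dc1.mcafee.com"]))
```

With only one DC configured, a single failure moves the domain straight to "down", which is why adding further DCs is worthwhile where they are available.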

ID: 903

Description:

The appliance could not contact Windows domain x for NTLM authentication.

Message:

Authentication (3): The following domain(s) can't be contacted: mcafee.com

Action:

This alert requires action because it signifies that no configured DCs are reachable for a given domain. Users who are required to authenticate with this domain will be impacted.

    1. Verify how many DCs the Web Gateway is using (Configuration > Windows Domain Membership). If only one DC is configured, add more if they are available.
    2. If there is continued user impact, replicate the problem with a test client while a tcpdump is running (see the troubleshooting steps outlined in the NTLM Best Practices). This data can be provided to McAfee Technical Support for further analysis. Along with the data described above, please include the client IP, your observations, and a feedback file.

Version history

Revision 1 of 1. Last update: 04-26-2015 08:31 AM
