In many SIEM environments, it is helpful to implement processes that monitor quality of service (QoS) and provide early visibility when problems arise. This document presents several example dashboards designed to track the performance of an environment over time, using several methods, to help ensure a high quality of service.
Specifically, we will implement two dashboards that show this information. The first, QoS Dashboard, focuses on event rates and denial-of-service (DoS) events:
Dashboard 1: QoS Dashboard
A second view, QoS User Activity Dashboard, shows activities by users, including:
The following are prerequisites to accomplish this use case.
This use case has been designed to work well with a wide variety of inputs. Suggested data sources include:
The SIEM must be able to query Active Directory. This requires AD to be set up in the Asset Manager, and the SIEM must have permission to query AD in order to build a Dynamic Watchlist that contains accounts in Lockout status. Information on how to use the Asset Manager to connect to AD can be found here: SIEM Foundations: Connecting the SIEM to a Windows Domain Controller for Asset Import
With the prerequisites properly defined and configured, the next step is to begin the build-out of our use case. The sections below will walk you through the process of importing the necessary content and configuring the McAfee SIEM.
First we will import predefined dashboards, which will serve as the basis for our use cases. Once imported, we will customize and tune them to meet our specific needs.
Import QoS Dashboard
To import the QoS Dashboard:
Import QoS User Activity Dashboard
Repeat steps 1–6 above for the file named High QoS – QoS User Activity Dashboard.vpx
Customize QoS Dashboard
The QoS dashboard includes 4 dials, each of which may be tied to a separate device or set of data sources. The queries behind these dials will need to be modified to reflect the devices in your environment. To customize the QoS Dashboard:
In this use case we will leverage the Privileged Users variable, which is predefined in the SIEM and can be found in the Policy Editor (Variable/application/PRIVILEGED_USERS).
This variable can be modified to match the privileged accounts in your environment. Examine the existing values, and add or remove account names as appropriate. Good candidates include Windows Domain Administrators, local Windows Administrator accounts, user accounts with Linux root privileges, and sys- or sa-level users in databases.
Next we'll build a series of watchlists, which will be incorporated into our views, reports and correlation rules.
Dynamic Watchlist QoS Locked Accounts
We will create a watchlist that queries Active Directory and populates a Dynamic Watchlist with the accounts that are currently locked. This watchlist can be used in a variety of ways. For example, it can be used as an event filter to show today's events related to accounts that have been locked out. This is useful for determining whether a particular user has locked their account through multiple password failures, or for reviewing the events around multiple account lockouts to determine whether there is a pattern of suspicious activity.
To create a Dynamic Watchlist to query AD for Locked Accounts:
The QoS Locked Accounts watchlist will now be updated every day at the configured time (midnight GMT by default) and can be used as a filter for events, in correlation rules, and anywhere a grouping of Source User entities is available. We will use it in a custom correlation rule defined below.
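To make the filtering idea concrete, here is a minimal Python sketch of how a watchlist of locked accounts narrows a set of events down to those whose source user is on the list. It is an illustration only and does not use the SIEM's own API; the `Event` type, the `filter_by_watchlist` function, and the sample account names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source_user: str  # hypothetical field standing in for the Source User
    message: str


def filter_by_watchlist(events, watchlist):
    """Return the events whose source user is in the watchlist.

    The comparison is case-insensitive, which is an assumption made
    for this sketch rather than documented SIEM behavior.
    """
    locked = {name.lower() for name in watchlist}
    return [e for e in events if e.source_user.lower() in locked]


# Hypothetical sample data: two locked accounts and three events.
locked_accounts = ["jsmith", "ADavis"]
events = [
    Event("jsmith", "failed logon"),
    Event("bwong", "successful logon"),
    Event("adavis", "password change attempt"),
]

matches = filter_by_watchlist(events, locked_accounts)
for event in matches:
    print(event.source_user, "-", event.message)
```

Only the events for the two locked accounts are kept; the event for the account that is not on the watchlist is dropped.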
Our use case will leverage two custom correlation rules in order to identify relevant behaviors.
QoS – Unusual Activity after multiple Account Lockouts
This correlation rule identifies unusual activity after an account lockout. This rule is triggered by a pair of conditions. First, the rule waits for either:
The rule ultimately triggers when the SIEM sees a high severity event within 10 minutes of one of the above, with the same destination IP.
To import the "Unusual activity after multiple Account Lockouts" correlation rule:
The rule is displayed below:
The intent of the rule is to detect when an attacker is attempting to access the environment and is locked out but is eventually successful in breaching a system. The account lockouts highlight the initial brute-force attack while the SIEM logic provides the analysis for suspicious activity after the lockouts occur. This rule leverages the analytics and normalization capabilities of the SIEM, as well as the Dynamic Watchlist capability and can be part of a standard SOC implementation.
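The two-stage logic of this rule can be sketched in Python as follows. This is a simplified illustration, not the imported rule itself: the trigger conditions are modeled as a plain `is_trigger` flag, the `HIGH_SEVERITY` threshold of 80 is an assumed value, and the `Event` type and `find_followups` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # time window stated in the rule description
HIGH_SEVERITY = 80              # assumed cut-off for "high severity"


@dataclass(frozen=True)
class Event:
    time: datetime
    dest_ip: str
    severity: int
    is_trigger: bool = False  # stands in for the rule's first-stage conditions


def find_followups(events):
    """Return (trigger, follow-up) pairs where a high severity event hits
    the same destination IP within WINDOW of the most recent trigger."""
    last_trigger = {}  # destination IP -> most recent trigger event
    hits = []
    for ev in sorted(events, key=lambda e: e.time):
        if ev.is_trigger:
            last_trigger[ev.dest_ip] = ev
            continue
        trig = last_trigger.get(ev.dest_ip)
        if (
            trig is not None
            and ev.severity >= HIGH_SEVERITY
            and ev.time - trig.time <= WINDOW
        ):
            hits.append((trig, ev))
    return hits


# Hypothetical sample data.
t0 = datetime(2024, 1, 1, 12, 0)
sample = [
    Event(t0, "10.0.0.5", 50, is_trigger=True),
    Event(t0 + timedelta(minutes=2), "10.0.0.5", 30),   # low severity
    Event(t0 + timedelta(minutes=4), "10.0.0.5", 90),   # matches
    Event(t0 + timedelta(minutes=4), "10.0.0.9", 90),   # different destination
    Event(t0 + timedelta(minutes=15), "10.0.0.5", 90),  # outside the window
]
hits = find_followups(sample)
```

Of the four candidate follow-up events, only the high severity event four minutes after the trigger, against the same destination IP, produces a hit.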
QoS Multiple Configuration Change Failures
We will also create a second correlation rule designed to detect 5 failed configuration changes within 5 minutes that originate from a single IP address. The intent of this rule is to detect whether an attacker is attempting to change the configuration of one or more systems in the environment, and to alert on that activity.
Repeat steps 1–6 above with the file "QoS – Multiple Configuration Change Failures.xml."
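The threshold logic behind this second rule (5 failures within 5 minutes from one source IP) can be approximated with a sliding window. The sketch below is a plain Python illustration under that reading of the rule; `detect_bursts` and the sample addresses are hypothetical and are not part of the imported rule file.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

THRESHOLD = 5                  # failed changes required to trigger
WINDOW = timedelta(minutes=5)  # time span in which they must occur


def detect_bursts(failures):
    """Return the set of source IPs with at least THRESHOLD failed
    configuration changes inside any WINDOW-long span.

    `failures` is an iterable of (timestamp, source_ip) tuples.
    """
    recent = defaultdict(deque)  # source IP -> timestamps still in the window
    flagged = set()
    for ts, ip in sorted(failures):
        window = recent[ip]
        window.append(ts)
        # Discard timestamps that have aged out of the window.
        while ts - window[0] > WINDOW:
            window.popleft()
        if len(window) >= THRESHOLD:
            flagged.add(ip)
    return flagged


# Hypothetical sample data: one burst and one slow trickle of failures.
t0 = datetime(2024, 1, 1, 9, 0)
burst = [(t0 + timedelta(minutes=m), "192.0.2.10") for m in range(5)]
trickle = [(t0 + timedelta(minutes=2 * m), "192.0.2.20") for m in range(5)]
flagged = detect_bursts(burst + trickle)
```

The first address produces five failures in four minutes and is flagged; the second spreads five failures over eight minutes and is not.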
The QoS Dashboard provides a single dashboard to monitor daily operations for abnormal activity that could affect the quality of service in the environment.
The EPS dial components show when activity is significantly above baseline. When high levels of activity are identified, investigate the Event Distribution graph to drill down into the cause of the deviation. Typically, an EPS rate that exceeds the baseline by more than 15 to 30% could indicate an issue in the environment. In the example below, the gauge for Receiver 2 shows exceptionally high EPS rates compared to the baseline, and warrants some investigation:
When this occurs, one or more data sources are generating significantly more data than normal. This could be due to:
In any of these scenarios the likely source of the increased event activity can be identified by digging down into the Event Distribution pane. First select the "spike" in event activity by clicking on it:
Then drill down into the events using the menus on the Distribution pane:
Once the Events window appears, sort by the number of events by clicking on the Event Count column:
The event(s) with the highest count are the ones which are likely causing a disruption to service due to their volume. At this point the source of the events can be identified (server, network device, etc.) and remediation action taken.
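The triage described in this section amounts to two small calculations: how far the current EPS is above baseline, and which sources account for the most events. A minimal Python sketch follows, using the lower end of the 15 to 30% range as the warning threshold. The function names, device names, and figures are hypothetical sample values, not output from the SIEM.

```python
from collections import Counter

WARN_PCT = 15.0  # lower end of the 15 to 30% range mentioned above


def eps_deviation_pct(current_eps, baseline_eps):
    """Percentage by which the current EPS exceeds (or trails) the baseline."""
    return (current_eps - baseline_eps) * 100.0 / baseline_eps


def top_sources(rows, n=3):
    """Sum event counts per source and return the n largest, like sorting
    the Events window by the Event Count column.

    `rows` is an iterable of (source, event_count) tuples.
    """
    totals = Counter()
    for source, count in rows:
        totals[source] += count
    return totals.most_common(n)


# Hypothetical sample values.
deviation = eps_deviation_pct(current_eps=2900, baseline_eps=2000)
needs_review = deviation > WARN_PCT

rows = [("fw-01", 120), ("dc-02", 4500), ("fw-01", 300), ("web-03", 80)]
ranked = top_sources(rows, n=2)
```

With these sample figures the receiver is running 45% above baseline, so it would be flagged for review, and `dc-02` is the source to examine first.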
The QoS Dashboard also includes a DoS Events pane for detecting attacks that could cause a disruption in service.
The SIEM uses correlation logic to detect DoS/DDoS attack events. Poor network performance and/or excessive flow data can be symptoms of a DoS/DDoS attack. The DoS Events pane shows when a McAfee-published correlation rule has detected such activity. Should there be hits on DoS/DDoS rules, action should be considered to mitigate the attack.
In addition to the visual indication of the DoS correlation rule triggers, the view also lists the source IP addresses from which the DoS attacks originate in the DoS Source IPs pane.
McAfee's Security Connected Platform provides the SIEM with the ability to interact with controls such as network and endpoint security countermeasures, and to automate many types of responses. In this example, we'll demonstrate using McAfee NSM to take manual action and block the attack activity via a blacklist:
The relevant host (220.127.116.11 in the example) is now blacklisted without having left the SIEM console. This will provide protection against the DoS attack activity detected in ESM.
Many activities by users and administrators can cause a disruption of service. The second dashboard, QoS User Activity Dashboard, contains several panes.
The Account Lockout pane indicates which Active Directory accounts have been locked out during the selected time period. Summarize on the source user to show other events related to the lockout.
Account lockouts are likely to be caused by one of two scenarios:
In the first case, use the Summarize feature to summarize by the source user in question and determine what other event activity is associated with the account. If there is no indication of unusual activity, it may be a good idea to proactively contact the user and determine whether they have locked themselves out.
If there appears to be suspicious activity related to the account lockout then additional investigation is warranted.
Also included as part of the QoS User Activity Dashboard is the Configuration Change Summary pane for baselined configuration changes.
Errors in configuration changes can cause disruptions to service. Firewall rule changes can mistakenly prevent inbound or outbound access, aggressive IPS policies can prevent communication, and changes to devices such as routers can render networks unavailable. The Configuration Change Summary pane tracks configuration changes and can be used as a forensics tool when users report an issue. It can also be used to proactively watch configuration changes, compare them against the calculated baseline, and determine whether there is a larger than typical number of changes or whether there are changes in unusual or unexpected areas of the company.
The QoS User Activity Dashboard also includes the Most Active Privileged Accounts pane, which tracks the events associated with privileged users. Activity from these accounts should be closely tracked, because their actions, whether by mistake or through malicious intent, could cause major disruptions to service. In addition, the events are ranked by criticality level (calculated from values associated with the type of observed events) in the Average Privileged User Activity Severity pane:
These panes should be viewed on a regular basis and if there is either:
then further investigation should be undertaken to determine if there is malicious activity occurring.
To explore high levels of suspicious privileged account activity:
In this example there are a significant number of "failed password" events for the privileged "root" account that originate from the 18.104.22.168 IP address. Given that these events all come from one source, it is likely that there is either a configuration error (perhaps a script attempting to access a remote service) or malicious intent behind them.
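The pattern in this example, many failures for one privileged account from a single address, is easy to surface by counting events per account and source IP. The Python sketch below illustrates that grouping. The field names (`user`, `src_ip`, `signature`) and the sample records are assumptions made for the illustration and do not reflect the SIEM's actual export format.

```python
from collections import Counter


def count_by_user_and_source(events, signature="failed password"):
    """Count events with the given signature per (user, source IP) pair.

    `events` is an iterable of dicts with hypothetical keys
    'user', 'src_ip', and 'signature'.
    """
    return Counter(
        (e["user"], e["src_ip"]) for e in events if e["signature"] == signature
    )


# Hypothetical sample records.
events = [
    {"user": "root", "src_ip": "198.51.100.7", "signature": "failed password"},
    {"user": "root", "src_ip": "198.51.100.7", "signature": "failed password"},
    {"user": "root", "src_ip": "198.51.100.7", "signature": "failed password"},
    {"user": "root", "src_ip": "203.0.113.4", "signature": "session opened"},
    {"user": "admin", "src_ip": "203.0.113.4", "signature": "failed password"},
]

counts = count_by_user_and_source(events)
worst_pair, worst_count = counts.most_common(1)[0]
```

A single (account, source IP) pair that dominates the counts, as `root` from one address does here, points to either a misconfigured script or a targeted attack.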
This use case is fundamentally about performance management. As part of a broader initiative, the scenarios presented here can be integrated into a performance management framework that ties together network, application and infrastructure monitoring. Many of these tools could also be data sources for the SIEM and provide additional context to the data gathered from traditional security logs.
Consider incorporating flow data to help determine the amount of network activity when one of the scenarios (particularly the first – a flood of events causing a large EPS spike) occurs.
Additionally, when considering typical SOC activities, the DoS queries included in the QoS Dashboard would benefit from close monitoring by security personnel. These queries could easily be added to a standard SOC dashboard to enhance the view and provide timely notification of DoS-type events. This also ties in well with McAfee's Global Threat Intelligence service, which can help determine whether the source IPs are known bad actors.
Automation is a key capability of the McAfee SIEM. Many functions can be automated via alarm actions so that when specific events occur, action is taken without the need for human intervention. Consider the account lockout scenarios listed in this document – identifying those lockout activities could automatically cause an email to be sent to your helpdesk to ensure proactive service to users who have locked out their accounts.