There are a few different ways that a SIEM can assist with the detection of exfiltration.
Let's start with setting the scope: Exfiltration is a use case that relies on some combination of data flow behavior and geolocation. But more specifically, the thing we are trying to detect might be stated as:
"Detect that sensitive data crossed a gateway towards an undesirable destination".
Data will always be entering and leaving the network; for it to be exfiltration, it needs to be both data that we care about and going to an unauthorized destination. To support this use case, we need to assume that we've characterized some data as "sensitive" so that it can be tracked. It may be that the data resides on a particular server or is only accessible via a certain protocol, but something will uniquely define the data. Without some definition, all data leaving the network would have to be considered exfiltration, and if that is the posture to start from then the network shouldn't be connected to any other network that might pose a risk. The quantity of data per source/destination pair should also be considered, but quantity alone is generally not a reliable indicator of value.
The second piece is to define an "undesirable destination". This could be summarized as almost anything that "isn't us", with some exceptions. Since we are only tracking the data that is considered "sensitive", we can limit our scope for this use case to the devices that hold the data and the devices that access them. Literal geolocation isn't very helpful here, as any destination may be a proxy, but it's a data point to be considered. We'll revisit this part shortly.
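As a rough sketch of that scoping logic: a flow only becomes a candidate when it starts at a host holding sensitive data and ends somewhere that "isn't us" and isn't an approved exception. The addresses, network ranges and function names below are hypothetical placeholders, not anything taken from the SIEM itself.

```python
import ipaddress

# Hypothetical values for illustration only.
SENSITIVE_SOURCES = {"10.10.5.20", "10.10.5.21"}       # hosts that hold sensitive data
INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8")]   # "us"
APPROVED_EXTERNAL = {"203.0.113.10"}                   # the "some exceptions"


def is_undesirable(dst: str) -> bool:
    """True when the destination is neither internal nor an approved exception."""
    addr = ipaddress.ip_address(dst)
    if any(addr in net for net in INTERNAL_NETS):
        return False
    return dst not in APPROVED_EXTERNAL


def is_candidate_exfil(flow: dict) -> bool:
    """Both conditions must hold: sensitive source AND undesirable destination."""
    return flow["src"] in SENSITIVE_SOURCES and is_undesirable(flow["dst"])


if __name__ == "__main__":
    print(is_candidate_exfil({"src": "10.10.5.20", "dst": "198.51.100.7"}))
```

Traffic from a non-sensitive host, or to an approved partner, falls outside the use case by design, which keeps the volume of candidates manageable.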
We should also establish some additional assumptions. The SIEM is only as good as the data being sent to it. Importing asset, service, vulnerability and criticality information from ePO, Active Directory or any vulnerability scanner is going to create an asset database in the SIEM and provide context to calculate the severity of the events. This means that assets with critical data are going to be defined along with all of the services they run and any vulnerabilities they have. This is best practice to help prioritize your events.
Ultimately, any sort of direct regex or data pattern match on the wire requires a DPI device (IDS/IPS/PCAP/DBM/ADM) to see the traffic, and it only works if that traffic isn't encrypted. Flow logging should be enabled as needed to capture the byte counts regardless of whether a rule was triggered.
The path to exfiltration detection using a SIEM is often indirect. The SIEM is more likely to detect the behaviors, techniques or tools of the attacker than the actual exfiltration itself. Other indicators may be related attack behaviors like brute force attempts, network scans or phishing emails. Most of these types of events are created by security solutions like mail/web gateways and endpoint protection, and are correlated with logs from critical hosts. Access auditing can be collected from OS object logging for additional granularity if needed. If the scope is somewhat limited, it would be possible to automatically add all hosts that communicate with the Critical Assets to a watchlist and apply a different level of monitoring to that group.
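The watchlist idea can be sketched in a few lines: walk the flow records and collect every peer that talks to a critical asset. The `(src, dst)` tuple layout and the function name are assumptions for illustration; a real SIEM would populate the watchlist through its own rule or alarm actions.

```python
def build_watchlist(flows, critical_assets):
    """Collect every non-critical host that exchanged traffic with a critical asset.

    flows: iterable of (src, dst) address pairs (hypothetical record layout).
    critical_assets: set of addresses defined as Critical Assets.
    """
    watchlist = set()
    for src, dst in flows:
        if src in critical_assets and dst not in critical_assets:
            watchlist.add(dst)
        elif dst in critical_assets and src not in critical_assets:
            watchlist.add(src)
    return watchlist


if __name__ == "__main__":
    flows = [("10.1.1.5", "10.10.5.20"), ("10.10.5.20", "10.1.1.9"), ("10.1.1.7", "10.1.1.8")]
    print(sorted(build_watchlist(flows, {"10.10.5.20"})))
```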
Rules define the outer limits of the use case. One way to make the rules immediately more interesting is to introduce a threat feed. McAfee GTI includes threat reputation for millions of IP addresses. This can go a long way towards improving the definition of "undesirable destination". Every host that talks to any host on your network will be compared to known lists of suspicious and malicious IP addresses, and matches can be alerted on immediately and remediated automatically, especially for your defined critical assets. Additional threat feeds are always welcome as extra context. A threat feed match against a known malicious IP address communicating with your network is low hanging fruit.
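To illustrate the threat feed comparison, here is a minimal sketch that checks both ends of each flow against a list of bad addresses or CIDR ranges and raises the severity when the local side is a critical asset. The feed entries, field names and severity labels are made up for the example; this is not the GTI lookup itself.

```python
import ipaddress


def load_feed(entries):
    """Parse feed entries (single IPs or CIDR ranges) into network objects."""
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def match_feed(flows, feed, critical_assets):
    """Return (bad_peer, local_host, severity) for every flow touching the feed."""
    alerts = []
    for flow in flows:
        # Check each end of the flow as the potential bad peer.
        for peer, local in ((flow["src"], flow["dst"]), (flow["dst"], flow["src"])):
            addr = ipaddress.ip_address(peer)
            if any(addr in net for net in feed):
                severity = "high" if local in critical_assets else "medium"
                alerts.append((peer, local, severity))
    return alerts


if __name__ == "__main__":
    feed = load_feed(["198.51.100.0/24", "203.0.113.5"])
    flows = [{"src": "10.10.5.20", "dst": "198.51.100.7"}]
    print(match_feed(flows, feed, {"10.10.5.20"}))
```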
To dig deeper into the middle we can leverage statistical anomaly detection. The SIEM supports baseline calculation as a native aspect of the underlying database. It's possible to instantly calculate the baseline for every object in the log for any period of time. The baseline is built from the five previous matching intervals. For instance, the event count for every IP address for this Monday would automatically be compared to its activity on the previous 5 Mondays and a baseline would be calculated. I included a screenshot as an example. The tick shows the baseline and anything beyond the tick is red.
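The Monday example works out roughly like this. The sketch assumes the baseline is a plain average of the same weekday over the previous five weeks; the SIEM's actual calculation isn't spelled out here, so treat the averaging and the function name as assumptions.

```python
from datetime import date, timedelta
from statistics import mean


def weekday_baseline(daily_counts, day, weeks=5):
    """Average the counts seen on the same weekday over the previous `weeks` weeks.

    daily_counts: dict mapping date -> event count for one IP/user/host.
    Returns None when there is no history to compare against.
    """
    previous_days = [day - timedelta(weeks=k) for k in range(1, weeks + 1)]
    history = [daily_counts[d] for d in previous_days if d in daily_counts]
    return mean(history) if history else None


if __name__ == "__main__":
    monday = date(2024, 6, 10)
    counts = {monday - timedelta(weeks=k): c
              for k, c in zip(range(1, 6), (100, 120, 80, 110, 90))}
    print(weekday_baseline(counts, monday))
```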
A baseline can be calculated over any period of time for any unique value of a field, and this can be leveraged in two ways. Direct alerts can be used when any IP/user/host/anything deviates from its baseline by more than x percent. Beyond that, the correlation rules above can be enhanced with “deviation components” that make the anomaly baseline a condition of the rule. The scenario is better defined when the rule fires as “External destination” AND “X percent outside of baseline for bytes/packets” from a Critical Asset.
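Here is one way the combined rule could be expressed, with the deviation acting as one more condition alongside the others. The event fields (`src`, `dst_external`, `bytes`, `baseline_bytes`) and the threshold are hypothetical; in the product this would be built in the correlation rule editor rather than in code.

```python
def percent_deviation(observed, baseline):
    """Percentage by which `observed` exceeds (or falls below) `baseline`."""
    if baseline == 0:
        # No prior activity: any traffic at all is an unbounded deviation.
        return float("inf") if observed > 0 else 0.0
    return (observed - baseline) / baseline * 100.0


def rule_fires(event, critical_assets, threshold_pct):
    """Critical Asset AND external destination AND X percent above baseline."""
    return (
        event["src"] in critical_assets
        and event["dst_external"]
        and percent_deviation(event["bytes"], event["baseline_bytes"]) >= threshold_pct
    )


if __name__ == "__main__":
    event = {"src": "10.10.5.20", "dst_external": True,
             "bytes": 300, "baseline_bytes": 100}
    print(rule_fires(event, {"10.10.5.20"}, threshold_pct=150))
```

Requiring all three conditions is what keeps the rule quiet for a busy but ordinary day on a non-critical host.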
Anomaly detection is great for showing what is different in a vacuum, but we’re trying to detect a scenario that could span different attack vectors and involve some analytics. Time for risk analytics. The tool also has the capability to calculate risk for any type of data. Users and IP addresses are the most common things to track risk scores for. Each unique value is assigned a risk score based upon the type, severity and quantity of the events in relation to all of the asset, service and vulnerability data that was imported earlier. Exfiltration indicators from an asset deemed critical will be calculated into the score and increase the visibility of these types of events in the SIEM.
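To make the scoring idea concrete, here is a deliberately simple sketch: each event adds its severity to the entity's score, multiplied by a criticality weight from the asset data. The weighting scheme, field names and numbers are invented for illustration and are not the product's real formula.

```python
from collections import defaultdict


def risk_scores(events, asset_criticality):
    """Accumulate a per-entity score from event severity and asset criticality.

    events: iterable of dicts with "entity" (user or IP) and "severity" keys.
    asset_criticality: dict mapping entity -> weight; unknown entities get 1.0.
    """
    scores = defaultdict(float)
    for event in events:
        weight = asset_criticality.get(event["entity"], 1.0)
        scores[event["entity"]] += event["severity"] * weight
    return dict(scores)


if __name__ == "__main__":
    events = [
        {"entity": "10.10.5.20", "severity": 50},
        {"entity": "10.10.5.20", "severity": 30},
        {"entity": "10.1.1.9", "severity": 50},
    ]
    print(risk_scores(events, {"10.10.5.20": 2.0}))
```

The same events score higher on a critical asset than on an ordinary workstation, which is the effect described above: more events and more severe events on more important assets float to the top.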
The exfiltration content pack description mentions most of these things also. It also includes some additional rules that cover some of the ancillary behaviors I mentioned like IM file transfers and abnormal communication from a high value host. You might need to fill in some gaps if there are other methods or protocols that could be used for exfiltration in your environment.
If a greater level of control is required then it is time to consider a proper DLP solution and/or database monitor, which would also feed the SIEM and ultimately be the best tools for the job.
This is great, Andy, I really appreciate the dissertation, as it goes a long way in addressing the variables that make up possible approaches to looking for exfiltration.
My environment does have a DBM that is underutilized, and this is a good way to bring some visibility to the information it's collecting.
Again, thanks a bunch!