Has anyone created a baseline for the performance of their SIEM that goes beyond EPS and event distribution? I'm looking for metrics like:
Any feedback or other metric suggestions would be greatly appreciated.
This assumes you have an enterprise management tool capable of reading SNMP data. If you take a look at the MIB, there's a lot of good data in it that can be added to tools that support it: things like incoming event rate, flow rate, CPU load, and HDD performance. Tracking this data over time can give you a good indicator of when things start slowing down. The ESM is the only thing you have to query for this; it already collects this data from all the other devices for you.
So a MIB is a definition of what data is available via SNMP. You can download the ESM's MIB under ESM Settings -> SNMP Configuration -> View MIB (near the bottom). Most network teams already have some sort of SNMP tool for monitoring things like switches and routers, so it is fairly likely your enterprise has one available.
Exporting this data to a tool that can graph it over time is extremely valuable for seeing what happens as data sources are onboarded and as the ESM's database fills up to retention.
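As a rough illustration of what a baseline from that exported data could look like, here is a minimal Python sketch. It assumes you already have the samples as plain numbers (for example, incoming event rate per poll); the sample values, function names, and the 3-standard-deviation threshold are all made up for the example and are not anything the ESM provides.

```python
from statistics import mean, stdev

def baseline(samples):
    """Return (mean, standard deviation) of a list of numeric samples."""
    return mean(samples), stdev(samples)

def flag_deviations(history, recent, threshold=3.0):
    """Return the recent samples that sit more than `threshold`
    standard deviations away from the historical mean."""
    mu, sigma = baseline(history)
    if sigma == 0:
        # Perfectly flat history: any different value is a deviation.
        return [x for x in recent if x != mu]
    return [x for x in recent if abs(x - mu) / sigma > threshold]

# Hypothetical incoming event rates (events/sec), one sample per poll.
history = [1000, 1020, 980, 1010, 990, 1005, 995]
recent = [1003, 1500, 998]

print(flag_deviations(history, recent))  # prints [1500]
```

The same idea works for CPU load or disk metrics; most SNMP graphing tools can do this kind of thresholding for you, so treat this only as a way to reason about what "normal" looks like.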
It is very difficult to tell how many more data sources you can add. There are a lot of factors, such as how many people use the GUI on a daily basis, what kind of hardware the SIEM has at its disposal, the size and complexity of parsers, correlation rules, etc.
You can get an idea of the event rate per second by data source by creating a "Bar Chart" with the built-in query "Collection Rate Per Minute." There is also a "Last Event Time" report in the ESM settings pane: ESM Settings -> System Information (default screen) -> View Reports -> Event Time.
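If you export the per-minute numbers, turning them into an average events-per-second figure per data source is just a division by 60. A small sketch, assuming the counts have been exported into a dictionary keyed by data source name (the names and values below are invented for the example):

```python
def eps_by_source(per_minute_counts):
    """Average events per second for each data source, given a dict of
    {source_name: [events counted in each one-minute interval]}."""
    return {
        source: sum(counts) / (60 * len(counts))
        for source, counts in per_minute_counts.items()
        if counts  # skip sources with no samples
    }

# Hypothetical per-minute collection counts for two data sources.
counts = {"firewall": [6000, 7200, 4800], "proxy": [600, 600]}

print(eps_by_source(counts))  # prints {'firewall': 100.0, 'proxy': 10.0}
```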
Hope that helps.