Events per device

Hi all,

I would like to create a dashboard on which I can see the total event count per device. Ideally, I would also like to have a baseline and the deviation from this baseline.

The goal is to spot abnormal behaviour: servers that generate too many or too few events, based on their respective baselines.

I created a new dashboard with a table. I created a new query with which I select the fields "Device ID", "Device Name" and "Event count".

  • In my table, some devices appear on more than one line. I would like to group the results by device ID, but I can't find how.
  • I have no idea how to add the baseline and the deviation, if that is even possible.

Maybe my approach is not a good one? I would appreciate any help or clue.

Thank you in advance.

regards,
Guillaume.

ESM version : 9.4.2
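
For anyone who ends up exporting the table and post-processing it outside the ESM, the grouping asked about above can be sketched in a few lines of Python. This is only an illustration, not an ESM feature: the row layout (device ID, device name, event count), the function name `total_events_per_device` and the sample values are all assumptions.

```python
from collections import defaultdict


def total_events_per_device(rows):
    """Collapse several rows per device into one total per device ID.

    `rows` is an iterable of (device_id, device_name, event_count) tuples,
    a hypothetical layout for an exported query result.
    """
    totals = defaultdict(int)
    names = {}
    for device_id, device_name, event_count in rows:
        totals[device_id] += event_count
        names[device_id] = device_name
    return {dev: (names[dev], count) for dev, count in totals.items()}


# Made-up sample data: device "101" appears on two lines.
rows = [
    ("101", "fw-01", 120),
    ("102", "srv-02", 40),
    ("101", "fw-01", 30),
]

per_device = total_events_per_device(rows)
for device_id, (name, count) in sorted(per_device.items()):
    print(device_id, name, count)
```

Each device then appears once, with the sum of its event counts.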

9 Replies
aszotek
Level 10
Message 2 of 10

Re: Events per device

There are a few ways of doing this. Solutions that work include:

- Bar charts - Collection rate [by Device] per second

- Distribution charts

My preference is to use Distribution charts per device type, as they show the number of events over time, so I can quickly see deviations or interruptions (the latter being rather frequent on McAfee SIEM despite the HA setup).

The downside is the limited screen space if you want a chart for every single device.

Re: Events per device

Hi,

Thanks for the answer. I didn't see it while typing my text.

I will try those different approaches and keep you posted.

Re: Events per device

So,

I now have a nice graph with the collection rate per second for each of my data sources, ordered by avg_rate, and a distribution chart bound to it. Much easier than my approach ^^


The problem is that even if I show baseline averages and margins, the chart shows all the data sources, regardless of their deviation.

What I would like is to see only the sources that are below or above those margins, and/or to order my sources by those deviations.

I don't even know if this is feasible ^^

aszotek
Level 10
Message 5 of 10

Re: Events per device

A baseline per data source is perfectly doable. I'm not sure which graph you are referring to; can you post a screenshot?

Bar charts can show multiple deviations. If your distribution chart shows all of them, you can bind it to the "collection rate per second" bar chart, and the distribution chart will change when you select each data source device. This is a blind guess, based on my graphs.

Re: Events per device

Sorry for not being clearer, I will do my best ^^


Right now, I have something like this :

deviation_rule.jpg

The problem with this solution is that it requires manual interaction to see which data source is behaving abnormally. With more than 100 sources, it is not usable...

This is why I would like to order my data sources not by average rate but by each source's deviation from its own baseline. The problem is that the baseline can be shown on the graph but cannot be used in the queries.
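
To make the idea concrete, here is a rough sketch of that ordering done outside the ESM, for example on exported rates. It is not something the ESM query editor offers as far as this thread shows. The baseline is assumed to be a plain mean of past values, and the names `relative_deviation` and `rank_by_deviation` and the sample numbers are hypothetical.

```python
from statistics import mean


def relative_deviation(history, current):
    """Return (current - baseline) / baseline, with baseline = mean(history)."""
    baseline = mean(history)
    if baseline == 0:
        # No meaningful ratio against an empty baseline.
        return float("inf") if current > 0 else 0.0
    return (current - baseline) / baseline


def rank_by_deviation(sources):
    """Sort data sources by how far they are from their own baseline.

    `sources` maps a source name to (history, current), where `history`
    is a list of past rates and `current` is the latest rate.
    """
    scored = [
        (name, relative_deviation(history, current))
        for name, (history, current) in sources.items()
    ]
    # Largest absolute deviation first, whether above or below baseline.
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Made-up sample data.
sources = {
    "fw-01": ([100, 100, 100], 100),
    "srv-02": ([200, 200, 200], 20),
    "sw-03": ([50, 50, 50], 75),
}

ranking = rank_by_deviation(sources)
for name, deviation in ranking:
    print(f"{name}: {deviation:+.0%}")
```

Sorting on the absolute value puts both "too quiet" and "too noisy" sources at the top, so the sources worth looking at come first without clicking through each one.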

rth67
Level 12
Message 7 of 10

Re: Events per device

Have you thought about setting up Alarms for Devices that have a Deviation from the Baseline? Again, if you want one for each Data Source, you will need to create 100 separate Alarms.

This isn't feasible in our environment, as we have about 2,000 Data Sources on one ESM, and about 1,000 on our other ESM.  We have Deviation from Baseline Alarms for the Devices themselves (Receivers, ACE, APM), as well as for particular groupings like Firewalls, VMware Hosts, or simply for Unknown Events.

mcgarl1
Level 9
Message 8 of 10

Re: Events per device

RTH,

What did you use for a baseline for your Alarms? A day, a week, a month?

rth67
Level 12
Message 9 of 10

Re: Events per device

It varies; you have to tweak the alarm to fit your environment.

Some examples are:

General Deviation from Baseline for a Device:

     Query - Total Events; Time Frame - Last 8 Hours; Trigger when - 90% below baseline; Check Rate - 1 Hour

Others include settings such as:

     Query - Total Events; Time Frame - Last 1 Week; Trigger when - 50% above, 50% below; Check Rate - 12 Hours

     Query - Total Events; Time Frame - Last 2 Hours; Trigger when - 25% above; Check Rate - 1 Hour (Unknown Events from Unix/Linux/AIX systems)
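
As a rough illustration of what these "Trigger when" settings amount to, the percentage check can be sketched as below. How the ESM actually computes its baseline and evaluates the trigger is not described in this thread, so the formula, the function name `breaches_baseline` and its parameters are assumptions.

```python
def breaches_baseline(current, baseline, pct_above=None, pct_below=None):
    """Return True when `current` is outside the allowed band around `baseline`.

    `pct_above` and `pct_below` are percentages of the baseline, for example
    pct_below=90 means "trigger when 90% or more below baseline".
    A limit left as None is not checked.
    """
    if baseline <= 0:
        # Assumed behaviour: without a usable baseline, do not trigger.
        return False
    change_pct = (current - baseline) / baseline * 100
    if pct_above is not None and change_pct >= pct_above:
        return True
    if pct_below is not None and change_pct <= -pct_below:
        return True
    return False


# Made-up numbers matching the three example settings above.
print(breaches_baseline(5, 100, pct_below=90))                  # 95% below
print(breaches_baseline(140, 100, pct_above=50, pct_below=50))  # 40% above
print(breaches_baseline(130, 100, pct_above=25))                # 30% above
```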

The increase in Unknown Events for certain types of systems is usually triggered by one of several causes: either someone enabled Debug mode (typically on a switch or router), or, in the case of some VMware hosts, the local storage for its logs filled up, so it then logs even more events saying it is out of space, over and over.

We are currently dealing with a receiver (small older orange Nitro box) which has been getting "lowmem_reserve" messages followed up by "IPSDBServer[1541]: Error: Could not send event(s) to correlator through socket - Unable to obtain lock(4)" - when this occurs the receiver replies to a ping, but no longer processes events, does not allow ssh connections, and worst of all, does not accept incoming syslog messages. I just had to hard boot it again this morning due to this issue, after following up on the Alarm email, and viewing my dashboard and seeing that we had no events in the past 30 minutes from that receiver.


My default dashboard contains a small Distribution view for each "Device" we have, 9 Receivers, ACE, APM, ePO, plus an "EPS" gauge for Total Events per Second of the system, so I can quickly see when an issue may be taking place.


We also have "Device Failure" alarms for each device, which check every 10 minutes. There are occasional false positives, but we usually see these before the Deviation from Baseline alarm, as we check more frequently.

Re: Events per device

I created a Correlation rule with the following criteria :

deviation_rule.jpg

I also grouped by "Device ID". So my rule is as follows:

deviation_rule2.jpg

I then created a dashboard with a bar chart. I am not sure which event query I should select... I ran a few tests with my rule's signature ID as a filter, but my chart stays empty.

And I really doubt that none of our servers are in the red... Any idea?

Bonus question: the threshold cannot be < 1. Doesn't it use a normal distribution?

Thanks in advance,
GE