We have now had our McAfee SIEM (V9.4 HF3) for 10 months. For the last couple of months we have been struggling to keep the system afloat. There is a list of issues, but the primary one seems to be loss of data after a database rebuild. The symptoms are normally the same: the system freezes, or logs/events stop appearing in the ESM GUI. We then try to restart the services with cpservices stop and start, but most of the time the services won't stop: cpservices stop hangs forever and never returns to the prompt. So we have to force a reboot. When I say force a reboot, often the reboot command won't reboot the server the first time and you have to enter it twice, which is a little strange.
The outcome after a reboot, however, is a database rebuild. To make things clear, we are currently forced to reboot the ESM once or twice a week (over the last couple of months), but a rebuild can also start automatically between reboots. My assumption is that the rebuild fixes errors in the DB, but more often than not, when it occurs, millions of logs and events are wiped from the GUI.
As a current example, this week I have been monitoring the system. Logs and events were coming in as expected Monday to Friday. I checked the DB with the command
DBCheck -d '/usr/local/ess/data/ngcp.dfl|127.0.0.1|1111' -c all was fine, no errors reported
Then on Friday evening logs stopped coming in, so I tried to restart the services. But the cpservices stop command just hung for 3 hours, so I had to reboot. The ESM went into rebuild database mode, and when it finished I had lost all events from Monday and Tuesday of that week. But they were there before!
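One way to avoid sitting on a hung stop for hours is to wrap the command in a timeout and decide what to do when it expires. This is only a minimal sketch: it assumes Python 3 is available wherever you run it (not confirmed for the ESM appliance), and run_with_timeout and the 600-second limit are illustrative choices, not anything from McAfee.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments) and give up after timeout_s seconds.

    Returns (finished, returncode). finished is False when the command
    exceeded the timeout; returncode is then None.
    """
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the process it started before raising this.
        # Any processes that command spawned itself may still be running.
        return False, None
    return True, result.returncode


if __name__ == "__main__":
    # "cpservices stop" is the command from the post; 600 s is an assumed limit.
    finished, code = run_with_timeout(["cpservices", "stop"], 600)
    if not finished:
        print("cpservices stop did not return within 10 minutes")
        sys.exit(1)
    print("cpservices stop exited with code", code)
```

This only tells you quickly that the stop has hung, so you can capture logs or call support before rebooting. It does not fix whatever is causing the hang.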
I have been working with McAfee support regarding issues around partitions, but this problem isn't going away. To add to this, I think the issue is also causing file corruption, as other problems are starting to develop, such as an error when trying to enable syslog auto learn, and some other rules-based errors.
Has anybody had similar issues, and how were they resolved?