Hi, hoping to get some help with an issue we are currently experiencing.
We are in the process of migrating to Exchange 2013; the setup is below.
Servers: 2 x DL380p Gen8
CPU: 2 x E5-2650 per server
Mem: 128GB per server
Disk: 12 local disks, 2 mirrored for OS and logs, 4 in RAID 10 x 2 for DBs, per server
There is currently a single DAG configured with 2 active and 2 passive DBs.
DB1's active copy is on Server1 with a passive copy on Server2; DB2's active copy is on Server2 with a passive copy on Server1.
Every morning at 4am, when the McAfee DAT update runs, one of the DBs fails over to the other server. At the same time, the following event ID 906 is generated:
Information Store - DB1 A significant portion of the database buffer cache has been written out to the system paging file. This may result in severe performance degradation.
See help link for complete details of possible causes.
Previous cache residency state: 100% (3959 out of 3959 buffers) (109 seconds ago)
Current cache residency state: 0% (3 out of 5246 buffers)
Current cache size vs. target: 99% (1630.555 / 1637.938 MBs)
Physical Memory / RAM size: 131037.270 MBs
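For anyone comparing several of these events, the figures can be pulled out of the message text with a few lines of Python. This is only a rough sketch: the function name `parse_event_906` is made up for illustration, and the regular expressions assume the message layout is exactly as quoted above, which may differ on other builds.

```python
import re

# Event 906 body as quoted in this thread (layout assumed to be stable).
EVENT_TEXT = """\
Previous cache residency state: 100% (3959 out of 3959 buffers) (109 seconds ago)
Current cache residency state: 0% (3 out of 5246 buffers)
Current cache size vs. target: 99% (1630.555 / 1637.938 MBs)
Physical Memory / RAM size: 131037.270 MBs
"""


def parse_event_906(text):
    """Hypothetical helper: extract residency and size figures from the message."""
    result = {}
    # Residency lines: "<Previous|Current> ... (N out of M buffers)"
    for label, resident, total in re.findall(
        r"(Previous|Current) cache residency state: \d+% "
        r"\((\d+) out of (\d+) buffers\)",
        text,
    ):
        result[label.lower() + "_residency"] = int(resident) / int(total)
    size = re.search(
        r"Current cache size vs\. target: \d+% \(([\d.]+) / ([\d.]+) MBs\)", text
    )
    ram = re.search(r"Physical Memory / RAM size: ([\d.]+) MBs", text)
    result["cache_mb"] = float(size.group(1))
    result["target_mb"] = float(size.group(2))
    result["ram_mb"] = float(ram.group(1))
    # How much of physical RAM the database cache actually occupies.
    result["cache_share_of_ram"] = result["cache_mb"] / result["ram_mb"]
    return result


info = parse_event_906(EVENT_TEXT)
for key, value in info.items():
    print(f"{key}: {value:.4f}")
```

On the event above, the cache works out to roughly 1.2% of physical RAM, while residency drops from all 3959 buffers to 3 out of 5246 in under two minutes.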
This does not look like a memory shortage: each server has 128GB of RAM and there is currently no load on the servers.
The latest firmware, drivers and updates have been applied.
Backups run at 18:00 every day and normally finish before 19:00.
We have excluded all the required files and folders as specified in the McAfee documentation. We have also added the required Exchange processes to the Low-Risk policy and excluded them from scanning, as per the documentation.
With all the above done, the issue keeps happening.
If we disable McAfee Access Protection, the On-Access Scanner and the Framework service, the problem goes away, so this appears to be an issue with McAfee.
Have you enabled the Scan "Processes On Enable" setting in the OAS configuration?
This feature scans memory. It is invoked after a DAT update.
That forces Windows to do some memory management magic, which will include paging out process memory.
The setting was enabled, so I turned it off. One of the servers was fine this morning; the other still had the issue and had its DB fail over.
I have updated both servers to Patch 6 and will reboot them tonight, then check again in the morning.
Thanks for the help.
*** update ***
It seems the server that worked did not run its update at 4am for some reason.
When I kicked off the DAT update the event ID 906 occurred, so it seems that disabling the option made no difference.
That particular setting requires a reboot to be cleared from memory, unfortunately. Once it is on, it cannot simply be turned off. Disabling it in policy is still necessary, of course, but it takes a reboot to make the change stick.
We're getting exactly the same DB performance degradation errors. The OAS "Processes on enable" setting has been enabled since day 1.
Has anyone got any further with this?