What is your ePO database size now? How many Access Protection events do you have? Have you had any DAT reputation events lately?
90 GB reserved, 23 GB used.
During the last month: 3,600,000 Access Protection events.
I do have DAT reputation events, but the problem appeared several months before we checked in the DAT reputation feature.
I think McAfee has solved the issue with the DAT reputation events; we do not see them anymore. But regarding the large number of events, do you run the server task (Purge Threat and Client Events Older than 90 Days)? With a database this large, how is ePO performing? Is it slow?
What is your Agent-to-server communication interval set to?
Which system is hitting 10k IOPS? The ePO app server or the database server?
Is the database server physical or virtual?
I'd be more inclined to believe that your storage team needs to do some additional investigating and help determine why 10k IOPS are available to you. If you are using many Intel Security products, there is definitely the potential for needing all those IOPS, but as others have suggested, you can tune some of that load by managing your ASCI, server tasks that may be executing complex queries, etc.
Yes, we run several event purge tasks. Performance of the ePO server is OK, and the console is OK.
40 minutes for servers (1000 endpoints)
45 minutes for workstations (9000 endpoints)
It is the database server. It's in a SQL Server 2008 R2 cluster, and the SQL servers are virtual. We use many Intel Security products (Agent, VSE, HIPS, SiteAdvisor, MSME, Drive Encryption...).
Server tasks have been reviewed. We have several tasks that sort some systems and run every hour, but the high IOPS consumption does not happen every day at the same time... it's a very rare case...
Do you think the ASCI is too frequent? Maybe I could change it to 60 seconds, but I don't understand why it occurs only on some days...
You're probably experiencing check-in storms. The agents use a random check-in time based on the ASCI, so if x clients are online and y of those clients all happen to check in at close to the same time, you get a perfect storm. You should research increasing the interval. I say research because one of my admins complains that large check-in values can interfere with Encryption (I never checked whether that is a valid statement). We have a similar number of endpoints, and McAfee recommended a 2-3 hour check-in interval.
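To get a feel for how randomized check-ins can still bunch up, here is a rough Python sketch. It is only a toy model: it assumes each client picks one uniformly random second within the interval, which is my simplification and not a description of how the McAfee Agent actually schedules itself. The function names and the `bucket_seconds` parameter are made up for illustration.

```python
import random
from collections import Counter


def simulate_checkins(clients, interval_minutes, bucket_seconds=1, seed=0):
    """Count check-ins per time bucket over one interval.

    Assumption (not taken from McAfee documentation): every client checks in
    exactly once per interval, at a uniformly random second.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    interval_seconds = interval_minutes * 60
    buckets = Counter()
    for _ in range(clients):
        offset = rng.randrange(interval_seconds)  # second within the interval
        buckets[offset // bucket_seconds] += 1
    return buckets


def peak_and_mean(clients, interval_minutes, bucket_seconds=1, seed=0):
    """Return (busiest bucket count, average count per bucket)."""
    buckets = simulate_checkins(clients, interval_minutes, bucket_seconds, seed)
    peak = max(buckets.values())
    mean = clients * bucket_seconds / (interval_minutes * 60)
    return peak, mean


if __name__ == "__main__":
    # Numbers from this thread: 9000 workstations on a 45 minute interval,
    # compared with the same clients on a 2 hour interval.
    for minutes in (45, 120):
        peak, mean = peak_and_mean(9000, minutes, bucket_seconds=1)
        print(f"{minutes} min: busiest second = {peak}, average per second = {mean:.2f}")
```

The busiest second is always at or above the average, and a longer interval lowers both. It says nothing about how much database work a single check-in causes, so treat it as an illustration of the clustering effect only.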
I've also seen what happens to the database when someone sets 4000 clients to check in every 15 minutes. It was not pretty.
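For a back-of-the-envelope comparison of the settings mentioned in this thread, the average check-in rate is just endpoints divided by the interval. A minimal Python sketch follows; the helper name is hypothetical, and it assumes every endpoint is online and checks in exactly once per interval.

```python
def average_checkins_per_minute(groups):
    """groups: iterable of (endpoint_count, asci_minutes) pairs.

    Assumes every endpoint is online and checks in once per interval.
    """
    return sum(count / minutes for count, minutes in groups)


# Settings quoted in this thread
current = average_checkins_per_minute([(1000, 40), (9000, 45)])  # 25 + 200
two_hours = average_checkins_per_minute([(10000, 120)])          # all endpoints at 2 h
bad_case = average_checkins_per_minute([(4000, 15)])             # 4000 clients at 15 min

print(f"current settings:       {current:.1f} check-ins/min")
print(f"everything at 2 hours:  {two_hours:.1f} check-ins/min")
print(f"4000 clients at 15 min: {bad_case:.1f} check-ins/min")
```

With these numbers the current settings average 225 check-ins per minute, a 2 hour interval would average about 83, and 4000 clients at 15 minutes would average about 267. These are averages only; short bursts can be well above them.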