Stop the McAfee Event Parser service, cut ONLY the XML and .PKG content out of the ePO install directory, and paste it into an alternate location for safekeeping. Then restart the Event Parser service and retry the upgrade/installation.
The same problem exists in 5.3.2 build 400 if the Event Parser service is failing due to massive numbers of agent-generated PKG files sitting in that folder. Linux servers are a culprit of mine: the ones with multiple AV installations (ClamAV and McAfee) end up fighting over file locks and generate masses of agent events for ePO to parse.
Once you have finished the upgrade/install, you can cut/paste the content back into the Event Parser folder you cleared earlier. It isn't advisable to do this in bulk; move maybe 5 PKG files at a time. The Event Parser service automatically extracts the PKG files to disk and drops out the .xml content for the service to parse. If you drop too many at once and don't have sufficient CPU and disk resources to cope with them all, it will crash the service. They need to add a queuing mechanism to the service, but I doubt they will ever fix it.
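If you'd rather not drag files back by hand five at a time, the batching can be scripted. This is a minimal sketch; the two paths are assumptions (adjust them to wherever your ePO install and your safekeeping copy actually live), and the pause length is just a starting point to let the parser drain each batch:

```python
import shutil
import time
from pathlib import Path

# Hypothetical paths -- substitute your own backup and ePO Events locations.
BACKUP_DIR = Path(r"D:\epo-events-backup")
EVENTS_DIR = Path(r"C:\Program Files\McAfee\ePO\DB\Events")

def restore_in_batches(src: Path, dst: Path, batch_size: int = 5,
                       pause_seconds: float = 30.0) -> int:
    """Move .pkg files back a few at a time so the Event Parser isn't
    flooded; returns the total number of files moved."""
    pkg_files = sorted(src.glob("*.pkg"))
    moved = 0
    for i in range(0, len(pkg_files), batch_size):
        for pkg in pkg_files[i:i + batch_size]:
            shutil.move(str(pkg), str(dst / pkg.name))
            moved += 1
        if i + batch_size < len(pkg_files):
            time.sleep(pause_seconds)  # give the parser time to drain the batch
    return moved

if __name__ == "__main__":
    print(restore_in_batches(BACKUP_DIR, EVENTS_DIR))
```

Watch the Events folder between batches; if the file count isn't shrinking, lengthen the pause rather than the batch size.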
Thanks for the confirmation. I've managed to get past that initial hang now; however, after however long it takes, I then get hit with the "install wizard was interrupted" error at the end, and I can't get past that.
Could you let me know where the various error logs are, so I can check them and try to work out what's happening?
You will want to look in the following folder for the installer troubleshooting logs:
This will contain the install, debug and MSI logs.
Just out of curiosity, what problem are you having? Is it agent events locking up the Event Parser service?
There are two issues I'm having:
One is that I'm trying to upgrade from 5.1 to 5.3 and the upgrade constantly fails. The second is that over the last month or two the ePO server has been dying a severe death; we recently found 65,000+ XML files waiting in the Events folder and a further 100,000+ in the debug folder. Insane RAM use by SQL then became a bit of an issue, but more critical than that was that SQL Server was pretty much chewing up the disk I/O, mainly on reads.
In fact, I've just turned the Event Parser back on, and in a matter of 5 minutes a once-empty Events folder now has over 20,000 XML files in it. Watching Resource Monitor, SQL Server hit over 19,000,000 B/sec before I had to pull the plug and stop the parser service.
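Rather than eyeballing Explorer, you can put numbers on that kind of growth by sampling the folder's file count on an interval. A quick sketch; the folder path is whatever your own install uses, and the sample interval is arbitrary:

```python
import time
from pathlib import Path

def sample_growth(folder: Path, samples: int = 5,
                  interval_seconds: float = 60.0) -> list:
    """Record how many .xml files are sitting in `folder` at each sample
    point. A steadily climbing count means the parser (or the SQL Server
    behind it) cannot keep up with what the agents are sending."""
    counts = []
    for i in range(samples):
        counts.append(sum(1 for _ in folder.glob("*.xml")))
        if i < samples - 1:
            time.sleep(interval_seconds)
    return counts

if __name__ == "__main__":
    # Hypothetical path -- point this at your own ePO Events folder.
    print(sample_growth(Path(r"C:\Program Files\McAfee\ePO\DB\Events")))
```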
I've got another thread going here: Failed upgrade installation of either ePO 5.3.1 or 5.3.2, for the installation failure. I'm probably going to start another thread for the major overuse of resources once the Event Parser is on.
Well, I can tell you now that the Event Parser issue is not yet fixed. It falls on its face due to a lack of queue limiting; it simply has too much work to do for the resources that are likely to be assigned to a McAfee ePO server.
Hard disk I/O is an issue for the Event Parser in large implementations; even with log filtering turned down to a minimum, some servers will spam threat events in certain scenarios. I have dealt with the issue for a while, and here's what I used to mitigate it.
This folder receives the XML, TXML and PKG files from the agents. PKG files are compressed ~10 MB packages encompassing an enormous number of XML files generated by the agent; from what I can tell, when the agent needs to send data in bulk to ePO, it packages it up as PKG files. When a PKG is dropped in the Events folder, the Event Parser spools through the files, churning through them one by one with seemingly no hard limit on the number it tries to transact at any one time. In a small implementation this is unlikely to be an issue, but in large ones it's a problem.
The problem is disk I/O and the CPU time spent waiting for the events to extract (if in a PKG), process and clear; the disk I/O requirements are quite high in these scenarios. To get around this you can try faster storage such as SSDs, SAS disks, or hardware HBA/RAID controllers that support write-back caching. However, even with all of these, I was still seeing I/O performance problems due to the sheer volume of agent events being handled.
I instead looked into RAM-disking the hard-coded *DB\Events folder, with no real luck; I was going to use a RAM disk, as some freeware options are available. Instead I was forced to use a product named PrimoCache Server to place an active memory cache in the Windows OS's file system filter driver.
What this essentially does, when configured correctly, is intercept file system I/O and serve read and/or write requests from memory for a particular volume or partition. I have mine configured to perform read and write caching with a 3-second delay on writes from memory to disk. This way the impact on disk I/O is staged and doesn't overwhelm the HBA controller, and the events can essentially be parsed in memory, reducing the wait on disk I/O and significantly cutting the CPU time spent.
The flow would be:
Disk I/O request > Primo NTFS intercept filter > Write to memory > Confirm the write to the operating system > Store in memory cache > Defer write to physical disk after 'x' seconds.
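PrimoCache's internals aren't public as far as I know, but the deferred-write idea in the flow above is simple enough to sketch. This toy (not how PrimoCache is actually implemented) acknowledges writes from memory immediately and only pushes them to the backing store once the configured delay has elapsed:

```python
import time

class DeferredWriteCache:
    """Toy write-back cache: writes are acknowledged from memory and
    flushed to the backing store only after `delay_seconds` have passed
    (or when a flush is forced). Illustrative sketch only."""

    def __init__(self, backing_store: dict, delay_seconds: float = 3.0):
        self.backing = backing_store   # stands in for the physical disk
        self.delay = delay_seconds
        self.pending = {}              # path -> (data, time written)

    def write(self, path: str, data: bytes) -> None:
        # Acknowledge immediately; nothing touches "disk" yet.
        self.pending[path] = (data, time.monotonic())

    def read(self, path: str) -> bytes:
        # Reads hit the memory cache first, then fall back to disk.
        if path in self.pending:
            return self.pending[path][0]
        return self.backing[path]

    def flush(self, force: bool = False) -> int:
        """Push writes older than the delay down to the backing store;
        returns how many entries were flushed."""
        now = time.monotonic()
        due = [p for p, (_, t) in self.pending.items()
               if force or now - t >= self.delay]
        for p in due:
            self.backing[p] = self.pending.pop(p)[0]
        return len(due)
```

The trade-off is the same one called out below: anything still in `pending` when the power drops is gone, which is acceptable only if losing a few events doesn't matter to you.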
Here is the config I use:
Bear in mind there are risks to using such software: it affects the whole partition, as it cannot be scoped to a single folder. Use the write delay time wisely, and remember that if the server has a hard failure, power drop, etc., then anything still in memory will be lost. If you couldn't care less about losing a few events in an outage, then it's not really a big deal. Use the software at your own risk; it isn't free, but it does give you a lengthy 60-day trial to see if it's for you. It requires a reboot to install and to remove, so be aware of that.
Hope this helps. Also, I have no affiliation with PrimoCache; I simply use it because it works for certain use cases.