It all depends on your virtual environment: how fast the storage is, and so on.
In our experience, the virtualization team has trouble understanding that this system needs quality, dedicated tier 1 storage, so when you migrate, performance drops through the floor.
As long as you are happy to fight that battle and win, there's no reason why a virtual machine won't be as good as a physical one.
My company has had some issues with our implementation in a virtual environment. Currently we host around 5000 machines and 5000 users (so 10000 total database objects). Our issues are as follows:
VM is running with 4 CPU and 4GB of RAM. VM is clustered on shared NAS.
1. VERY slow synch times
2. CPU overutilization (avg peak 92-98% consistently) that has caused the sbdbserver.exe process to crash.
For months we have tweaked settings and db config and gained very little performance. Our infrastructure did not allow us dedicated tier 1 storage on our NAS, so we have had to cope with our current setup.
Yesterday, after another crash of the db process, we called McAfee support and after some time searching through their documentation, we were provided with a newly updated best practices guide (June 2009). The documentation lists hardware specs depending on your need and, above 5000 users/PCs, recommends NOT using a virtual server.
Here is the link: https://kc.mcafee.com/corporate/index?page=content&id=PD21801
Edit: Also see McAfee KB65747 https://kc.mcafee.com/corporate/index?page=content&id=KB65747&actp=search&searchid=1251831552711
NAS is very different from SAN: it probably delivers around 1/20 of the theoretical performance of DAS, and 1/50 of SAN.
If you moved to DAS you'd at least get a significant performance increase.
Indeed, that is what is recommended with the new documentation.
Edit: Just thought I should add that this does not address the known issue with VM listed in the KB above.
I know that we have one customer who completed a 20,000 node deployment on a virtual server. The project was a complete success.
I have also done load testing on virtual servers of up to 150,000 machine objects and 150,000 user objects. The performance was high enough for the customer to opt for a VM instead of physical. Load testing is pretty simple: you just create a script that runs our createmachine and createuser commands X times.
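For anyone who wants a starting point, here is a rough sketch of that kind of load-test script in Python. The createmachine and createuser command names come from the post above, but the way they are invoked here (as executables taking a single object name) is purely an assumption for illustration, as are the helper names build_commands and run_all and the "loadtest" naming prefix. Check the product's scripting documentation for the real syntax before running anything against a live database.

```python
import subprocess
from typing import Callable, List, Optional


def build_commands(count: int, prefix: str = "loadtest") -> List[List[str]]:
    """Build one createmachine and one createuser command line per index."""
    commands: List[List[str]] = []
    for i in range(count):
        # ASSUMPTION: each command takes the new object's name as its only
        # argument. The real argument layout may differ.
        commands.append(["createmachine", f"{prefix}-machine-{i:06d}"])
        commands.append(["createuser", f"{prefix}-user-{i:06d}"])
    return commands


def run_all(
    commands: List[List[str]],
    runner: Optional[Callable[[List[str]], int]] = None,
) -> int:
    """Run every command in order and return how many exited with code 0."""
    if runner is None:
        # Default runner actually executes the command on this host.
        def runner(cmd: List[str]) -> int:
            return subprocess.run(cmd, check=False).returncode

    succeeded = 0
    for cmd in commands:
        if runner(cmd) == 0:
            succeeded += 1
    return succeeded


if __name__ == "__main__":
    # Dry run: print the command lines instead of executing them.
    for cmd in build_commands(3):
        print(" ".join(cmd))
```

Passing your own runner makes it easy to do a dry run first, or to add timing around each call so you can watch how creation time changes as the object count grows.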
I have to agree with the previous comments about NAS vs. SAN. Our database needs fast disk I/O. If you can get that on a VM, you'll be just fine. NAS is just too slow for any sizeable deployment. SAN works great, but only if you give it tier I or tier II access.
Why does the documentation I provided say otherwise?
Because the documentation is our recommendation.
Most people who virtualize have no idea about the difference between Tier I and Tier II SAN storage; they just expect it to work as it did before, and are thus woefully disappointed when it does not.
It's fascinating how many people demand we make our code run faster, when the problem is their SAN is slow and they are sharing LUNs with a dozen other applications (but don't know it).
Just for clarification, I was wrong about our storage setup for this server. We are actually using a SAN setup with shared meta LUNs. The SAN is Fibre Channel attached and currently delivers around 4Gbps of throughput during optimum load times.
Also, a question posed today in a meeting with our McAfee tech rep was left unanswered. He was asked for more information on disk I/O, such as read and write specifics and a data throughput recommendation, but could not provide a definitive answer or supplemental documentation. In my time researching this issue, I have not come across any advanced documentation on system specs beyond the basic specs listed in the best practices guide.
Is there any advanced documentation that you can provide to assist with our deployment?
Nothing on disk usage, no. It's a "how long is a piece of string" question. It depends on how many machines try to sync in a set window of time, and what information they need.
You could host a million machines on a laptop if only one synced at a time. Conversely, the biggest server/network in the world would fold if 10000 machines tried to sync together.
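To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It is not anything from the product documentation: it just applies Little's law (average number in progress = arrival rate x time each one takes) and assumes syncs are spread evenly across the window. The function names and the 2-second sync duration in the example are made up for illustration; measure your own sync times.

```python
def average_sync_rate(machines: int, window_seconds: float) -> float:
    """Average syncs per second if every machine syncs once per window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return machines / window_seconds


def estimated_concurrent_syncs(
    machines: int, window_seconds: float, sync_duration_seconds: float
) -> float:
    """Little's law estimate of how many syncs are in progress at once.

    ASSUMPTION: syncs arrive evenly over the window. Real clients tend to
    cluster (for example at logon time), so treat this as a lower bound.
    """
    rate = average_sync_rate(machines, window_seconds)
    return rate * sync_duration_seconds


if __name__ == "__main__":
    # Same 10000 machines, same hypothetical 2-second sync, different windows.
    for window in (3600, 600, 60):
        load = estimated_concurrent_syncs(10000, window, 2.0)
        print(f"window={window:>5}s  approx concurrent syncs={load:.1f}")
```

With those assumed numbers, spreading 10000 machines over an hour gives only a handful of syncs in flight at any moment, while squeezing them into one minute gives several hundred. Same machine count, very different disk load.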
SAN with shared LUNs is not ideal if you want the highest possible performance, but it may be adequate for your needs. If you get to a state where your shared LUN environment is not fast enough, you have a couple of choices: move to dedicated LUNs or migrate to DAS.
The enterprise implementation guide tells you the spec we recommend for standard environments. You should build your system to be "as performant" as those.