Re: Extremely slow and at sometimes completely unresponsive DB

You do not check through EEM. Just RDP to the server and check the timestamps on all of the database name cache files.
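
For reference, here is a minimal Python sketch of that timestamp check. The function names are made up for illustration, and the directory and file pattern you pass in are up to you: the thread does not say where the name cache files live or what they are called.

```python
from datetime import datetime
from pathlib import Path


def cache_file_ages(directory, pattern="*", now=None):
    """Return (file name, last-modified time, age in hours), oldest first."""
    now = now or datetime.now()
    rows = []
    for path in Path(directory).glob(pattern):
        if path.is_file():
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            age_hours = (now - modified).total_seconds() / 3600
            rows.append((path.name, modified, age_hours))
    # Oldest files first, so stale cache files show up at the top.
    return sorted(rows, key=lambda row: row[1])


def print_report(directory, pattern="*"):
    """Print one line per file: modified time, age in hours, name."""
    for name, modified, age_hours in cache_file_ages(directory, pattern):
        print(f"{modified:%Y-%m-%d %H:%M}  {age_hours:8.1f} h  {name}")
```

Call `print_report()` with the database directory on the server. Files whose age is close to the configured LifeTime are the ones about to be rebuilt.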

mwilke
Level 7
Message 32 of 40

Re: Extremely slow and at sometimes completely unresponsive DB

Pete, here is what we have set in our DBCFG.ini file:

; The time (in seconds) for which the index will be used before it is
; automatically re-created (default is 30 minutes). A value of zero means
; that it never expires. (86400 is one day)
LifeTime=86400

I have already asked for permission to set this to "0" and let our script take care of re-creating this after hours.

Right now we have a script that does all kinds of things including toastcache.bat and this runs twice a week.

Re: Extremely slow and at sometimes completely unresponsive DB

I would rather change LifeTime to a larger value, like one month. That is more explicit than putting "0", but it should have the same effect.

Yes, you need to get this done ASAP. Schedule the change to coincide with your cache rebuild maintenance window.
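
Since LifeTime is given in seconds, a quick conversion helps when picking the new value. This is only arithmetic; the helper name is made up, and "one month" is taken as 30 days here, which is an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400, matching the comment in DBCFG.ini


def lifetime_seconds(days):
    """Convert a number of days into a LifeTime value in seconds."""
    return days * SECONDS_PER_DAY


print(lifetime_seconds(1))   # 86400, the current setting
print(lifetime_seconds(30))  # 2592000, roughly one month
```

With a twice-weekly rebuild script, any value comfortably longer than the gap between script runs should keep the automatic re-creation from kicking in during working hours.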

Re: Extremely slow and at sometimes completely unresponsive DB

What do other companies do with 18,000 to 20,000 machines as far as connection limits and such to avoid such slowness?

DLarson
Level 12
Message 35 of 40

Re: Extremely slow and at sometimes completely unresponsive DB

What about your AV settings on the server? Did you follow the advice I posted previously in this thread? You really shouldn't mess with anything else until you have done that.

mwilke
Level 7
Message 36 of 40

Re: Extremely slow and at sometimes completely unresponsive DB

Yes, I asked our guys who are over that, and they said that everything you suggested AV-wise has already been set up that way.

I think the problem is this: I just did a netstat and there are over 500 incoming connections (syncs).

This causes the remote admin consoles to fail to load with "failed while sending communications data" because there are no connections available for them to log in with.

The times when it's very slow are the times when there are maybe only 240 connections coming in (we have our limit set to 250 connections to the db at any given time).

I think it is just the volume of incoming syncs.
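
If you want to track that number over time rather than eyeballing netstat, something like the sketch below could work. It assumes the Windows `netstat -an` column layout (protocol, local address, foreign address, state). Port 5555 and the sample addresses are placeholders for illustration only; substitute whatever port your server actually listens on.

```python
from collections import Counter


def count_connections(netstat_output, local_port):
    """Count TCP connections to local_port, grouped by connection state."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, local address, foreign address, state
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        port = fields[1].rsplit(":", 1)[-1]
        if port == str(local_port):
            states[fields[3]] += 1
    return states


# Made-up sample in the Windows netstat layout.
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:5555          10.1.2.3:49152         ESTABLISHED
  TCP    10.0.0.5:5555          10.1.2.4:50211         ESTABLISHED
  TCP    10.0.0.5:5555          10.1.2.5:50300         TIME_WAIT
  TCP    10.0.0.5:3389          10.9.9.9:51000         ESTABLISHED
"""

print(count_connections(SAMPLE, 5555))
```

On the server itself, the text can come from `subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout`. Splitting by state matters here because connections that are no longer ESTABLISHED but have not been cleaned up yet still show in a raw line count.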

DLarson
Level 12
Message 37 of 40

Re: Extremely slow and at sometimes completely unresponsive DB

I think you are right. That is a lot of concurrency. I know a customer here in the US that has 110,000 nodes deployed and they see an average of 80 concurrent connections at any given time. Here are some things you can try.

- Set the scheduled sync to no less than 360 minutes. Also consider disabling it for groups of users/machines who work in the office every day. They will sync with the boot sync.

- Set the delay sync at boot to 10 plus random of 120

- Reduce remote admin connections, switch to Citrix if possible. How many PW resets do you do in a day?

- Take away admin rights so low level admins can't spend time "browsing" the server

- Speed up the database so that each sync takes less time

-- reduce # of users assigned to each machine. Get it down to 20 or less, if possible. How many do you have now?

-- modify indexing, set time to 0 and control re-indexing with script

-- clear audits every day, keep as little data as possible (set -daysold:30 if possible)

-- Tweak AV settings, don't just exclude SBDATA. Also set our applications as low risk and make sure they are ignored on reads and writes to disk

-- Throw hardware at it. It is OK to have 500 concurrent connections, but only if your hard disks are super fast and you have lots of CPU and Memory ... but mainly HD speed.

-- Make sure you have the KeepAliveTime registry entry in place as per https://kc.mcafee.com/corporate/index?page=content&id=KB60490

-- Make sure database and application are on different disks, i.e. don't put both on the C: drive

-- Make sure SBDATA directory isn't shared

-- Check schedule for jobs like reporting and database backup, ensure they don't overlap

-- If you are using SAN storage make sure you have at least Tier II access, get Tier I if possible
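
To get a feel for what the "10 plus random of 120" boot delay does, here is a rough simulation. It assumes every machine boots at the same moment and that the client picks its random part uniformly; neither is confirmed in this thread, and the function names are made up. The thread does not state the time unit of the setting, so the code just says "units".

```python
import random
from collections import Counter


def boot_sync_offsets(machines, fixed_delay, random_window, seed=0):
    """Sync start offset per machine, assuming all machines boot together."""
    rng = random.Random(seed)  # seeded so repeated runs give the same picture
    return [fixed_delay + rng.uniform(0, random_window) for _ in range(machines)]


def peak_starts_per_unit(offsets):
    """Largest number of syncs that start within any one whole time unit."""
    buckets = Counter(int(offset) for offset in offsets)
    return max(buckets.values())


# Compare no random component against a random window of 120 units.
for fixed, window in [(10, 0), (10, 120)]:
    offsets = boot_sync_offsets(1000, fixed, window)
    print(f"delay {fixed} + random {window}: "
          f"peak {peak_starts_per_unit(offsets)} sync starts in one unit")
```

Without the random part, all 1000 simulated machines start syncing in the same unit. With it, the starts are spread across the window and the peak drops sharply. Real boot times are already staggered, so treat this as a worst case.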

Re: Extremely slow and at sometimes completely unresponsive DB

I can answer the password question. I usually don't see more than 5 a week, but I know other admins probably don't educate users on local recovery. I really try to stress it. Unfortunately, users seem to forget their answers. Hence the reset local recovery discussion I started.

mwilke
Level 7
Message 39 of 40

Re: Extremely slow and at sometimes completely unresponsive DB

- Set the scheduled sync to no less than 360 minutes. Also consider disabling it for groups of users/machines who work in the office every day. They will sync with the boot sync.

** Right now it's at 240 and I have already suggested getting this way up there like you say.

- Set the delay sync at boot to 10 plus random of 120

** Right now it's delay sync at boot 30 plus random of 240. I suggested a delay of 10 and random of 90, but no word on whether the agency wants to approve this.

- Reduce remote admin connections, switch to Citrix if possible. How many PW resets do you do in a day?

** PW resets I don't know. That is handled by the agency admins. I wouldn't think too many. I know we already have a terminal server set up that most of them use.

- Take away admin rights so low level admins can't spend time "browsing" the server

** I think there are around maybe 20 people with admin rights on this server?  Not many

- Speed up the database so that each sync takes less time

** What else can I do to speed up the database besides pruning audit logs, indexing, etc.?

-- reduce # of users assigned to each machine. Get it down to 20 or less, if possible. How many do you have now?

** Only admins and the actual users that use the machine are assigned to machines.  So maybe in total 25-30 users per machine.

-- modify indexing, set time to 0 and control re-indexing with script

** Already did that today

-- clear audits every day, keep as little data as possible (set -daysold:30 if possible)

** Doing that sometime this week... have to wait to have maintenance window approved

-- Tweak AV settings, don't just exclude SBDATA. Also set our applications as low risk and make sure they are ignored on reads and writes to disk

** We excluded the SBDATA folder, but our AV guy has no idea how to set these apps to low risk and/or make sure they are ignored on reads and writes. I looked at the Symantec client on the server and I don't see anything there either. He stopped the Symantec services today but it didn't seem to make any difference in the performance.

-- Throw hardware at it. It is OK to have 500 concurrent connections, but only if your hard disks are super fast and you have lots of CPU and Memory ... but mainly HD speed.

** We are already on a DAS RAID 5 but not sure what speed the HDDs are.

-- Make sure you have the KeepAliveTime registry entry in place as per https://kc.mcafee.com/corporate/index?page=content&id=KB60490

** this is already in place

-- Make sure database and application are on different disks, i.e. don't put both on the C: drive

** They are both on the D: drive .... should we move the application to the C: drive?

-- Make sure SBDATA directory isn't shared

** it isn't

-- Check schedule for jobs like reporting and database backup, ensure they don't overlap

** they don't

-- If you are using SAN storage make sure you have at least Tier II access, get Tier I if possible

** we are on DAS

I think most of our problem is coming from the number of folks hitting the server at once. That and the fact that they are spread across four time zones. This makes it hard to keep large groups of people from overlapping each other in their syncs and producing this large number of syncs at once. Also, our syncs are open to the internet, so people could be on (God forbid) dialup connections trying to sync all across the country.
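
A back-of-the-envelope check on the volume theory, using Little's law (average concurrency = arrival rate x time each connection stays open). The inputs are the figures from this thread: about 20,000 machines, a 240 minute scheduled sync, and roughly 500 connections seen in netstat. It assumes scheduled syncs are spread evenly and ignores boot syncs, admin consoles and lingering broken connections, so it is a rough estimate only. The function names are made up.

```python
def sync_arrivals_per_minute(machines, interval_minutes):
    """Average scheduled syncs started per minute, if spread evenly."""
    return machines / interval_minutes


def average_concurrency(machines, interval_minutes, sync_minutes):
    """Little's law: concurrency = arrival rate x time each sync stays open."""
    return sync_arrivals_per_minute(machines, interval_minutes) * sync_minutes


def implied_sync_minutes(machines, interval_minutes, concurrent):
    """Connection duration that would explain an observed concurrency."""
    return concurrent / sync_arrivals_per_minute(machines, interval_minutes)


duration = implied_sync_minutes(20000, 240, 500)
print(f"implied time per sync: {duration:.1f} minutes")
print(f"concurrency at 240 min interval: {average_concurrency(20000, 240, duration):.1f}")
print(f"concurrency at 360 min interval: {average_concurrency(20000, 360, duration):.1f}")
```

Under these assumptions each sync would be holding a connection for about 6 minutes, and moving the interval from 240 to 360 minutes would bring the average down to roughly 333. That is still above the 250 limit, which suggests the time per sync needs to come down as well as the sync frequency.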

Re: Extremely slow and at sometimes completely unresponsive DB

From your answers, it looks like the high number of concurrent connections is driven by client connections that are frequent, slow (global, so high latency), unreliable (Internet) and carry a large payload (too many users per machine). Broken connections can linger for some time before getting dropped. Do you monitor network load on your server? Can you post some data? Currently the server I have peaks at about 2.5MB/s for 1-1.5 hours, with a relatively small number of concurrent connections (80-110).
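
To make those numbers easier to compare with your own monitoring data, here is the same peak expressed as total volume and as average throughput per connection. This is plain arithmetic on the figures above, assuming MB means megabytes and using 1 MB = 1000 KB; the function names are made up.

```python
def transferred_megabytes(rate_mb_per_s, hours):
    """Total data moved at a steady rate over the given number of hours."""
    return rate_mb_per_s * hours * 3600


def per_connection_kb_per_s(rate_mb_per_s, connections):
    """Average share of the throughput per concurrent connection."""
    return rate_mb_per_s * 1000 / connections


for hours in (1.0, 1.5):
    print(f"{hours} h at 2.5 MB/s = {transferred_megabytes(2.5, hours):.0f} MB")

for connections in (80, 110):
    share = per_connection_kb_per_s(2.5, connections)
    print(f"{connections} connections: about {share:.1f} KB/s each")
```

That works out to roughly 9 to 13.5 GB over the peak period, or somewhere around 23 to 31 KB/s per connection. If your per-connection figure is far lower with 500 connections open, the connections are mostly waiting rather than transferring.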
