10 Replies Latest reply on Jan 13, 2010 10:57 AM by SafeBoot

    Indexing Malfunction Need Help

    mwilke

      I have set the LifeTime value in the dbcfg.ini file to "0" so that the index never expires. I don't want the cache to be recreated automatically; instead I want to use ToastCache to re-index the database twice per week.

       

      I set LifeTime to 0 yesterday, then recycled the services and rebooted the machine many times. Today I noticed that it is still re-indexing automatically, as if LifeTime were set to some value other than zero.

       

      The 00000001 and 00000002 folders (the user and machine folders) keep re-indexing themselves every few hours, though. On my other three production servers this isn't happening: the names.* files in those two folders do not get updated until I run ToastCache.

       

      Here is my dbcfg.ini file.

       

      [NameIndex]
      ; This must be set to "Yes" for the name caching to be used by programs running for this
      ; directory.
      Enabled=Yes

      ; This option controls how long the process will retry access to the index file if it is locked.
      ; The value is in 100ths of a second.
      LockTimeout=3000

      ; This option controls how long the process will sleep (wait) before re-trying opening of a
      ; locked file. The value is in 1000ths of a second.
      LockSleep=10

      ; This option controls how many "buckets" the hash of the name is split in to. It should be
      ; between 1 and 256 (default 16).
      HashCount=200

      ;The minimum space to allocate per object name, should be the same value as Hashcount
      MinEntrySize=200

      ; The time (in seconds) for which the index will be used before it is
      ; automatically re-created (default is 30 minutes). A value of zero means
      ; that it never expires. (86400 is one day)
      LifeTime=0

      What am I missing? Why would the two folders continue to re-index themselves with this value set to zero?

      Any help would be GREATLY appreciated.  Thanks!

        • 1. Re: Indexing Malfunction Need Help

          Nothing, that's all you have to do.

           

          I guess the index is getting corrupted by some outside influence? If the LockTimeout is reached, the process will consider the index corrupt and rebuild it, so you might want to try increasing that value.

          • 2. Re: Indexing Malfunction Need Help

            How many objects do you have in the database?

            You could try increasing LockSleep to 100 and LockTimeout to 6000.

             

            Did you exclude the database folder and the EE processes from AV scanning?

            • 3. Re: Indexing Malfunction Need Help
              mwilke

              The DB consists of 43,000 users and right now only 3,000 machines.  We just started deploying to this database this week.

               

              It has sat for 8 months or more with the 43,000 users on it and no machines. The console always opened within about 2 seconds of typing in the username and password.

               

              Since we started deploying, the console is taking 20 minutes to open if it opens at all.

               

              I shut down the DB service and tried logging in using the dll file locally to the console and it still hung.  The Apache WebHelpDesk is timing out when trying to authenticate.

               

              All this happened only this week when we tried to deploy.  We were only deploying 1000 machines per day, around 200 at a time.  What could have caused this?  Is there database corruption?  Is there some DLL file that is possibly corrupt?

              • 4. Re: Indexing Malfunction Need Help
                mwilke

                More tidbits here.

                 

                It looks like if you have an ID that was created early on, such as ObjectID 00000003 or a few others like it, the DB opens fine. It's only the objects that are high up in the chain, like 00000c3fe, that are having the issue.

                 

                It looks like once you reach a certain point, the authentication is not referencing the Index at all.  We have tried to delete the index, rebuild, etc multiple times but nothing is helping.

                 

                ToastCache runs successfully every time.

                 

                I suppose there must be some object in the DB that when the index tries to index that object it fails due to some corruption and then causes every object after that to fail also?  I am reaching here.

                 

                Has anyone else got any ideas I can try?

                 

                We have already sent our DB to the McAfee development team, and they have not been able to figure this out so far.

                • 5. Re: Indexing Malfunction Need Help

                  It's a possibility, but unlikely.

                   

                  How many things do you have connected to the actual files of the DB itself? It should be only one SBDBServer.exe task (everything else should go through it).

                   

                  Interestingly, you cannot log in while the index is being built; you have to wait until it's finished. BUT, if the LockTimeout is less than the index rebuild time, then a 2nd connection will consider the index corrupt and start a rebuild before the first has completed.

                   

                  I would take everything offline, then toast it and watch what happens with filemon. Work out how long the index takes to build and compare that with the timeouts.

                   

                  Once it's built, I'd try a few logins etc, some tasks, and see if the index survives. Then bring it online and keep an eye on it to see if the index rebuilds for any strange reason.

                   

                  The most common cause of index corruption is mixing v4/v5 connections to it. For example, people have upgraded, then used old scripts with old versions of the API to talk directly to the DB (instead of through a sbserver).
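
The comparison between index build time and the timeouts suggested in this reply can be sketched roughly as follows. This is a minimal sketch in Python, assuming the unit given in the dbcfg.ini comments (LockTimeout in 100ths of a second). The `rebuild_seconds` value is a hypothetical measurement you would take yourself, for example from filemon timestamps; nothing here calls a SafeBoot API.

```python
import configparser

# Values copied from the dbcfg.ini posted at the top of the thread.
DBCFG_TEXT = """
[NameIndex]
LockTimeout=3000
LockSleep=10
"""


def lock_timeout_seconds(cfg_text: str) -> float:
    """Return LockTimeout from [NameIndex], converted to seconds."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Unit assumed from the comment in dbcfg.ini: 100ths of a second.
    return parser.getint("NameIndex", "LockTimeout") / 100.0


def rebuild_outlasts_timeout(cfg_text: str, rebuild_seconds: float) -> bool:
    """True if a second connection could time out while the index is still building."""
    return rebuild_seconds > lock_timeout_seconds(cfg_text)
```

With the posted value of 3000, the timeout works out to 30 seconds, so a rebuild that is measured to take longer than that would be a candidate for the overlapping-rebuild behaviour described in this reply.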

                  • 6. Re: Indexing Malfunction Need Help
                    mwilke

                    It's a possibility, but unlikely.

                     

                    how many things do you have connected to the actual files of the DB itself? It should be only one SBDBServer.exe task (everything else should go through it).

                    Right now we have all services shut down. I am the only one logging in and out of the DB, and I am logging in via the SBFILEDB local file.

                     

                    Interestingly, you cannot log in while the index is being built; you have to wait until it's finished. BUT, if the LockTimeout is less than the index rebuild time, then a 2nd connection will consider the index corrupt and start a rebuild before the first has completed.

                    I am not trying to log in while the index is building. We search the SBDATA folder for all names.* files --> delete them --> run ToastCache --> after a successful ToastCache run we then try to log in. It works fine with early IDs and still hangs with the later IDs.

                     

                    I would take everything offline, then toast it and watch what happens with filemon. Work out how long the index takes to build and compare that with the timeouts.

                    Everything has been offline for several days now.  No Apache services, no DB Services running... nothing. 

                     

                    Once it's built, I'd try a few logins etc, some tasks, and see if the index survives. Then bring it online and keep an eye on it to see if the index rebuilds for any strange reason.

                    Once the index is built, it immediately fails with later IDs. We can log in fine with early IDs, just not the later IDs. It's like the later IDs do not reference the index at all.

                     

                    the most common cause of index corruption is mixing v4/v5 connections to it, for example people have upgraded, then used old scripts with old versions of the API to talk directly to the db (instead of through a sbserver).

                    This server has always been on 5.x. Currently it's at 5.1.8. The only script we use on this DB is AutoDomain.
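
The "search SBDATA for names.* and delete them" step described in this reply could be scripted along these lines. This is only a sketch: the directory path is whatever your SBDATA folder is, the function names are hypothetical, and it deliberately stops short of running ToastCache because its command line is not shown in the thread. It defaults to a dry run so nothing is removed unless you ask for it.

```python
from pathlib import Path
from typing import List


def find_index_files(sbdata_dir: str) -> List[Path]:
    """Return every names.* file found anywhere under sbdata_dir, sorted."""
    return sorted(p for p in Path(sbdata_dir).rglob("names.*") if p.is_file())


def delete_index_files(sbdata_dir: str, dry_run: bool = True) -> List[Path]:
    """List the names.* files, and delete them only when dry_run is False."""
    files = find_index_files(sbdata_dir)
    if not dry_run:
        for path in files:
            path.unlink()
    return files
```

Running it with the default `dry_run=True` first gives a list to check before anything is deleted.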

                    • 7. Re: Indexing Malfunction Need Help

                      What do you mean by "it fails with later IDs"? What exactly fails?

                       

                      Does it cause an index rebuild?

                       

                      Have you watched it with filemon? And what are you logging in with? Try doing a command-line API getcounts call with a few IDs and see what goes on; that will make extensive use of the index.

                       

                      You should find it makes no difference which ID you use, but it's certainly possible that something is corrupting the index. It's hard to narrow down, though, without trying every user until you find the one which fails (a binary chop would be the best approach).
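
The binary chop mentioned above might look something like this. It is a minimal sketch that assumes the failure is monotonic, meaning every ID before some point logs in and every ID from that point on fails; that matches the symptom reported in this thread but is not confirmed. `login_works` is a hypothetical callback you would supply (for example a wrapper around whatever login test you use); it is not a SafeBoot API.

```python
from typing import Callable, Optional, Sequence


def first_failing(ids: Sequence[str],
                  login_works: Callable[[str], bool]) -> Optional[str]:
    """Binary-search an ordered list of IDs for the first one that fails.

    Assumes all working IDs come before all failing IDs.
    Returns None if every ID works.
    """
    lo, hi = 0, len(ids)
    while lo < hi:
        mid = (lo + hi) // 2
        if login_works(ids[mid]):
            lo = mid + 1   # failure point is after mid
        else:
            hi = mid       # mid fails, so the first failure is at or before mid
    return ids[lo] if lo < len(ids) else None
```

For 43,000 users this needs about 16 login attempts rather than one per user, which matters when each failed attempt can hang for minutes.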

                      • 8. Re: Indexing Malfunction Need Help

                        I should add that the index is a set of hash buckets. It's not linear at all, so the concept of "later IDs" makes limited sense, apart from the fact that later IDs will be listed later on in each bucket. (not sure why the profanity filter is flagging b u c k e t ?)

                         

                         

                        Message was edited by: SafeBoot on 1/13/10 11:24:13 AM GMT-05:00
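
To picture the point about hash buckets, here is a toy illustration. It is not the product's index format: the real hash function is not stated in this thread, so `zlib.crc32` is used purely as a stand-in, and `build_buckets` is a hypothetical helper. It only shows that names are spread across HashCount buckets regardless of ID, while objects with later IDs end up later within whichever bucket they land in.

```python
import zlib
from collections import defaultdict


def build_buckets(names, hash_count=16):
    """Distribute (object_id, name) pairs across hash_count buckets.

    Object IDs are assigned in creation order, starting at 1.
    crc32 is a stand-in hash, chosen only for illustration.
    """
    buckets = defaultdict(list)
    for object_id, name in enumerate(names, start=1):
        bucket = zlib.crc32(name.encode("utf-8")) % hash_count
        buckets[bucket].append((object_id, name))
    return buckets
```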
                        • 9. Re: Indexing Malfunction Need Help
                          mwilke

                          Problem Solved.

                           

                          The advice we got from McAfee, and from the default DBCFG.INI file that McAfee hands out, is that MinEntrySize should be the same as HashCount.

                           

                          HashCount should be the square root of the total number of users you have. Following that advice, our MinEntrySize would also be set to about 200.

                           

                          We then talked to a very helpful guy in Tier III who told us that MinEntrySize should be set to the number of characters in your longest username. We dropped it down to 16, toasted, and the problem was solved.
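
The sizing advice in this post can be written down as a small calculation. This is a sketch of the rules as reported in this thread (HashCount near the square root of the user count, MinEntrySize equal to the longest username length), not verified vendor documentation. The 1 to 256 clamp comes from the comment in the posted dbcfg.ini; `suggest_settings` is a hypothetical helper name.

```python
import math


def suggest_settings(usernames):
    """Suggest HashCount and MinEntrySize from a list of usernames."""
    # HashCount: square root of the user count, kept within 1..256
    # (the range given in the dbcfg.ini comment).
    hash_count = max(1, min(256, round(math.sqrt(len(usernames)))))
    # MinEntrySize: length of the longest username, per the Tier III advice.
    min_entry_size = max((len(u) for u in usernames), default=0)
    return {"HashCount": hash_count, "MinEntrySize": min_entry_size}
```

For 43,000 users this gives a HashCount of 207, close to the 200 in the posted config, while MinEntrySize depends only on the longest name rather than on the user count.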
