4 Replies Latest reply on Apr 27, 2010 1:38 PM by arnoldrs

    Super Agent Replication Failed

      Hi, I need your help. I came across your blog and tried everything, without success.

       

      I have 13 SA repositories, and the problem is with 4 of them. The Super Agent repository replication task fails every day at the same point (after copying VSE870.msi), at roughly 45% after 1 hour 45 minutes. I created a new test task using just one SA repository, and the result is the same: "Failure".

       

      Here are the changes I have tried and the McAfee environment configuration:

       

      * EPO 4.5 Patch 1

      * SA Repository Agent 4.0 Patch 3

      * VirusScan Enterprise 8.7

      * I created a new task that replicates just one SA repository, on a different schedule from the master task (which replicates all the working SA repositories without failure). I chose the necessary packages (Engine, DAT, Agent) and cleared the "Replicate legacy DATs" check box.

      * I reinstalled the agent on the SA repository (it is now using 4.0 Patch 3).

      * Every day the replication fails at the same point (after copying VSE870.msi).

      * I am using advanced logging on the ePO server.

      * It's not a DNS or NTFS permission problem.

       

      Please help me.

      I'm sending you part of the EPOAPSRV log file:

       

      20100421042134 I #5284 naInet   ------------------------------------------------------------
      20100421042134 I #5284 SIM_InetMgr Session 1 ended, result=1
      20100421042134 I #5284 SiteMgr  GeneralInetRequestThreadProc: GeneralInetRequest thread ended
      20100421042134 x #5284 SiteMgr  SiteMgr main control final release...
      20100421043343 I #3084 naInet   HTTP Server returned success, HTTP return code: HTTP/1.0 200 OK

      20100421043343 I #3084 SIM_InetMgr Uploaded file VSE870.msi successfully in session 1
      20100421043343 I #1544 naInet   HTTP Session closed
      20100421043343 I #1544 naInet   ------------------------------------------------------------
      20100421043343 I #1544 SIM_InetMgr Session 1 ended, result=1
      20100421043343 e #1544 SiteMgr  ReplicationThreadProc: Upload data to site ePOSA_NASAMCBO failed
      20100421043343 e #2724 SiteMgr  ReplicationThreadProc: Replication finished with partial failure
      20100421043343 x #2724 SiteMgr  SiteMgr main control final release...

       

       

      Message edited by: Arnold Rojas on 21/04/10 04:11:39 PM CDT
        • 1. Re: Super Agent Replication Failed
          jstanley

          So I'm using the information you linked in the epoapsvr.log here:

          http://community.mcafee.com/message/127442#127442

           

          The last replication attempt failed to a site named ePOSA_NASAMCBO with these errors in the epoapsvr.log:

          20100426025936 I #2192 naInet   HTTP Session initialized
          20100426025936 I #2192 naInet   Connecting to HTTP Server in socket-mode
          20100426025936 I #2192 naInet   Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081

          20100426025957 E #2192 naInet   Failed to connect to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
          20100426025957 E #2192 naInet   Socket error: 10060
          20100426025957 I #2192 naInet   HTTP Session closed
          20100426025957 I #2192 naInet   ------------------------------------------------------------
          20100426025957 E #2192 SIM_InetMgr Start session for site upload failed
          20100426025957 e #2192 SiteMgr  ReplicateSite: Failed to connect to site ePOSA_NASAMCBO
          20100426025957 I #2192 SrvEvtInf Generating Event
          20100426025957 e #2192 SiteMgr  ReplicationThreadProc: Upload data to site ePOSA_NASAMCBO failed

           

          This is a straightforward failure to connect: ePO attempted to contact the super agent via the agent wakeup call port and failed to establish a connection. Typically you need to make sure there is a route from the ePO server to the machine hosting the super agent repository, confirm the agent service is running on the client machine, and confirm that frameworkservice.exe is listening on port 8081 on the machine hosting the SA repository.
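A quick way to check the route and the listening port from the ePO server is a plain TCP connection attempt. The following is a minimal sketch in Python, not a McAfee tool; the function name `can_connect` and the 5-second timeout are assumptions made for illustration. It only shows whether something accepts connections on the port, not whether the agent behind it is healthy.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Try to open a TCP connection to host:port.

    Returns (True, None) on success, or (False, exception) if the
    attempt fails (timeout, refusal, DNS failure, no route, ...).
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # socket.timeout and socket.gaierror are both subclasses of OSError.
        return False, exc
```

Calling `can_connect("NASAMCBO.ven.rsa-ins.com", 8081)` from the ePO server several times during the replication window would show whether connection attempts fail only intermittently, as the log suggests.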

           

          This appears to be an intermittent network issue: a little higher in the log, it connected successfully to the same site (this occurred the day before):

          20100425030007 I #6208 naInet   Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
          20100425030007 I #6208 naInet   Connected to HTTP Server: NASAMCBO.ven.rsa-ins.com

           

          However, a little further along in the log, on the same thread, it fails while trying to send a file:

          20100425032454 E #6208 NaiInet  Socket send error 10054
          20100425032454 E #6208 naInet   Failes to upload data in bytes: 65536
          20100425032454 I #6208 SIM_InetMgr Upload file avvdat-5962.zip failed in session 1, nainet ret=10054
          20100425032454 I #6208 SiteMgr  ReplicationUploadFile: Failed to upload file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, hr=-2147467259, retry limit remaining: 4
          20100425032454 I #6208 SiteMgr  ReplicationUploadFile: Uploading file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, retry limit remaining: 4
          20100425032456 I #6208 SIM_InetMgr Uploading file avvdat-5962.zip from session 1, LocalDir=C:/PROGRA~1/McAfee/EPOLIC~1/DB\Software\Current\VSCANDAT1000\DAT\0000, RemoteDir=Current\VSCANDAT1000\DAT\0000
          20100425032456 I #6208 naInet   Uploading file C:/PROGRA~1/McAfee/EPOLIC~1/DB\Software\Current\VSCANDAT1000\DAT\0000\avvdat-5962.zip to HTTP Server
          20100425032456 I #6208 naInet   Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
          20100425032517 I #6208 naInet   Failed to connect to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
          20100425032517 I #6208 SIM_InetMgr Upload file avvdat-5962.zip failed in session 1, nainet ret=10054
          20100425032517 I #6208 SiteMgr  ReplicationUploadFile: Failed to upload file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, hr=-2147467259, retry limit remaining: 3
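Following one thread (such as #6208) through a long log by eye is tedious. Here is a minimal Python sketch that groups error lines by thread ID. The line layout (timestamp, level letter, `#thread`, module, message) is inferred only from the excerpts in this thread, and the names `parse_line` and `errors_by_thread` are hypothetical.

```python
import re
from collections import defaultdict

# Layout inferred from the excerpts above:
# <YYYYMMDDhhmmss> <level letter> #<thread id> <module> <message>
LINE_RE = re.compile(r"^\s*(\d{14})\s+(\w)\s+#(\d+)\s+(\S+)\s+(.*)$")


def parse_line(line):
    """Return a dict for a log line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    timestamp, level, thread, module, message = match.groups()
    return {
        "time": timestamp,
        "level": level,
        "thread": int(thread),
        "module": module,
        "message": message.strip(),
    }


def errors_by_thread(lines):
    """Group error records (level 'E' or 'e') by thread ID."""
    grouped = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None and record["level"].lower() == "e":
            grouped[record["thread"]].append(record)
    return dict(grouped)
```

Feeding it the lines of the log file (for example via `open(path)`) gives one list of error records per thread, which makes it easier to see which failures belong to the same upload session.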

           

          It goes on to retry 5 times and then gives up. Notice the return code Windows is passing back to ePO:

          20100425032454 E #6208 NaiInet  Socket send error 10054

           

          So 10054 = Connection reset by peer (ref: http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx). This indicates that something other than ePO is closing the connection. It could be one of several things: perhaps the agent service hosting the repository stopped on the remote machine? It could also indicate that at the time of the replication the WAN was so overloaded it couldn't process these requests in a timely fashion. The files the replication is failing on (avvdat-5962.zip and VSE870.msi) are both larger files, so maybe you don't have the bandwidth required to send them over the WAN? To test this, you could try manually copying one of those files from the ePO server to the machine hosting the super agent repository.
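For reference, the numeric codes in these log lines can be decoded with a small lookup. This is a sketch covering only the two Winsock codes that appear in this thread (10054 and 10060, as documented by Microsoft), plus a helper that shows the signed `hr` value in the more familiar hexadecimal HRESULT form. The names `describe_winsock` and `hresult_hex` are made up for illustration.

```python
# Winsock error codes seen in this thread (meanings per Microsoft's
# Windows Sockets error code documentation).
WINSOCK_ERRORS = {
    10054: "WSAECONNRESET: connection reset by peer",
    10060: "WSAETIMEDOUT: connection attempt timed out",
}


def describe_winsock(code):
    """Return a short description for a Winsock error code."""
    return WINSOCK_ERRORS.get(code, f"unknown Winsock error {code}")


def hresult_hex(value):
    """Render a signed 32-bit HRESULT as unsigned hexadecimal."""
    return f"0x{value & 0xFFFFFFFF:08X}"


print(describe_winsock(10054))
print(describe_winsock(10060))
print(hresult_hex(-2147467259))
```

The `hr=-2147467259` in the ReplicationUploadFile lines is 0x80004005, the generic E_FAIL ("unspecified failure") HRESULT, so the Winsock code is the more informative of the two.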

           

          I hope that helps get you going in the right direction.

          • 2. Re: Super Agent Replication Failed

            Hi Jeremy,

             

            Thank you for your answer; it is a really good explanation. My comments are marked with "R=" below. Please feel free to ask if you have any other questions:

             

            The last replication attempt failed to a site named ePOSA_NASAMCBO with these errors in the epoapsvr.log:

            20100426025936 I #2192 naInet  HTTP Session initialized
            20100426025936 I #2192 naInet  Connecting to HTTP Server in socket-mode
            20100426025936 I #2192 naInet  Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081

            20100426025957 E #2192 naInet  Failed to connect to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
            20100426025957 E #2192 naInet  Socket error: 10060
            20100426025957 I #2192 naInet  HTTP Session closed
            20100426025957 I #2192 naInet  ------------------------------------------------------------
            20100426025957 E #2192 SIM_InetMgr Start session for site upload failed
            20100426025957 e #2192 SiteMgr  ReplicateSite: Failed to connect to site ePOSA_NASAMCBO
            20100426025957 I #2192 SrvEvtInf Generating Event
            20100426025957 e #2192 SiteMgr  ReplicationThreadProc: Upload data to site ePOSA_NASAMCBO failed

             

            This is a straightforward failure to connect: ePO attempted to contact the super agent via the agent wakeup call port and failed to establish a connection. Typically you need to make sure there is a route from the ePO server to the machine hosting the super agent repository,

            R= Yes, the SA hosting machine is a server in the same domain at a remote branch, connected over a dedicated 256KBps network link.

            confirm the agent service is running on the client machine, and confirm that frameworkservice.exe is listening on port 8081 on the machine hosting the SA repository.

            R= Yes, the agent is running. I checked the agent log and didn't see any events related to disconnections or similar.

             

            This appears to be an intermittent network issue: a little higher in the log, it connected successfully to the same site (this occurred the day before):

            20100425030007 I #6208 naInet  Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
            20100425030007 I #6208 naInet  Connected to HTTP Server: NASAMCBO.ven.rsa-ins.com

             

            However, a little further along in the log, on the same thread, it fails while trying to send a file:

            20100425032454 E #6208 NaiInet  Socket send error 10054
            20100425032454 E #6208 naInet  Failes to upload data in bytes: 65536
            20100425032454 I #6208 SIM_InetMgr Upload file avvdat-5962.zip failed in session 1, nainet ret=10054
            20100425032454 I #6208 SiteMgr  ReplicationUploadFile: Failed to upload file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, hr=-2147467259, retry limit remaining: 4
            20100425032454 I #6208 SiteMgr  ReplicationUploadFile: Uploading file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, retry limit remaining: 4
            20100425032456 I #6208 SIM_InetMgr Uploading file avvdat-5962.zip from session 1, LocalDir=C:/PROGRA~1/McAfee/EPOLIC~1/DB\Software\Current\VSCANDAT1000\DAT\0000, RemoteDir=Current\VSCANDAT1000\DAT\0000
            20100425032456 I #6208 naInet  Uploading file C:/PROGRA~1/McAfee/EPOLIC~1/DB\Software\Current\VSCANDAT1000\DAT\0000\avvdat-5962.zip to HTTP Server
            20100425032456 I #6208 naInet  Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
            20100425032517 I #6208 naInet  Failed to connect to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
            20100425032517 I #6208 SIM_InetMgr Upload file avvdat-5962.zip failed in session 1, nainet ret=10054
            20100425032517 I #6208 SiteMgr  ReplicationUploadFile: Failed to upload file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, hr=-2147467259, retry limit remaining: 3

             

            It goes on to retry 5 times and then gives up. Notice the return code Windows is passing back to ePO:

            20100425032454 E #6208 NaiInet  Socket send error 10054

             

            So 10054 = Connection reset by peer (ref: http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx). This indicates that something other than ePO is closing the connection.

            It could be one of several things: perhaps the agent service hosting the repository stopped on the remote machine?

            R= The agent service has not been stopped; the agent log shows no events related to the service stopping or similar.

            It could also indicate that at the time of the replication the WAN was so overloaded it couldn't process these requests in a timely fashion.

            R= I have tried running the task on different schedules and at different times of day (peak and off-peak hours), and the result is always the same.

            The files the replication is failing on (avvdat-5962.zip and VSE870.msi) are both larger files, so maybe you don't have the bandwidth required to send them over the WAN?

            R= OK, I agree that both are larger files. My question is: why do I never get an error or failure when I copy both of them, or even larger files, using Windows copy (copy and paste from the remote to the local path)? The copy always finishes without problems.

             

            To test this, you could try manually copying one of those files from the ePO server to the machine hosting the super agent repository.

            R= I did this at different times, and copying from the ePO server to the hosting machine works fine, without problems.
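On the bandwidth question, a rough estimate of the ideal transfer time can help judge whether the link alone explains the failures. This is only a sketch: the thread describes the link as "256KBps" without saying whether that means kilobytes or kilobits per second, so both readings are computed, and the 100 MB file size is a hypothetical placeholder, not the actual size of VSE870.msi or the DAT file.

```python
def transfer_seconds(size_bytes, rate_bytes_per_sec):
    """Ideal transfer time, ignoring protocol overhead and other traffic."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_sec


# Hypothetical file size, chosen only for illustration.
FILE_SIZE_BYTES = 100 * 1024 * 1024

# Two possible readings of the "256KBps" link mentioned in the thread.
RATE_IF_KILOBYTES = 256 * 1024     # 256 kilobytes per second
RATE_IF_KILOBITS = 256 * 1000 / 8  # 256 kilobits per second, in bytes

for label, rate in (
    ("256 kilobytes/s", RATE_IF_KILOBYTES),
    ("256 kilobits/s", RATE_IF_KILOBITS),
):
    minutes = transfer_seconds(FILE_SIZE_BYTES, rate) / 60
    print(f"{label}: about {minutes:.1f} minutes")
```

Substituting the real file sizes and comparing the estimate with the timestamps in the log would show whether the uploads are running far slower than the link allows before the connection is reset.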

             

            I think the problem is related to HTTP connections or port 8081, but I don't know how to detect or solve it.

             

            I really appreciate your help, and I look forward to your comments.

             

            Regards,

             

             

             

             

             

            • 3. Re: Super Agent Replication Failed
              jstanley

              Unfortunately, I don't have much beyond that. The log files provided clearly indicate that the connection was reset by the remote host, NOT that ePO itself terminated the connection. If you believe the problem has something to do with HTTP file transfer or port 8081, you could switch the repository to a UNC share, which uses neither.

              • 4. Re: Super Agent Replication Failed

                Thank you Jeremy,

                 

                I'm going to check the whole server configuration and install all the Windows updates and other fixes to try to solve the problem. As a last resort, I'll change the SA repository to UNC.

                 

                I'll let you know the result.

                 

                Regards, and thank you for your valuable help.