Super Agent Replication Failed

Hi, I need your help. I came across your blog and tried everything, without success.

I have 13 SA repositories, and the problem is with 4 of them. The Super Agent repository replication task fails every day at the same point (after copying VSE870.msi), at about 45%, after 1 hour 45 minutes. I created a new test task using just one SA repository, and the result is the same: "Failure".

Here are all the changes I have tried and the McAfee environment configuration:

* EPO 4.5 Patch 1

* SA Repository Agent 4.0 Patch 3

* Enterprise 8.7

* Created a new task that replicates just one SA repository, on a different schedule from the master task (which replicates all the working SA repositories without failure), selected the necessary packages (Engine, DAT, Agent), and cleared the "Replicate legacy DATs" check box

* Reinstalled the agent on the SA repository (it is now using 4.0 Patch 3)

* Every day the replication fails at the same point (after copying VSE870.msi)

* I am using advanced logging on the ePO server

* It's not a DNS or NTFS permission problem

Please help me.

Here is part of the EPOAPSVR log file:

20100421042134 I #5284 naInet   ------------------------------------------------------------
20100421042134 I #5284 SIM_InetMgr Session 1 ended, result=1
20100421042134 I #5284 SiteMgr  GeneralInetRequestThreadProc: GeneralInetRequest thread ended
20100421042134 x #5284 SiteMgr  SiteMgr main control final release...
20100421043343 I #3084 naInet   HTTP Server returned success, HTTP return code: HTTP/1.0 200 OK

20100421043343 I #3084 SIM_InetMgr Uploaded file VSE870.msi successfully in session 1
20100421043343 I #1544 naInet   HTTP Session closed
20100421043343 I #1544 naInet   ------------------------------------------------------------
20100421043343 I #1544 SIM_InetMgr Session 1 ended, result=1
20100421043343 e #1544 SiteMgr  ReplicationThreadProc: Upload data to site ePOSA_NASAMCBO failed
20100421043343 e #2724 SiteMgr  ReplicationThreadProc: Replication finished with partial failure
20100421043343 x #2724 SiteMgr  SiteMgr main control final release...

Message edited by: Arnold Rojas on 21/04/10 04:11:39 PM CDT
McAfee Employee

Re: Super Agent Replication Failed

So I'm using the information you linked in the epoapsvr.log here:

http://community.mcafee.com/message/127442#127442

The last replication attempt failed to a site named ePOSA_NASAMCBO with these errors in the epoapsvr.log:

20100426025936 I #2192 naInet   HTTP Session initialized
20100426025936 I #2192 naInet   Connecting to HTTP Server in socket-mode
20100426025936 I #2192 naInet   Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081

20100426025957 E #2192 naInet   Failed to connect to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
20100426025957 E #2192 naInet   Socket error: 10060
20100426025957 I #2192 naInet   HTTP Session closed
20100426025957 I #2192 naInet   ------------------------------------------------------------
20100426025957 E #2192 SIM_InetMgr Start session for site upload failed
20100426025957 e #2192 SiteMgr  ReplicateSite: Failed to connect to site ePOSA_NASAMCBO
20100426025957 I #2192 SrvEvtInf Generating Event
20100426025957 e #2192 SiteMgr  ReplicationThreadProc: Upload data to site ePOSA_NASAMCBO failed

This is a straightforward failure to connect: ePO attempted to contact the super agent via the agent wake-up call port and failed to establish a connection. For this, you typically need to make sure there is a route from the ePO server to the machine hosting the super agent repository, confirm that the agent service is running on the client machine, and confirm that frameworkservice.exe is listening on port 8081 on the machine hosting the SA repository.
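As a rough way to test the route and the listening port from the ePO server, a plain TCP connection attempt can be scripted. This is only a minimal sketch in Python, not a McAfee tool: the function name `can_connect` and the 5-second timeout are assumptions made for illustration, and a successful connection only shows that something accepts connections on that port, not that it is frameworkservice.exe.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Any socket-level failure (DNS lookup failure, refusal, timeout)
    is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("NASAMCBO.ven.rsa-ins.com", 8081)`, using the host name and port from the log excerpt, could be run repeatedly around the time of the scheduled replication to see whether the connection failures are intermittent.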

This appears to be an intermittent network issue: a little higher in the log you can see it successfully connecting to the same site (this occurred the day before):

20100425030007 I #6208 naInet   Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
20100425030007 I #6208 naInet   Connected to HTTP Server: NASAMCBO.ven.rsa-ins.com

However, a little further along in the log, on the same thread, it fails while trying to send a file:

20100425032454 E #6208 NaiInet  Socket send error 10054
20100425032454 E #6208 naInet   Failes to upload data in bytes: 65536
20100425032454 I #6208 SIM_InetMgr Upload file avvdat-5962.zip failed in session 1, nainet ret=10054
20100425032454 I #6208 SiteMgr  ReplicationUploadFile: Failed to upload file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, hr=-2147467259, retry limit remaining: 4
20100425032454 I #6208 SiteMgr  ReplicationUploadFile: Uploading file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, retry limit remaining: 4
20100425032456 I #6208 SIM_InetMgr Uploading file avvdat-5962.zip from session 1, LocalDir=C:/PROGRA~1/McAfee/EPOLIC~1/DB\Software\Current\VSCANDAT1000\DAT\0000, RemoteDir=Current\VSCANDAT1000\DAT\0000
20100425032456 I #6208 naInet   Uploading file C:/PROGRA~1/McAfee/EPOLIC~1/DB\Software\Current\VSCANDAT1000\DAT\0000\avvdat-5962.zip to HTTP Server
20100425032456 I #6208 naInet   Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
20100425032517 I #6208 naInet   Failed to connect to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
20100425032517 I #6208 SIM_InetMgr Upload file avvdat-5962.zip failed in session 1, nainet ret=10054
20100425032517 I #6208 SiteMgr  ReplicationUploadFile: Failed to upload file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, hr=-2147467259, retry limit remaining: 3

It goes on to retry 5 times and then gives up. Notice the return code Windows is passing back to ePO:

20100425032454 E #6208 NaiInet  Socket send error 10054

So 10054 = Connection reset by peer (ref: http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx). This indicates that something other than ePO is closing the connection. It could be one of several things: perhaps the agent service hosting the repository stopped on the remote machine? It could also indicate that, at the time of the replication, the WAN was so overloaded that it couldn't process these requests in a timely fashion. The files the replication is failing on (avvdat-5962.zip and vse870.msi) are both larger files, so maybe you don't have the bandwidth required to send those files over the WAN? To test this, you could try manually copying one of those files from the ePO server to the machine hosting the super agent repository.
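To pull the socket errors out of a long epoapsvr.log without reading it line by line, something like the following Python sketch may help. The line layout (timestamp, level letter, #thread ID, component, message) is inferred only from the excerpts in this thread, so treat the regular expressions as assumptions. The function names are made up for this example. The two error names are the standard Winsock meanings of 10054 and 10060.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional, Tuple

# Standard Winsock meanings of the two codes that appear in this thread.
WINSOCK_ERRORS = {
    10054: "WSAECONNRESET (connection reset by peer)",
    10060: "WSAETIMEDOUT (connection timed out)",
}

# Assumed layout, based on the excerpts: timestamp, level, #thread, component, message.
LINE_RE = re.compile(r"^(\d{14})\s+(\w)\s+#(\d+)\s+(\S+)\s+(.*)$")
# Matches "Socket error: 10060", "Socket send error 10054" and "nainet ret=10054".
CODE_RE = re.compile(r"(?:[Ss]ocket (?:send )?error:?\s*|nainet ret=)(\d+)")


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    thread: int
    component: str
    message: str
    socket_error: Optional[int]


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not match the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp, level, thread, component, message = match.groups()
    code = CODE_RE.search(message)
    return LogEntry(timestamp, level, int(thread), component, message,
                    int(code.group(1)) if code else None)


def socket_errors(lines: Iterable[str]) -> Iterator[Tuple[str, int, int, str]]:
    """Yield (timestamp, thread, code, description) for lines carrying a socket error."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.socket_error is not None:
            description = WINSOCK_ERRORS.get(entry.socket_error, "unknown")
            yield entry.timestamp, entry.thread, entry.socket_error, description
```

Passing an open log file to `socket_errors` and grouping the results by thread ID would show whether each failing replication thread ends in a timeout (10060) or a reset (10054).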

I hope that helps get you going in the right direction.

Re: Super Agent Replication Failed

Hi Jeremy,

Thank you for your answer; it is a really good explanation. My comments are inline below, marked with "R="; please feel free to ask me if you have any other questions or doubts:

The last replication attempt failed to a site named ePOSA_NASAMCBO with these errors in the epoapsvr.log:

20100426025936 I #2192 naInet  HTTP Session initialized
20100426025936 I #2192 naInet  Connecting to HTTP Server in socket-mode
20100426025936 I #2192 naInet  Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081

20100426025957 E #2192 naInet  Failed to connect to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
20100426025957 E #2192 naInet  Socket error: 10060
20100426025957 I #2192 naInet  HTTP Session closed
20100426025957 I #2192 naInet  ------------------------------------------------------------
20100426025957 E #2192 SIM_InetMgr Start session for site upload failed
20100426025957 e #2192 SiteMgr  ReplicateSite: Failed to connect to site ePOSA_NASAMCBO
20100426025957 I #2192 SrvEvtInf Generating Event
20100426025957 e #2192 SiteMgr  ReplicationThreadProc: Upload data to site ePOSA_NASAMCBO failed

This is a straightforward failure to connect: ePO attempted to contact the super agent via the agent wake-up call port and failed to establish a connection. For this, you typically need to make sure there is a route from the ePO server to the machine hosting the super agent repository,

R= Yes, the SA hosting machine is a server in the same domain, at a remote branch connected over a dedicated 256KBps network link.

confirm that the agent service is running on the client machine, and confirm that frameworkservice.exe is listening on port 8081 on the machine hosting the SA repository.

R= Yes, the agent is running, and I checked the agent log and didn't see any events related to disconnections or similar.

This appears to be an intermittent network issue: a little higher in the log you can see it successfully connecting to the same site (this occurred the day before):

20100425030007 I #6208 naInet  Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
20100425030007 I #6208 naInet  Connected to HTTP Server: NASAMCBO.ven.rsa-ins.com

However, a little further along in the log, on the same thread, it fails while trying to send a file:

20100425032454 E #6208 NaiInet  Socket send error 10054
20100425032454 E #6208 naInet  Failes to upload data in bytes: 65536
20100425032454 I #6208 SIM_InetMgr Upload file avvdat-5962.zip failed in session 1, nainet ret=10054
20100425032454 I #6208 SiteMgr  ReplicationUploadFile: Failed to upload file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, hr=-2147467259, retry limit remaining: 4
20100425032454 I #6208 SiteMgr  ReplicationUploadFile: Uploading file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, retry limit remaining: 4
20100425032456 I #6208 SIM_InetMgr Uploading file avvdat-5962.zip from session 1, LocalDir=C:/PROGRA~1/McAfee/EPOLIC~1/DB\Software\Current\VSCANDAT1000\DAT\0000, RemoteDir=Current\VSCANDAT1000\DAT\0000
20100425032456 I #6208 naInet  Uploading file C:/PROGRA~1/McAfee/EPOLIC~1/DB\Software\Current\VSCANDAT1000\DAT\0000\avvdat-5962.zip to HTTP Server
20100425032456 I #6208 naInet  Connecting to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
20100425032517 I #6208 naInet  Failed to connect to Real Server: NASAMCBO.ven.rsa-ins.com on port: 8081
20100425032517 I #6208 SIM_InetMgr Upload file avvdat-5962.zip failed in session 1, nainet ret=10054
20100425032517 I #6208 SiteMgr  ReplicationUploadFile: Failed to upload file avvdat-5962.zip to site ePOSA_NASAMCBO::Current\VSCANDAT1000\DAT\0000, hr=-2147467259, retry limit remaining: 3

It goes on to retry 5 times and then gives up. Notice the return code Windows is passing back to ePO:

20100425032454 E #6208 NaiInet  Socket send error 10054

So 10054 = Connection reset by peer (ref: http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx). This indicates that something other than ePO is closing the connection.

It could be one of several things: perhaps the agent service hosting the repository stopped on the remote machine?

R= The agent service has not been stopped; the agent log shows no events related to the service stopping or similar.

It could also indicate that, at the time of the replication, the WAN was so overloaded that it couldn't process these requests in a timely fashion.

R= I have tried running the task on different schedules and at different times of day (peak and off-peak hours), and the result is always the same.

The files the replication is failing on (avvdat-5962.zip and vse870.msi) are both larger files, so maybe you don't have the bandwidth required to send those files over the WAN?

R= OK, I agree with you that both are large files, but my question is: why, when I copy both of them, or even larger files, using Windows copy (copy and paste from the remote path to the local path), do I never get an error or failure, and the copy always finishes without problems?

To test this, you could try manually copying one of those files from the ePO server to the machine hosting the super agent repository.

R= I did this at different times, and copying from the ePO server to the hosting machine works fine, without problems.

I think something is going wrong with the HTTP connections or with port 8081, but I don't know how to detect or solve this issue.
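To put the bandwidth question in numbers, the ideal transfer time of a file over the branch link can be estimated and compared with how long the replication actually runs before failing. This is only a back-of-the-envelope sketch in Python: the 100 MiB file size is a made-up placeholder (the real sizes of VSE870.msi and avvdat-5962.zip are not given here), and "256KBps" is ambiguous, so both the kilobit and kilobyte readings are computed.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: float,
                     efficiency: float = 1.0) -> float:
    """Estimate the time to move size_bytes over a link.

    efficiency (0 < efficiency <= 1) is the assumed fraction of the raw
    link rate that is actually available to this transfer.
    """
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


# Hypothetical file size, for illustration only.
size = 100 * 1024 * 1024

# Reading "256KBps" as 256 kilobits per second.
as_kilobits = transfer_seconds(size, 256_000)

# Reading "256KBps" as 256 kilobytes per second.
as_kilobytes = transfer_seconds(size, 256_000 * 8)

print(f"256 kbit/s: {as_kilobits / 60:.1f} min, 256 kB/s: {as_kilobytes / 60:.1f} min")
```

If the real file sizes give an estimate far shorter than the observed run time, the link is probably shared or throttled during replication, which would point back toward the network rather than the agent.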

I really appreciate your help; I'll be waiting for your comments.

Regards,

McAfee Employee

Re: Super Agent Replication Failed

Unfortunately, I don't have much beyond that. The log files provided clearly indicate that the connection was reset by the remote host and NOT that ePO itself terminated the connection. If you believe the problem has something to do with an HTTP file transfer or port 8081, then you could switch the repository to a UNC share, which uses neither.

Re: Super Agent Replication Failed

Thank you Jeremy,

I'm going to check the whole server configuration and install all the Windows updates and anything else needed to try to solve the problem, and as a last resort I'll change the SA repository to UNC.

I'll let you know the result.

Regards, and thank you for your valuable help.
