but we have seen several instances of this during the initial sync. This causes a problem: it initially creates a record (correct machine name and new Machine ID) and starts the sync of users, etc., but then fails, which ultimately results in 'Abandoning boot protection installation':
>>
Checking for SSO updates
Error [5c000012]: Failed while sending communications data
Checking for hashes updates
Checking for file updates
Error [5c000012]: Failed while sending communications data
Abandoning boot protection installation
Applying configuration
Synchronizaion complete (???? what, how?!?! ;-)
<<
However, when the machine is rebooted and can sync, it seems to be unaware of its previous registration and creates a duplicate record, i.e. <machine name>0001 with a new Machine ID (db object ID).
As a result we have several duff entries in our db and duplicate/incorrectly named records...
I'm not exactly sure of the circumstances involved in each of these cases; one of our technicians has just brought it to our attention. It seems to have happened approximately 10 times in 100 or so machine builds.
Any thoughts or ideas on how to mitigate this problem?
It could be that your database is slow. Is the SBData folder on a local drive or is it on a NAS? Make sure the database is stored locally, on a fast drive. Also make sure that a backup or reporting job is not running during your deployment. Finally, if your environment is 5,000+ nodes, you should look into enabling indexing on your database. This basically entails putting a dbcfg.ini file at the root of SBDATA. The parameters are shown in the admin guide, but I'm sure support could send you one or walk you through the config.
It could also be that your network is slow. When customers report this issue it is almost always when they have a server in one country and users in a distant country. A common root cause here is a poor WAN link. Are the failing installs happening in a far away land? If not, this will become a general network troubleshooting exercise. Network problems can also occur if you have too many concurrent connections. We support 200 out of the box, but those can be quickly consumed if you are assigning a large number of users to each machine. It takes ~1 second for each user account to be pulled down to a machine during that initial sync. So if you are pulling down 1,500 users, then the sync will take 25 minutes to complete. That means connections will start to get dropped if you deploy to more than 200 machines in 25 minutes.
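To put numbers on that for your own environment, here is a rough back-of-the-envelope sketch. It only uses the figures quoted above (about 1 second per user, 200 connections out of the box); the function names are made up for illustration, and it assumes each machine holds one connection for the whole initial sync, which may not match exactly how the product behaves.

```python
# Rough capacity estimate for the initial sync, using the figures from the post:
# ~1 second per assigned user and 200 concurrent connections by default.
# Function names are hypothetical; this is arithmetic, not a product API.

def initial_sync_minutes(users_per_machine, seconds_per_user=1.0):
    """Approximate duration of one machine's initial sync, in minutes."""
    return users_per_machine * seconds_per_user / 60


def max_start_rate_per_minute(users_per_machine, connection_limit=200,
                              seconds_per_user=1.0):
    """Approximate sustained number of new machines per minute that can start
    syncing without exceeding the connection limit, assuming each machine
    holds one connection for its whole initial sync."""
    return connection_limit / initial_sync_minutes(users_per_machine,
                                                   seconds_per_user)


users = 1500
print(f"Initial sync takes about {initial_sync_minutes(users):.0f} minutes")
print(f"Sustainable rate: about {max_start_rate_per_minute(users):.0f} machines per minute")
```

If the rate at which your technicians build machines is above that second number, assigning fewer users per machine or staggering the builds would be the first things to try.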
One final thing to check on the network side is the client firewall. I saw this once and the customer had the Windows firewall turned on. We made an exception for port 5555 (the default port used by our agent) and then the error stopped. You'll notice that the other post said this issue was fixed if they forced a sync from the server. That probably worked because forced syncs come on port 5556, which the firewall may not mind.
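If you want a quick way to see whether a firewall is in the way, a small script like the one below can test whether a TCP connection on a given port can be opened at all. This is a generic sketch, not part of the product: it assumes the traffic on 5555 and 5556 is TCP, and the host name in the usage comment is a placeholder. Run it from whichever side initiates the connection in your setup.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage (placeholder host name, ports taken from the post):
#   for port in (5555, 5556):
#       print(port, "open" if can_connect("server.example.com", port) else "blocked or closed")
```

A False result only tells you the connection could not be made; it does not say whether the cause is a firewall, a stopped service, or a routing problem, so treat it as a first check.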