We've been having serious trouble over the last couple of days with email delivery to McAfee customers. Messages are being deferred with:
Jun 9 17:57:30 mailer delivery/smtp: 1BC0F232B9: to=<firstname.lastname@example.org>, relay=b.com.inbound15.mxlogicmx.net[126.96.36.199]:25, conn_use=145, delay=20881, delays=20800/81/0.19/0.12, dsn=4.0.0, status=deferred (host b.com.inbound15.mxlogicmx.net[188.8.131.52] said: 451 Exceeding connection limit: RBLDNSD (a0927755.0.18328025.00-2294.33131513.p01c11m095.mxlogic.net) (Mode: normal) (in reply to RCPT TO command))
and as a result we're holding tens of thousands of valid emails.
To be clear here: nothing has changed on our end in months. We are delivering the same content to the same customers at the same volume. All of these emails are communication between paying customers of our product. These are very well known IPs.
So far in Postfix I've reduced our concurrent delivery to 2 connections per recipient domain and added a 1 second rate delay between deliveries. I've also verified we're using the default backoff (minimal_backoff_time of 300 seconds, maximal_backoff_time of 4000 seconds) when messages are deferred and retried. Unfortunately none of this has made a difference.
Apologies for bringing this here, but we've tried multiple avenues and haven't managed to find anyone who can actually help us get this fixed.
Has anyone dealt with this before?
I can confirm that we started experiencing this same issue in the last 24 hours. I wasn't aware that it was limited to McAfee customers. I can email some domains without issue, but for certain others every message comes back with the 451 error code.
I have also confirmed that every domain we get the error for has mxlogic.net in the text of the error response.
The number of emails we send to each domain varies: some get as few as 2, others up to ~100.
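For anyone who wants to check their own logs the same way, here is a minimal Python sketch that tallies these deferrals per recipient domain and relay host. It assumes Postfix smtp delivery lines shaped like the one quoted at the top of the thread; the function name and the regular expression are my own and may need adjusting for other log formats.

```python
import re
from collections import Counter

# Pulls the recipient, relay host, status and reason out of a Postfix
# smtp delivery log line. Written against the sample line in this thread.
LINE_RE = re.compile(
    r"to=<(?P<rcpt>[^>]*)>.*?"
    r"relay=(?P<relay>[^\[,\s]+).*?"
    r"status=(?P<status>\w+) \((?P<reason>.*)\)\s*$"
)

def deferred_by_relay(lines):
    """Count 'Exceeding connection limit' deferrals per (domain, relay)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m or m.group("status") != "deferred":
            continue  # not a delivery line, or not a deferral
        if "451 Exceeding connection limit" not in m.group("reason"):
            continue  # deferred for some other reason
        domain = m.group("rcpt").rpartition("@")[2].lower()
        counts[(domain, m.group("relay").lower())] += 1
    return counts
```

Pass it any iterable of log lines, for example an open mail log file, and look at whether every relay in the result belongs to the same provider.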
McAfee added some sort of rate limit protection recently. Maybe it's too damn aggressive, but you can disable it.
What a pain this turned out to be. We went four days with no email from our content management system. One of the biggest problems was that this feature was rolled out without notice, and the administrative console gives no visibility into non-quarantined messages that are blocked. Unless someone asks you to go hunting for specific missing messages, you won't know they are being blocked.
This will likely be a decent fix for something like a content management system which blasts out emails to everyone in our organization anytime there is a new post or an update:
Add the affected sender IP address or domain to your allow list. Senders on the allow list are not rate limited.
From the support article:
It is important to note that not all senders are held to the same rate limiting threshold. Factors including historical volume and machine learning algorithms are taken into account so that high volume senders, such as a large retailer who sends a daily newsletter to millions of subscribers, will not be rate limited at the same threshold as an individual sender on Gmail.
It's obvious to me that the major failure in this implementation was that the 'spam flood prevention' was enabled before any historical volume data had been gathered. There was obviously some default starting threshold in play when the feature was turned on, which caused legitimate emails to be blocked when the number of messages was relatively high. The correct way to roll this feature out would be to let it run for a week or a month in 'learning mode' so that it has a baseline from which to determine what a normal volume of messages is for each sender.
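To make the 'learning mode' idea concrete, here is a rough Python sketch of a per-sender limiter that only observes for a set number of periods and enforces a limit afterwards. This is purely my own illustration: the class name, parameters and thresholds are hypothetical, and nothing here is based on how McAfee's feature actually works.

```python
from collections import defaultdict

class LearningRateLimiter:
    """Hypothetical per-sender limiter: observe first, enforce later."""

    def __init__(self, learning_periods=7, headroom=3.0, floor=100):
        self.learning_periods = learning_periods  # periods to observe before enforcing
        self.headroom = headroom                  # allowed multiple of the observed peak
        self.floor = floor                        # minimum limit for low-volume senders
        self.history = defaultdict(list)          # sender -> counts per finished period

    def record_period(self, sender, count):
        """Store how many messages the sender sent in a finished period."""
        self.history[sender].append(count)

    def limit_for(self, sender):
        """Return the per-period limit, or None while still learning."""
        seen = self.history[sender]
        if len(seen) < self.learning_periods:
            return None
        return max(self.floor, int(max(seen) * self.headroom))

    def allow(self, sender, count_this_period):
        """Allow everything during learning; afterwards enforce the limit."""
        limit = self.limit_for(sender)
        return limit is None or count_this_period <= limit
```

With something like this, a content management system that regularly sends a burst to the whole organization would have that burst in its baseline before any blocking starts.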
One of my clients is unable to receive any mail from anyone with an AOL address. They are all getting that same connection limit error.
I like the idea of the rate limit protection, so I'd like to leave it in place to see if it does cut back on spam, but we're losing too much incoming mail right now because of it.
We have also experienced the same issue, but with only one sender. The diagnostic information states that the mxlogic.net server denied the message because of too many connections. My question is: why would I disable that feature? Can it be adjusted to accommodate the environment and its "abnormal connections"?
For the forum archives:
We never did find anyone at McAfee to talk to about this; it just magically fixed itself around 7 pm Eastern on Tuesday. They lifted the limit and we quickly dumped tens of thousands of queued emails to McAfee email security customers, which I'm sure was extra fun for those affected on Wednesday morning.
All in all this was a pretty terrible experience. I'd love to see a write-up from McAfee about what happened.