We are having problems with our Ironmail not delivering emails with medium to large attachments (5MB+).
Here is what I'm seeing in the logs. It looks like the DATA phase of the SMTP session starts, but after 5 mins or so the connection is closed. What I don't know is whether this is an issue with the Ironmail not sending the data, or the backend SMTP server not receiving the data.
Of note: internally we do NOT go through Ironmail, and we have not noticed any delivery issues there (the internal SMTP server is Postfix).
Version 6.7.2 Hotfix 6.
20111103:00:22:57|1373391|9474|Channel outbound flag -|0|
20111103:00:22:57|1373391|9475|Max retry attempts -|4|
20111103:00:22:57|1373391|9476|Starting to process msgid -|1373391|
20111103:00:22:57|1373391|9481|Processing Domain -|domain1.com|
20111103:00:22:57|1373391|9515|DNS Lookup Returned -|[(1, 'smtp.domain1.com', ('smtp.domain1.com',))] fromCache=False|
20111103:00:22:57|1373391|9516|Connecting to Domain -|domain1.com|
20111103:00:22:57|1373391|9487|Block timeout in seconds -|300|
20111103:00:22:57|1373391|9488|Connecting to MX -|smtp.domain1.com|
20111103:00:22:57|1373391|9489|Connecting to A -|smtp.domain1.com|
20111103:00:22:57|1373391|9491|Channels Vip vipid:bindhost -|0:18.104.22.168|
20111103:00:22:57|1373391|4099|Connecting to <BindHost:ConnectHost:ConnectPort> -|<22.214.171.124:smtp.domain1.com:25>|
20111103:00:22:57|1373391|4139|-|Reply: '220 msg.domain1.com -- Server ESMTP (Sun Java System Messaging Server ))'|
20111103:00:22:57|1373391|9492|Connection Status <status> -|1|
20111103:00:22:57|1373391|4139|-|Sending: EHLO domain1.com|
250 SIZE 0'|
20111103:00:22:57|1373391|9523|Starting SendSmtpMsg in domain -|domain1.com|
20111103:00:22:57|1373391|9570|BATV values are DSN_BVP_enable: <IsEnabled> mail_from: <Mail From> mdoutbound <IsOutbound> selfdeliveryMode <Delivery Mode> -|0:firstname.lastname@example.org:0:0|
20111103:00:22:57|1373391|4139|-|Sending: MAIL FROM:<email@example.com> size=32945431|
20111103:00:22:57|1373391|4139|-|Reply: '250 2.5.0 Address and options OK.'|
20111103:00:22:57|1373391|4139|-|Sending: RCPT TO:<firstname.lastname@example.org>|
20111103:00:22:57|1373391|4139|-|Reply: '250 2.1.5 email@example.com OK.'|
20111103:00:22:57|1373391|4139|-|Sending: DATA |
20111103:00:22:57|1373391|4139|-|Reply: '354 Enter mail, end with a single ".".'|
20111103:00:27:57|1373391|9563|Exception occurred: Type=<error type> Exception=<exp> -|<class 'ct_smtplib.SMTPServerDisconnected'>:Socket error sending DATA|
20111103:00:27:57|1373391|9526|No DSN to be generated for this message.||
20111103:00:27:57|1373391|9561|Delivery failure. <Notification message id> : <Retry Count> -|4|
20111103:00:27:57|1373391|9506|Closing SMTP Connection||
20111103:00:27:57|1373391|9480|Finished processing msgid -|1373391|
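As a rough cross-check of the log above, the gap between the 354 reply and the socket error can be compared with the "Block timeout in seconds" value. This is only a sketch: the helper names (parse_line, seconds_between) are made up for illustration, and it assumes every log line has the pipe-delimited "timestamp|msgid|code|..." layout seen here.

```python
from datetime import datetime

def parse_line(line):
    """Split one pipe-delimited log line into (timestamp, msgid, code, rest)."""
    stamp, msgid, code, rest = line.split("|", 3)
    return datetime.strptime(stamp, "%Y%m%d:%H:%M:%S"), msgid, code, rest

def seconds_between(first_line, second_line):
    """Elapsed seconds between the timestamps of two log lines."""
    return (parse_line(second_line)[0] - parse_line(first_line)[0]).total_seconds()

# Three lines copied from the log above.
timeout_line = "20111103:00:22:57|1373391|9487|Block timeout in seconds -|300|"
data_reply = "20111103:00:22:57|1373391|4139|-|Reply: '354 Enter mail, end with a single \".\".'|"
failure = ("20111103:00:27:57|1373391|9563|Exception occurred: Type=<error type> "
           "Exception=<exp> -|<class 'ct_smtplib.SMTPServerDisconnected'>:Socket error sending DATA|")

gap = seconds_between(data_reply, failure)
# The timeout value is the last non-empty field of its line.
block_timeout = int(parse_line(timeout_line)[3].rstrip("|").split("|")[-1])

print(gap, block_timeout)  # 300.0 300
```

The failure lands exactly one block timeout after the server's 354 reply, which suggests the send stalled for the whole timeout period and was not rejected outright.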
My first thought would be that it was a size issue; however, the EHLO response shows that shouldn't be the case. A packet capture would certainly shed some light here. For a timeout to occur, we would have to receive no packets for 5 minutes, which could be a network issue.
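To spell out the reasoning about the EHLO response: under RFC 1870, a SIZE value of 0 means the server has no fixed maximum message size in force, so the declared size=32945431 should not be refused on size alone. A minimal sketch of that check follows; the function names are hypothetical, and the EHLO line is a cleaned-up copy of the truncated one in the log.

```python
def size_limit_from_ehlo(ehlo_lines):
    """Return the advertised SIZE value in bytes, or None if SIZE is not advertised.

    A bare "SIZE" keyword with no number is treated here like 0 (no known limit).
    """
    for line in ehlo_lines:
        # Drop the "250-" / "250 " reply prefix, then split keyword and parameter.
        parts = line[4:].split()
        if parts and parts[0].upper() == "SIZE":
            return int(parts[1]) if len(parts) > 1 else 0
    return None

def size_would_be_rejected(limit, declared_size):
    """True only when a non-zero limit is advertised and the message exceeds it."""
    # RFC 1870: a SIZE value of 0 means no fixed maximum is in force.
    return limit is not None and limit > 0 and declared_size > limit

limit = size_limit_from_ehlo(["250 SIZE 0"])
print(limit, size_would_be_rejected(limit, 32945431))  # 0 False
```

Note that the server can still refuse a message for other reasons (quota, policy), so this only rules out the advertised fixed limit.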
A failure after the DATA command can mean a number of things, ranging from an appliance that sits between the Ironmail and the mail server, running as a transparent bridge and applying some sort of policy to the contents of the message, to a poor network or mismatched Ethernet duplex settings (half-duplex).
These are the steps I would take:
1.) You can try running the console command "show network interface" to determine the ethernet settings:
[McAfee]: show network interface
ID Intf Type Address Netmask mtu
-- ---- ---- ------- ------- ---
Interface-1: (MAC Address: a4:ba:db:46:f8:9c Status: active)
Media: (Ethernet autoselect (1000baseT <full-duplex>))
1 1 PRIMARY 10.10.10.81 255.255.192.0 1500
(Assigned To: WEBADMIN, CLI, VH_SMTPO_INBOUND, VH_SMTPO_OUTBOUND, VH_SMTPI_INBOUND, VH_SMTPI_OUTBOUND, SWD)
2 1 ALIAS 10.10.10.82 255.255.255.255 N/A
(Assigned To: IWM_PORTAL)
3 1 ALIAS 10.10.10.83 255.255.255.255 N/A
(Assigned To: EUQ)
2.) Make sure that the Ironmail has a direct connection to your mail server. If you don't have a network diagram, you can run a packet capture from your mail server to determine whether the MAC address it sees for the Ironmail is the same one shown by the command above.
3.) Check the settings of your mail server; it could be set to accept and then drop a message at a certain point.
The key thing to remember here is that it is failing after the DATA command, so at this point the packets will grow to be about 1500 bytes (depending on the network) and there is now content that can be analyzed for policy violations.
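To give a sense of scale, here is a back-of-the-envelope count of how many full-size packets the DATA phase needs for the message in the log. It assumes 40 bytes of IPv4 and TCP headers per packet with no options, and it uses the declared size= value, so the real on-the-wire count will differ somewhat.

```python
import math

MTU = 1500               # from the "mtu" column of "show network interface"
IP_TCP_HEADERS = 40      # assumption: IPv4 + TCP headers, no options
message_size = 32945431  # size= value from the MAIL FROM line in the log

payload_per_packet = MTU - IP_TCP_HEADERS
packets = math.ceil(message_size / payload_per_packet)

print(payload_per_packet, packets)  # 1460 22566
```

The short command/reply exchange before DATA fits in a handful of small packets, while the message body needs tens of thousands of full-size ones, so a link that only misbehaves under sustained load would show up exactly at this point.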
We've had these Ironmail appliances (2) in the current setup with the mail system for a few years. I'm 100% confident that no config changes have occurred on the backend mail server. I'm also confident that there isn't any sort of transparent bridge between the devices.
That being said, we did upgrade to Hotfix 6 last week, which has been the only change we've made to any of the mail components. Were there any changes to the timeout on the Ironmail device (e.g. it is now 5 mins but used to be X mins)?
Some attachments actually come through fine (possibly a decent percentage) but a lot are being left in the outbound queues.
I've been trying to find documentation on how to do a packet capture within the CLI.
I know it's "capture network traffic". I can select the interface, but I have no idea how to answer the next few questions (other than size, the others are 'options' and 'expressions' or something like that). Any ideas?
Message was edited by: kollross on 11/8/11 10:55:06 PM CST
It seems emails with large attachments are being stuck inbound. We receive Alert ERROR: Service <SMTPO> Cause: <ERROR> every 5 minutes. The messages eventually get sent, but some are taking days. I can go into the GUI, search the queue, and release the emails, but these messages shouldn't be taking this long to leave the queue. I've been on the phone with McAfee for hours, and on the last call they said the email server was taking a long time to accept the messages. Our email admins report everything looks normal. The only things that changed are HF6 and HF7.
At: Fri, 13 Jul 2012 13:28:17
Info: Unchanged Domain: 3 msgs (id:36582485 36596446 36596448 ) idle in XXXXX.com between 35min 40sec and 38min 30sec
The solution was to set the NIC to full duplex. We're not sure if the hotfixes changed the NIC setting. I only ended up discovering this because I resorted to checking basic settings. McAfee support was of little help. They applied a support script that was supposed to fix the issue we were having, but that didn't work either.
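For anyone wanting to check this across several appliances, the duplex setting can be pulled out of the "Media:" line that "show network interface" prints. This is just a sketch: duplex_of is a made-up helper, the full-duplex line is copied from the output earlier in the thread, and the half-duplex line is a hypothetical example of what a bad setting might look like.

```python
import re

def duplex_of(media_line):
    """Return 'full-duplex', 'half-duplex', or None from a 'Media:' line."""
    match = re.search(r"<(full|half)-duplex>", media_line)
    return f"{match.group(1)}-duplex" if match else None

good = "Media: (Ethernet autoselect (1000baseT <full-duplex>))"
bad = "Media: (Ethernet autoselect (100baseTX <half-duplex>))"  # hypothetical line

for line in (good, bad):
    setting = duplex_of(line)
    flag = "OK" if setting == "full-duplex" else "CHECK NIC"
    print(setting, flag)
```

Anything other than full-duplex on an interface that carries SMTP traffic is worth a closer look before digging into the mail server itself.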