When the pings fail, do they start working again after a short time (10 secs at most)?
When the unit populates the firewall rules, which can happen for a number of reasons, the device will fail to pass traffic for a few seconds. You may see this with pings, but not with other traffic, because of retransmissions.
It is intermittent and does resume after a few seconds.
> When the unit populates the firewall rules, which can happen for a number
> of reasons, the device will fail to pass traffic for a few seconds. You
> may see this with pings, but not with other traffic, because of retransmissions.
Also UDP traffic (e.g. RADIUS) would be dropped. This is giving us some QoS problems.
A few questions:
1. Between two failures I see a number of successful pings coming in over our IPSEC VPN connection. Since that is trusted traffic, is it bypassing the rules and being passed to the LAN port even while the rules are being populated, or does this mean we have a different problem?
2. I have also had some ping failures over the IPSEC VPN, but none since I started packet captures, so I don't have a capture showing what was going on when those failed. If IPSEC traffic is passed even while the rules are being populated, then at least these failures are something different.
3. How often would this populating of rules happen?
4. I'm using SecureComputing/SG570 Version 3.1.5u4. Is there anything I can do to minimize the impact of this behavior? A faster device? Maybe newer firmware that doesn't block traffic while loading the rules?
1. This would be due to the pings being buffered until they can be sent, but the buffer is of limited size.
2. All traffic, including VPN traffic, will be interrupted for a short time, ideally all waiting in the buffer to be processed. If you do have a separate issue here, we can investigate further with diagnostics.
3. The population occurs whenever changes need to be made to the firewall rules. That includes config and setup, but once operational, it should only happen when an interface is brought up/down, including VPN interfaces.
4. Options include reducing the load by minimizing packet filter rules, not using DNS names in rules, using a dedicated VPN unit if there are many PPTP VPNs connecting and disconnecting, or upgrading to a faster unit as a last resort. The current firmware works the same way.
I assume this is impacting your services?
In which way?
This is impacting services in a couple of different ways: (a) dropped RADIUS (UDP) packets that then have to wait for a timeout and resend, and (b) customer perception of the problem.
I don't think the loading of rules is the problem. None of the interfaces are going up or down when the packet loss occurs (including any VPN tunnels). I even checked syslog to see whether IPSEC was negotiating new keys, and it was not.
Today I had the same problem, but instead of over eth1 it was across the IPSEC tunnel, where I can see the ping come in on the ipsec0 interface but never get sent out the eth0 interface.
A couple of weeks ago (before we had the capture sorted out) we reset the firewall and had no alarms for a solid week. I'm curious whether the alarms would go away for a while again if I did another reset, but I don't want to do that until I hear from you, in case there are some diagnostics you want to see first.
If it interrupts RADIUS packets, that would indicate a VPN session is coming up and, as such, the firewall rules being reloaded. That is a guess, though.
I suggest you send in a TSR generated when the issue occurs; it may well be easy to spot from there.
Thanks for your response. I want to make sure I understand this correctly:
1. Any time an interface comes up the packet rules are reloaded. This could take 2-10 seconds depending on the number of rules involved. Correct?
2. An interface coming up includes all (IPSEC, PPTP, etc.) VPN interfaces. Correct?
3. When an IPSEC VPN that is already connected does a phase 1 or phase 2 rekeying, does that also reload the rules?
I've blocked all PPTP until I can sort this out and have everyone only connecting from IPSEC connected networks.
Here's a scenario that causes me to question the idea that my dropped packets are from loading of rules:
time 09:27:02.912008 successful ping over eth1
time 09:27:02.997008 failed ping request over eth1 (never sent out eth0 to server)
time 09:27:03.262508 successful ping over ipsec0
So I have a ping request that came in on eth1 but was never sent out eth0 to the server, with a successful ping over eth1 a fraction of a second before and a successful ping over ipsec0 a fraction of a second after. If the problem were the loading of packet filtering rules, would we see this? Wouldn't we have a longer window of failed packets?
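To put numbers on that window, the gaps can be worked out from the three capture timestamps listed above. This is only a rough Python sketch: the tuple layout and the "ok"/"lost" labels are my own notation for illustration, not capture output, and the function name is made up.

```python
from datetime import datetime

# Timestamps copied from the three capture lines above.
# The (time, interface, status) layout is an assumption for this sketch.
events = [
    ("09:27:02.912008", "eth1", "ok"),
    ("09:27:02.997008", "eth1", "lost"),
    ("09:27:03.262508", "ipsec0", "ok"),
]

def gaps_around_losses(events):
    """For each lost ping, return (seconds since last good ping,
    seconds until next good ping). A side is None if no good ping exists there."""
    parsed = [(datetime.strptime(t, "%H:%M:%S.%f"), status) for t, _iface, status in events]
    results = []
    for i, (ts, status) in enumerate(parsed):
        if status != "lost":
            continue
        before = next((p[0] for p in reversed(parsed[:i]) if p[1] == "ok"), None)
        after = next((p[0] for p in parsed[i + 1:] if p[1] == "ok"), None)
        results.append((
            (ts - before).total_seconds() if before is not None else None,
            (after - ts).total_seconds() if after is not None else None,
        ))
    return results

for before, after in gaps_around_losses(events):
    print(f"last good ping {before:.4f} s before, next good ping {after:.4f} s after")
```

With these three samples the lost request sits about 0.085 s after a good ping and about 0.2655 s before the next one, so traffic was passing again well within half a second.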
BTW how soon after a failure do you need the TSR or would giving it to you now be just as useful (last failure 14 hours ago)?
I can't post it here so what email address should I send it to?
Generating the TSR as soon as you can after the issue is best, since the UTM
has limited storage for logs.
Correct, we can't post TSRs here.
You will need to submit it to support via
Mention this thread in the ticket.
1. Correct, but 20 secs would be abnormal.
You can tell the runtime from the command line by entering this command at the command prompt
and timing how long that takes to run. Then subtract 2 secs, as it has a sleep of 2 secs coded in.
3. No, firewall rules do not need to be reloaded on an IPSec rekey.
As mentioned, a TSR is really required. I can keep hypothesizing about what the issue may be, or I can see the TSR and no doubt identify the issue.
Another factor that can affect some customers in busy environments is flood rate limiting, which can be disabled via
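The timing check described under point 1 could be scripted roughly as below if you run the command through a remote shell from a workstation that has Python. That setup is an assumption, and a stopwatch at the unit's own prompt does the same job. The function names are made up for this sketch, and the command itself is not filled in here: pass whatever command was given for point 1.

```python
import subprocess
import time

CODED_SLEEP_SECS = 2.0  # per point 1: the command has a 2 sec sleep coded in

def timed_run(argv):
    """Run a command (list of arguments) and return its wall-clock time in seconds."""
    start = time.monotonic()
    subprocess.run(argv, check=False)
    return time.monotonic() - start

def reload_runtime(elapsed_secs, coded_sleep=CODED_SLEEP_SECS):
    """Subtract the built-in sleep from the measured time; never go below zero."""
    return max(0.0, elapsed_secs - coded_sleep)

# Example with a made-up stopwatch reading of 6.5 s for the whole command:
print(f"rule load took roughly {reload_runtime(6.5):.1f} s")
```

If the command is timed over a network session, the round trip adds a little to the measured value, so treat the result as an upper bound.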
Firewall -> Connection Tracking -> Enable Flood Rate Limiting