O.K., that's interesting to know (works without NAT) but I need it to work WITH NAT. I can't scrap my NAT boxes and only work from public IPs all the time. Especially when every other device (even SG580s without 4.x firmware) works perfectly from behind the NAT.
Until McAfee comes up with a fix for this I still have to freeze deployment of 4.x firmware AND be very wary of new UTM deployments.
Here's my output to the syslog with debugging enabled for the PPTP Server:
Jan 12 13:35:19 pptpd: CTRL: Client X.X.X.X control connection started
Jan 12 13:35:20 pptpd: CTRL: Starting call (launching pppd, opening GRE)
Jan 12 13:35:20 pppd: pppd 2.4.4 started by root, uid 0
Jan 12 13:35:20 pppd: Using interface ppp0
Jan 12 13:35:20 pppd: Connect: ppp0 <--> /dev/pts/0
Jan 12 13:35:23 packet: Invalid - dropped: IN=eth1 MAC=00:1e:be:ff:2d:05 FAMILY=00
Jan 12 13:35:50 last message repeated 1 time(s)
Jan 12 13:35:50 pppd: LCP: timeout sending Config-Requests
Jan 12 13:35:50 pppd: Connection terminated.
Jan 12 13:35:50 pptpd: CTRL: EOF or bad error reading ctrl packet length.
Jan 12 13:35:50 pptpd: CTRL: couldn't read packet header (exit)
Jan 12 13:35:50 pptpd: CTRL: CTRL read failed
Jan 12 13:35:50 pptpd: CTRL: Reaping child PPP
Jan 12 13:35:50 pppd: Modem hangup
Jan 12 13:35:50 pppd: Exit.
Jan 12 13:35:50 pptpd: CTRL: Client X.X.X.X control connection finished
Not much to go on. This has to get fixed ASAP.
This syslog entry
Jan 12 13:35:50 pppd: LCP: timeout sending Config-Requests
means the UTM device is not getting any response to the GRE packets it is sending.
This is usually caused by a NAT device not doing NAT correctly. KB62307 has this info which I will post here due to the frequency of this issue:
Summary: NAT devices can create challenges for a successful and reliable connection for PPTP VPN sessions. To understand the problem with PPTP and NAT, we must first understand how PPTP works in further detail.
PPTP clients initiate all connections using TCP port 1723 which allocates session IDs for the particular connection. Once the initial handshaking is done using this port, communications switch to GRE (protocol number 47). The initial session IDs that were setup by the TCP 1723 control connection are maintained. Then, the TCP 1723 control connection is only used for echo requests and echo replies between the client and server so that both ends know the remote end is still alive. The TCP 1723 control connection is also used to terminate the VPN connection.
To perform NAT correctly for PPTP connections, the NAT device must be able to associate a GRE protocol connection with the correct client or server by inspecting the IDs contained in the GRE packets. It also must perform NAT on the IDs to ensure they are unique because multiple PPTP connections from different clients or servers may use the same IDs.
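To make "inspecting the IDs contained in the GRE packets" concrete, here is a minimal Python sketch that reads the call ID out of the enhanced GRE header PPTP uses. The field layout follows RFC 2637; the function name parse_pptp_gre and the sample bytes are made up for illustration, and this is not taken from any NAT device's or the UTM Firewall's actual code.

```python
import struct

# Protocol type carried in PPTP's enhanced GRE header (RFC 2637).
PPTP_GRE_PROTOCOL = 0x880B


def parse_pptp_gre(header: bytes) -> dict:
    """Decode the fixed 8-byte part of a PPTP enhanced GRE header.

    Hypothetical helper for illustration only. Layout per RFC 2637:
    flags/version (2 bytes), protocol type (2), payload length (2),
    call ID (2), all in network byte order.
    """
    if len(header) < 8:
        raise ValueError("enhanced GRE header needs at least 8 bytes")
    flags, proto, payload_len, call_id = struct.unpack("!HHHH", header[:8])
    if proto != PPTP_GRE_PROTOCOL:
        raise ValueError("not a PPTP GRE packet (protocol type %#06x)" % proto)
    return {
        "version": flags & 0x0007,        # 1 for enhanced GRE
        "has_seq": bool(flags & 0x1000),  # S bit: sequence number follows
        "has_ack": bool(flags & 0x0080),  # A bit: acknowledgment number follows
        "payload_len": payload_len,
        "call_id": call_id,               # the ID a NAT device has to track
    }


if __name__ == "__main__":
    # Made-up sample header: key and sequence bits set, version 1, call ID 0x1234.
    sample = bytes.fromhex("3001880b00001234")
    print(parse_pptp_gre(sample))
```

The call ID is the only field in the GRE packet that distinguishes one PPTP session from another between the same pair of IP addresses, which is why a NAT device that ignores it cannot tell sessions apart.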
Simple NAT devices will only use the IP addresses of the initiating client and the destination server when they perform the NAT translation and will not inspect the IDs inside the GRE packets. These devices perform NAT correctly when there is only one PPTP session from any client or to any server. However, when there are multiple PPTP sessions, these simple NAT devices are not able to associate the correct GRE connection with the correct client or server. They will send the 2nd session's GRE packets to the incorrect client or server, as per the NAT translation table for the 1st session.
The UTM Firewall is able to inspect the session IDs inside the GRE packets and will correctly perform NAT for PPTP sessions.
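The difference between the two kinds of NAT device described above can be sketched as a toy model in Python. Everything here is an assumption made for illustration (the class names SimpleNat and CallIdNat, the addresses, the way public call IDs are allocated); it is a simplified model of the behaviour, not how any particular router or the UTM Firewall implements it.

```python
class SimpleNat:
    """Toy model: tracks GRE by server IP only and ignores call IDs."""

    def __init__(self):
        self.table = {}  # server_ip -> client_ip of the first session

    def outbound(self, client_ip, server_ip, call_id):
        # The first session to a server creates the mapping; later
        # sessions to the same server do not get one of their own.
        self.table.setdefault(server_ip, client_ip)
        return call_id  # call ID passes through untouched

    def inbound(self, server_ip, call_id):
        client_ip = self.table.get(server_ip)
        return (client_ip, call_id) if client_ip else None


class CallIdNat:
    """Toy model: tracks GRE per call ID and rewrites IDs to keep them unique."""

    def __init__(self):
        self.table = {}   # (server_ip, public_call_id) -> (client_ip, call_id)
        self.next_id = 1  # hypothetical allocator for public call IDs

    def outbound(self, client_ip, server_ip, call_id):
        public_id = self.next_id
        self.next_id += 1
        self.table[(server_ip, public_id)] = (client_ip, call_id)
        return public_id  # the server sees this ID instead of the client's

    def inbound(self, server_ip, public_call_id):
        return self.table.get((server_ip, public_call_id))


if __name__ == "__main__":
    server = "203.0.113.5"  # documentation address, stands in for the VPN server
    for nat in (SimpleNat(), CallIdNat()):
        # Two clients behind the same NAT happen to pick the same call ID.
        id_a = nat.outbound("192.168.1.10", server, 7)
        id_b = nat.outbound("192.168.1.11", server, 7)
        print(type(nat).__name__, nat.inbound(server, id_a), nat.inbound(server, id_b))
```

In this model the simple NAT delivers the second session's GRE packets to the first client, which matches the failure the article describes, while the call-ID-aware NAT keeps the two sessions separate.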
Well, o.k., but why did this change (for the worse, it appears) with firmware 4.x on the SG's? I never had (and still don't have) this problem with SGs running 3.x firmware.
Also, even with no other PPTP sessions running I get the same error trying to PPTP to my SG580 with the 4.x firmware.
Would it help if I set up port forwarding on my NAT box to just send all port 1723 traffic to my desktop computer?
Nothing has changed in this regard from version 3 to 4.
The syslog message I pointed out also exists in version 3 for exactly the same reasons.
The document I posted explains why this occurs, even with no other sessions running, and why it appears 'random'.
As per the document above, it is the GRE packets that you/your NAT device need to forward, not TCP 1723.
Or you could reboot your NAT device, assuming it is capable of knowing what to do with the GRE packets at least on a basic level.
O.K., but again....PPTP worked perfectly for YEARS before the upgrade to 4.x and it still works perfectly on a half-dozen other SG devices that are running 3.x firmware. So something changed. I've already rebooted the NAT device to no avail.
Next time I'm at a different location (rare) I'll try to connect from there to take this NAT device out of the equation.
That error message was posted from my attempted connection to an SG310 running 4.0.5 that would not work behind NAT. With a public IP address it works fine. When that firewall was running 3.2.2 I could connect just fine from behind another firewall; now, since upgrading to 4.0.X, I suddenly cannot.
The other customer was running 3.2.2 and could connect from all over the place (he travels around the US), and now since upgrading to 4.0.5 he cannot connect from anywhere NAT'ed.
This worked perfectly before, why did it change?
After supporting the product for nearly 6 years I can confidently say nothing has changed and I see this issue often.
It is the nature of the problem that has this 'random' failure.
It is not random at all, but appears that way, and we try to make sense of it in ways such as what firmware version are we running.
If you do a packet capture you will see the UTM Firewall sending GRE packets and getting no reply. Likewise, if you do a packet capture on the client you may see it sending the GRE packets, but these packets never arrive at the other end. You may also see on the client that it never gets the GRE packets from the UTM device. The UTM device sends them, but they never arrive.
It's definitely not random, I can connect to 3.X just fine, 4.X not at all behind NAT
I don't know what to tell you, Ross. This desktop machine has about two dozen VPN connections (almost all PPTP) listed - never more than 1 or 2 active at a time. I log into and out of them on a regular basis to do various maintenance tasks on those remote networks. Of those two dozen connections about 6 or 7 are SG580s. The rest are an assortment of things from SonicWalls to WatchGuard and others.
The NAT box I'm behind here is a Linksys WRT54G. I've been behind it for at least 3 years I'm sure. During that time I have PPTPd to those two dozen connections day in and day out and they've worked 99% of the time.
I have also spent countless hours PPTP'd into the SG580 in question from this desktop behind this NAT router. 99% success rate.
This desktop spends days on end connected via PPTP to a SonicWall TZ170 at my downtown office. 99% success rate. It's connected to that location right now in fact. Sometimes I'm VPN'd to that SonicWall from more than one machine behind this same NAT router at the same time. Works just fine, 99% of the time.
A couple of weeks ago I upgraded the SG580 to firmware 4.0.5 and suddenly...PPTP to that unit fails every time. 0% success rate. Every other SG580 I connect to (all of them running firmware 3.x) still works 99%. Last night I connected to an SG580 running the 3.x firmware and it worked perfectly, as it pretty much always does. Every other firewall from other vendors is 99%. That one unit hasn't worked since the upgrade.
And not only doesn't the built-in PPTP server work, but it won't successfully pass-thru PPTP traffic to an internal server running RRAS anymore either.
The only thing that changed was the firmware rev on the SG580 and now PPTP to that site fails 100% of the time.
That doesn't sound very random to me.