Web Gateway: Understanding NTLM and Windows Domain Membership

Introduction

This article covers the requirements, configuration, common issues, and troubleshooting of Active Directory (AD) NTLM domain communication on the Web Gateway (MWG). Because NTLM is the most commonly used form of authentication, the article also addresses the questions and issues we see most often in support and aims to make the topic easier to understand overall. It does not cover authentication issues such as intermittent authentication prompts.

 

 

Prerequisites

For NTLM authentication, the MWG must become a member of your AD domain. There are a few things you have to make sure are set up correctly for this to work:

1. Web Gateway must be able to connect to your AD server over TCP port 445 (no other ports are required).

2. For successful NTLM authentication, the MWG needs both the IP address (for TCP-level communication) and the Fully Qualified Domain Name (FQDN from here on) of the domain controller (for SMB-level communication). One of the two (either IP or FQDN) is provided in the MWG configuration. You have to ensure that the other one can be resolved by your DNS.

3. When initially setting up the domain membership on the MWG, a domain administrator account is needed so a computer account can be created in AD for the MWG. Keep in mind that the domain administrator account is only used for the MWG account creation on the domain and those credentials are not stored on the MWG.
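
As a rough pre-flight check for prerequisites 1 and 2, the following Python sketch may help. It is not part of the Web Gateway; the function names are hypothetical and chosen for illustration only. It tests whether TCP port 445 can be opened and whether the half of a DC entry (IP or FQDN) that is not configured can be resolved by DNS. It assumes you run it from a host that uses the same DNS servers and network path as the Web Gateway.

```python
import ipaddress
import socket


def is_ip(entry):
    """Return True if the configured DC entry is an IP address, not a name."""
    try:
        ipaddress.ip_address(entry)
        return True
    except ValueError:
        return False


def resolve_other_half(entry, forward=socket.gethostbyname,
                       reverse=socket.gethostbyaddr):
    """Resolve the half (IP or FQDN) that is NOT given in the configuration.

    Returns the resolved value, or None if DNS cannot resolve it.
    The resolver arguments are injectable so the logic can be tried offline.
    """
    try:
        if is_ip(entry):
            return reverse(entry)[0]   # reverse DNS: IP -> hostname
        return forward(entry)          # forward DNS: hostname -> IP
    except OSError:                    # covers socket.gaierror and socket.herror
        return None


def port_445_reachable(host, timeout=5.0):
    """Return True if a TCP connection to port 445 on host can be opened."""
    try:
        with socket.create_connection((host, 445), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `resolve_other_half("10.10.95.12")` returning `None` would suggest that the reverse DNS record for that DC is missing, which corresponds to the reverse DNS failure described in the troubleshooting section.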

 

 

Configuration

The first step in configuration is to join the Web Gateway(s) to the domain(s) you will authenticate against. This is done under Configuration > [[ Appliance Name]] > Windows Domain Membership > Join

 configuration.png

1. Windows Domain Name: The AD domain (NetBIOS name) to which Web Gateway should be joined. If you have trouble determining the correct NetBIOS name, run nbtstat -n from a Windows command prompt; the name returned with type 'GROUP' is the domain that the computer is part of.

2. Gateway account name: This will be the name of the Web Gateway computer account that's added to Active Directory when it successfully joins the domain. After this account is created, it should not be modified, nor should it be created manually.

3. Overwrite existing account:  If checked, this will overwrite the existing Web Gateway computer name if it exists on Active Directory. Each Web Gateway will need a unique account (computer name) on Active Directory, so if a computer name has been used by another Web Gateway or computer, it will be overwritten. Keep in mind that if the same account is overwritten, the Web Gateway that was using it will no longer be part of the domain and will no longer be able to authenticate against it.

4. Use NTLMv2: It's recommended to use NTLMv2 if it's supported by your Active Directory environment. This option only enforces NTLMv2 for the Web Gateway while it is joining your AD domain.  It does not enforce NTLMv2 for client requests.

5. Timeout for requests: This is the amount of time that the Web Gateway will wait for a response from Active Directory before timing out. In case this timeout is reached, the domain controller in question will be flagged as down and we will fail over to the next one (if other DCs have been configured).

6. Reconnection Timeout: Time to wait before reconnecting to a domain after a failure.

7. Configured Domain Controllers: A comma-separated list of Active Directory Domain Controllers that the Web Gateway should use for this domain. Using the fully qualified domain name (FQDN) is suggested, since a forward DNS lookup (hostname to IP) is more likely to resolve properly than a reverse DNS lookup (IP to hostname) for the Active Directory servers. You can leave this field blank to force the Web Gateway to auto-discover your DCs. Auto discovery is not recommended, as it introduces more complex DNS requirements; hard-coding the DCs is recommended for most environments.

8. Number of Active DCs: The total number of active domain controllers the Web Gateway will use for authentications. The Web Gateway will distribute authentication requests between the active DCs. See the 'Understanding 'Active' Domain Controllers, failover, & authentication request distribution' section below for additional information.

9. Administrator account/password: The domain admin account and password used to create the computer account in AD. The account and password are not stored anywhere on the Gateway after they are used (just like joining your Windows PC to the domain).

 

Join the domain by clicking 'OK'.

 


MWG joined the domain successfully

After joining the domain, you'll want to see a consistently green status indicator in the GUI after selecting refresh, as seen below. If the status is red, there is an issue (see the Troubleshooting section further down).

 

 

Additionally, if the account creation was successful, the computer name should be visible within Active Directory.

 

The best method to test user credentials after joining the domain is to see what is returned in an authentication test. The settings for authentication can be found under Policy > Settings >  Engines > Authentication > select your configured NTLM engine (or create one) > select the arrow next to 'Authentication Test' and test with your domain credentials. Here's an example of a successful and failed test.

 

Good credentials

good credintials .png

 

Bad credentials

If nothing has failed so far and your authentication tests were successful, you are ready to start deploying authentication policy for your users. More about that here:

 

 

Understanding 'Active' Domain Controllers, failover, & authentication request distribution

 

How Web Gateway finds active DCs and handles failover

In this example, there are 4 configured Domain Controller IP addresses, which we’ll refer to as DC1, DC2, DC3, and DC4, and the ‘Number of active Domain Controllers’ is set to 2.  Default timeout values are used.

 

Note: Web Gateway tries to connect to up to 2 DCs. It does not connect to all 4 configured DCs simultaneously and pick the 2 that answer first. Rather, the DC list defines the order in which Web Gateway tries to connect to the servers. A DC is marked as offline for 3 minutes after a communication error (for example, Web Gateway is not able to connect, or the connection to the DC was aborted by the 15 second timeout).

 

      • Web Gateway tries to connect to DC1 and DC2.  Connection to DC1 failed and DC1 is marked offline for 3 minutes.  Connection to DC2 is successful and DC2 is marked active.
      • Web Gateway looks for a second active DC and tries to connect to DC3 (next in the list).  Connection to DC3 is successful and DC3 is marked active.  Both DC2 and DC3 are active.
      • DC2 is no longer reachable and is marked offline for 3 minutes.  DC3 is still active.
      • Web Gateway looks for a second active DC and tries to connect to DC4 (next in the list).  Connection to DC4 failed and DC4 is marked offline for 3 minutes. No additional DC can be contacted right now (DC1, DC2, DC4 are all still within the 3 minutes offline status).
      • DC1 status changes to standby status (3 minutes offline status expired).
      • Web Gateway tries to connect to DC1.  Connection to DC1 is successful and DC1 is marked active.  Both DC3 and DC1 are active.
      • DC2 and DC4 status changes to standby status.  DC3 and DC1 remain the active servers until one or both go offline.

 

As described above, the ‘active’ domain controller(s) are sticky and DCs in standby status are not checked unless an active DC goes offline.   A restart forces Web Gateway to start over from the beginning to find active DCs.
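
The sequence above can be replayed with a small toy model. This is only a sketch of the behavior as the article describes it, not Web Gateway code: the class name `DCSelector`, its methods, and the timestamps (60, 180, 240 seconds) are assumptions made for illustration, and the model probes DCs one at a time in list order, which gives the same end result as the walkthrough.

```python
class DCSelector:
    """Toy model of the 'active DC' selection described above (not MWG code)."""

    def __init__(self, configured, num_active, probe, offline_seconds=180):
        self.configured = list(configured)      # list order defines the try order
        self.num_active = num_active            # 'Number of Active DCs'
        self.probe = probe                      # callable: dc -> True if connect works
        self.offline_seconds = offline_seconds  # 3 minute offline period
        self.active = []                        # sticky list of active DCs
        self.offline_until = {}                 # dc -> time the offline period ends

    def status(self, dc):
        if dc in self.active:
            return "active"
        if dc in self.offline_until:
            return "offline"
        return "standby"

    def mark_failed(self, dc, now):
        """An active DC stopped answering: take it offline, look for a replacement."""
        if dc in self.active:
            self.active.remove(dc)
        self.offline_until[dc] = now + self.offline_seconds
        self.tick(now)

    def tick(self, now):
        """Expire offline periods, then fill free active slots in list order."""
        for dc, until in list(self.offline_until.items()):
            if now >= until:
                del self.offline_until[dc]      # DC is back in standby
        for dc in self.configured:
            if len(self.active) >= self.num_active:
                break                           # enough active DCs: standby DCs are not probed
            if self.status(dc) != "standby":
                continue
            if self.probe(dc):
                self.active.append(dc)
            else:
                self.offline_until[dc] = now + self.offline_seconds


# Replay of the example: 4 configured DCs, 2 active, times in seconds.
reachable = {"DC2", "DC3"}
sel = DCSelector(["DC1", "DC2", "DC3", "DC4"], 2, probe=lambda dc: dc in reachable)
sel.tick(0)                  # DC1 fails (offline); DC2 and DC3 become active
reachable.discard("DC2")
sel.mark_failed("DC2", 60)   # DC2 offline; DC4 is tried, fails, goes offline
reachable.add("DC1")
sel.tick(180)                # DC1 offline period expired; DC1 becomes active
sel.tick(240)                # DC2 and DC4 return to standby and are not probed
print(sel.active)            # ['DC3', 'DC1']
```

The `break` in `tick` is what makes the active DCs sticky in this model: once enough DCs are active, standby DCs are never probed, even if they appear earlier in the configured list.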

 

Authentication request distribution

Authentication requests are distributed across the active DCs where the fastest DC (first available of the active DCs) handles the next request.

 

What if the number of DCs in active status is lower than the specified number of active DCs?

In an example with 3 configured Domain Controllers and 2 active, if 2 DCs are offline and only 1 remains active, Web Gateway will attempt to reconnect to the offline DCs once they return to standby status in an effort to find a 2nd active DC. In the case where all DCs are offline, all requests fail immediately until DCs return to standby status and Web Gateway is able to find an active DC.

 

 

Troubleshooting

Here are a few troubleshooting examples where the MWG did not join the domain successfully or it has issues communicating with the DCs.

 

Note that there are only two main troubleshooting tools:

 

1. The Web Gateway authentication debug log

2. A network capture/tcpdump taken on the Web Gateway (this will give you the most comprehensive troubleshooting data)

 

Authentication Debug Log

You can find the authentication debug log under Configuration > [[ Appliance Name]] >  Troubleshooting > Authentication Troubleshooting

 

The log files written can be found under Troubleshooting > Log files > Debug > mwg-core_Auth.debug.log

 

There are two main options for the authentication debug log:

 

1) Log management events

We recommend keeping this option permanently enabled. It logs all events that have to do with your AD connection, such as joining or leaving the domain or failing over from one DC to another. Very little log data is written, which allows you to leave this option enabled at all times.

 

2) Log authentication events

We recommend enabling this option only for specific troubleshooting, limiting it to a specific IP, and disabling it again as soon as possible after replicating an issue. This option logs all events related to actual user authentications. The log grows fast when it is enabled, because not only every authentication request from a client but also group memberships and so on are logged. It is most useful if you have specific clients that constantly get prompted for credentials or cannot log in at all. Enable the authentication event option and specify the IP address of the client that will replicate the problem (for example, open the browser and get a prompt). Disable the option again right afterwards so the log does not grow to a point where it becomes a problem.

 

 

 

Tcpdump

You can take a packet capture (tcpdump) from the Web Gateway UI or from the command line (recommended option) as 'root':

 

Command Line: (ssh or console access)

cd /opt

tcpdump -i any -s0 -w ntlmcapture.cap port 445 or port 53

 

Reproduce the problem and let the capture run for at least 3 minutes (this is the default time after which MWG attempts to reconnect to a DC).

Stop the capture (Ctrl+C).

 

The file will be present in the directory (/opt) in which you ran the command.

 

MWG UI: (Troubleshooting > Packet Tracing)

Add these command line parameters:

-s0 -i any port 445 or port 53

 

Start Capture.

 

Reproduce the problem and let the capture run for at least 3 minutes (this is the default time after which MWG attempts to reconnect to a DC).

Stop Capture.

 

You can view the created traces on your desktop with the free tool Wireshark.

 

Below are a few examples of what you might see:

 

No IP address (Forward DNS failed)

In this example, I tried to join to the Active Directory server by providing the FQDN bob.jimc.local in the MWG UI (see field 7, 'Configured Domain Controllers', above), but there is no DNS record for this name, so DNS returns 'No such name.'

Joining the domain will fail immediately.

 no IP.jpg

No or incorrect hostname (reverse DNS failed)

In this example, I tried to join via the IP 10.10.95.12, which has no reverse record in DNS (or an incorrect hostname is returned). The MWG can establish the TCP connection to the DC because it has the IP address provided in the UI, but once the TCP connection is established and the protocol switches to SMB, the connection fails because the correct hostname is required.

 

A similar situation applies when the domain controllers are being load balanced via a virtual hostname. For example, if you provide the FQDN of DCpool.company.com (virtual name for load balanced DCs) and it resolves to the IP of one of your DCs (for example dc1.company.com), your connection will fail because as soon as the protocol switches to SMB, the hostname provided is DCpool.company.com and not the expected/correct hostname of dc1.company.com. Do NOT use virtual hostnames for your DCs. Use the real hostnames and let the Web Gateway do the load balancing for you.
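
One way to spot a virtual or load-balanced name before configuring it is a forward lookup followed by a reverse lookup, comparing the result with the name you intended to enter. The Python sketch below illustrates the idea; the function names are hypothetical, and it assumes that your reverse DNS zone returns the real FQDN of each DC as its primary name. Names in the usage note are the examples from this article.

```python
import socket


def canonical_dc_name(configured_name, forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Forward-resolve the configured name, then reverse-resolve the IP.

    Returns (ip, primary hostname reported by reverse DNS).
    The resolver arguments are injectable so the logic can be tried offline.
    """
    ip = forward(configured_name)   # hostname -> IP
    real_name = reverse(ip)[0]      # IP -> primary hostname
    return ip, real_name


def looks_like_virtual_name(configured_name, **resolvers):
    """Return True if the round trip does not come back to the configured name."""
    _, real_name = canonical_dc_name(configured_name, **resolvers)
    # DNS names are case-insensitive; ignore a trailing root dot as well.
    return real_name.rstrip(".").lower() != configured_name.rstrip(".").lower()
```

If `looks_like_virtual_name("DCpool.company.com")` returns `True` because the reverse lookup comes back as dc1.company.com, configure dc1.company.com (and the other real DC names) instead of the pool name.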

yellow.png

 

Bad admin credentials

In this example, the credentials for the administrator used to join the domain were not valid.

join_attempt_bad_password.jpg

 

 

Computer account deleted or disabled

In this example, the computer account for the Web Gateway was deleted in AD, but the same error could also be thrown if the account is disabled/modified. Also note that the error message is the same as when the incorrect administrator credentials were used while trying to initially join the domain.

ad_account_deleted.jpg

 

 

'Logon To' Account Permissions in Active Directory

When you join the Web Gateway to the domain, a computer account is created within Active Directory.  When Web Gateway talks to the Domain controller to authenticate users, it uses this computer account.

 

Some users in Active Directory may have restrictions on which workstations they are able to log on to. If a user is only allowed to log on to specific workstations, you will need to make sure the Web Gateway computer account is also added as an allowed workstation. Failure to do so will cause authentication to fail, and the user will be prompted to authenticate.

 

In this scenario, the Web Gateway is joined to the domain with computer account 'WebGateway'.  The user 'user1' is only able to logon to workstation 'Desktop1'.

 

Web Gateway Domain Membership.png  AD Logon Workstations.png

 

 

The example below shows the Web Gateway trying to authenticate 'user1' using the computer account 'WebGateway'.  The domain controller responds with an error message indicating that authentication failed.  The error the Domain Controller sends is STATUS_INVALID_WORKSTATION as seen in the screenshot below.

 

WebGateway Computer Name TCP Dump.png

 

 

For this to work properly, add the Web Gateway's computer account to the user's allowed workstations, or allow the user to log on to all workstations.

 

Web Gateway in Logon To Section.png

 

 

Alerting

If you would like to get notifications when issues arise with your domain membership, you can use some of the dashboard alerts the MWG produces. Please see the following article on incident alerting:

dash_error.jpg

 

 

Last resorts

 

Hosts file entry

If DNS issues cannot be overcome (temporarily or permanently), an entry in the hosts file of each Web Gateway will likely be required. Make this change in the GUI as seen below (do not edit /etc/hosts on the command line).

 host file entry.png

Rolling captures for intermittent issues

Log into the Web Gateway as the 'root' user with a tool like PuTTY. Change to /var (cd /var) and verify with 'df -k' that you have enough free space to store the captures. With the syntax provided below, you will need 2GB of free space on /var. You can change that, but keep in mind that if you reduce the number of stored captures too much, the capture containing the problem may be deleted before you stop the rolling capture.

 

nohup tcpdump -Z root -s 0 -i any port 445 or port 53 -C 100 -W 20 -w capturefilename.pcap & <press enter twice>

 

-C is the maximum size of each capture file in MB; when it is reached, a new file is started

 

-W is the number of capture files kept before the oldest is deleted to make room for a new one

 

-port 445 is for Active Directory and port 53 is for DNS

 

-the other parameters should remain unchanged

 

To stop the capture, run 'ps aux | grep tcpdump' and get the process ID for the rolling capture, then run 'kill -9 processID' to stop the rolling capture. The completed captures will be in /var/empty/
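
The 2GB figure follows from the -C and -W values: 20 files of 100 MB each. The Python sketch below works through that arithmetic and compares it with the free space on a path. The function names are hypothetical, and the sketch relies on the tcpdump man page's statement that -C counts in units of 1,000,000 bytes.

```python
import shutil


def rolling_capture_bytes(file_size_mb, file_count):
    """Worst-case disk usage of a rolling capture (-C file_size_mb -W file_count).

    tcpdump's -C unit is 1,000,000 bytes, not 1,048,576.
    """
    return file_size_mb * 1_000_000 * file_count


def has_room(path, file_size_mb, file_count):
    """Return True if the filesystem holding path has enough free space."""
    needed = rolling_capture_bytes(file_size_mb, file_count)
    return shutil.disk_usage(path).free >= needed


# The values from the command above: -C 100 -W 20
print(rolling_capture_bytes(100, 20))   # 2000000000 bytes, i.e. 2 GB
```

Calling `has_room("/var", 100, 20)` before starting the capture gives the same answer as reading the 'df -k' output by hand.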

 

 

Takeaways

    • Always hardcode the 'Configured Domain Controllers' field with the address of your Domain Controllers. Do NOT use a Virtual Hostname.
    • MWG needs both the IP and FQDN of the configured Domain Controller. You'll specify one in the field provided; the other needs to be resolved by DNS.
    • Remember to enable the 'Log Management Events' debugging option.
Comments

Great article with details and packet dumps as it should be! Great job, thanks!

When initially setting up the domain membership on the MWG, a domain administrator account is needed so a computer account can be created in AD for the MWG. Keep in mind that the domain administrator account is only used for the MWG account creation on the domain and those credentials are not stored on the MWG.

Added Sections "Understanding 'Active' Domain Controllers, failover, & authentication request distribution" and "'Logon To' Account Permissions in Active Directory".

Thanks Guys. Very Good Modified.

We are in a forest with a Master domain and multiple Child domains.  Do you add MWG 7.x to one child domain and it will auth users from any domains or is it different from the behavior of WebGateway 6.x?

Hi DBO,

Same behavior as in 6.x, 7.x will utilize the trust between domains to authenticate other domains' users. However I would recommend joining MWG directly to the remote domains such that you eliminate the middle man.

Best Regards,

Jon

Thank you, I was reading/scanning the documents and manuals but wasn't able to find an answer to that...  As you may have guessed, 6.9.x is still our active proxy...  Time to move on!!!

Hi Guys,

perfect summary. :-)

Cheers

Hi

I run into a problem.

We have 2008 r2 AD server.

WebGateway is registering in AD. Computer account is created.

Just after adding to AD status is green. But after refresh it becomes red.

In log I have laconic: [3851] NTLM: Failed to connect to DC @@@ in domain ###

name (@@@) of DC is resolved by ping.

domain name (###) is in NETBIOS format (without dot)

I run wireshark on DC and in end I got:

RPC_NETLOGON 262 NetrServerReqChallenge request, webgateway1

RPC_NETLOGON 162 NetrServerReqChallenge response, STATUS_INVALID_COMPUTER_NAME

name is exactly the same as is in AD.

I tried a lot of different options and hit a wall. Any more ideas?

The error "STATUS_INVALID_COMPUTER_NAME" is discussed in this same article in this troubleshooting section. Did you verify this yet?

I have a simple question. Based on the article that domain account and password used to integrate AD to MWG is not stored. How will we be able to know what domain account is used then? This is to avoid resetting the domain account's password used for the integration. As far as I know, password mismatch will result to loss of communication to AD, right?

Hi,

you do not need the account any more. The account is only used to join MWG to the Domain. After a successful Domain join, a computer object (account) is generated in your Active Directory for your MWG(s). The original account for Domain Join is not used any more.

Btw, MWG can be a member of several Active Directory Domains at the same time.

Tip: always use the FQDN for your domain controllers and not an IP.

Hope this helps,

Cheers

Hello,

interesting article with useful tips and tricks.

Any way to stop the AD connection to be sticky ?

Allow me to explain a bit.

We have a bunch of WGs deployed across many sites. In each site only one Domain Controller is available locally. All webgateways are domain-integrated.

To prevent authentication 'outages' on the proxy when the local DC goes down (basically once a month due to Windows patches), we set up 2+ DCs in the Active Directory configuration, but only set 'Active DC' to 1.

One DC is local (=wire speed), the other one(s) are remote, so in fact suffer from latency.

Latency is important when talking about authentication, because the higher amount of ms the authentication takes, the longer the client (user) has to wait for pages/resources to be loaded.

We have noticed that 'Request Processing time' Authentication Statistics is usually 3 or 3.5 times the network latency (ping).

For example if you have a latency of 120ms between a Webgateway and a DC, you end up with a 360-420 ms latency, which is clearly noticeable by end users (slowness).

In our case, when the "local" DC goes down, the webgateway properly fails over to the 2nd 'remote' DC. Latency will increase but this is acceptable in the case of a temporary issue / maintenance.

However, the WG connection to the DC is sticky, which means that the proxy will not go back to the 1st DC in the list. The proxy therefore keeps using a "laggy" DC and the service delivered to end users becomes poor.

Of course we have the possibility to restart mwg services, or open AD configuration, edit Domain, click OK. This however requires manual intervention which we do not like

Would this sticky behavior be something that could be configurable from an admin point of view? I understand that this behavior might be good for most customers but in our case, as I explained, this brings more issues.

Thanks for reading

Cheers

Jeremy.

Version history
Revision #: 3 of 3
Last update: 03-20-2018 11:24 AM