Web Gateway: Understanding URL related Properties

Introduction

I wrote this guide to explain how properties are used within rules, and how to write list entries that work with those rules. A question I hear often is: "I added [INSERT-SITE-HERE].com to [INSERT-LIST-HERE], but the site is still blocked. Why isn't the whitelist entry working as expected?" Understanding the rule criteria is essential to managing the Web Gateway's rules and how they apply. This article walks through some very common examples and explains the use cases of certain properties. It covers URL-based properties only.

 

 

Best Practices

If you read only one part of this document, please read this section. Afterwards, you can use the "Good/Bad Examples" sections for further detail and reference. The examples below outline use cases for the most commonly used URL-related properties.

 

URL.SmartMatch Purpose

The URL.SmartMatch property was created to allow for greater flexibility and usability. It has relaxed syntax requirements, whereas other properties, such as URL, URL.Host, and URL.Domain, have specific syntax requirements and need one list per property. With URL.SmartMatch, those lists can be combined into a single list. The property accepts the list as "input" and returns TRUE if the given URL, URL host, or URL path variation is found in the list. In our experience, customers often have to deal with multiple lists whose entries are in differing formats. URL.SmartMatch accommodates these variations and so reduces the need to manage multiple lists.

 

URL.SmartMatch

The URL.SmartMatch property was introduced in version 7.4.1. Like the URL.Host.BelongsToDomains property, it was designed to simplify the whitelisting process.

 

SmartMatch list entries can be entered in the form of a host, a domain, a URL, or a fragment of the URL. Example entries (wildcards "*" assumed on both sides of the entry):

    • host.domain.tld
      • Is equivalent to URL.Host matches *.host.domain.tld or host.domain.tld
    • domain.tld
      • Is equivalent to URL.Host matches *.domain.tld or domain.tld
    • http://domain.tld
    • domain.tld/path
      • Is equivalent to URL matches *.domain.tld/path* or domain.tld/path*
    • /path
      • Is equivalent to URL matches */path*
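The equivalences above can be sketched in Python. This is only a rough model of the documented host, domain/path, and /path forms, not the appliance's actual implementation. The function names are made up for illustration, and scheme or port entries (such as http://domain.tld or domain.tld:80) are deliberately not modeled.

```python
from urllib.parse import urlsplit


def host_belongs(host, domain):
    # "domain.tld" is documented as equivalent to the host being
    # domain.tld itself or ending in .domain.tld
    return host == domain or host.endswith("." + domain)


def smartmatch_sketch(url, entry):
    """Hypothetical approximation of one SmartMatch list entry."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"

    # "/path" form: equivalent to URL matches */path*, on any host
    if entry.startswith("/"):
        return path.startswith(entry)

    # "domain.tld" or "domain.tld/path" form
    entry_host, slash, entry_path = entry.partition("/")
    if not host_belongs(host, entry_host):
        return False
    return path.startswith("/" + entry_path) if slash else True


example = "http://www.mcafee.com/us/products/web-gateway.aspx"
print(smartmatch_sketch(example, "mcafee.com"))
print(smartmatch_sketch(example, "mcafee.com/us/products/"))
print(smartmatch_sketch(example, "/us/products/"))
```

All three calls return True for the example URL. Note that the bare /path form carries no host check at all, so in this model it matches the same path on any host.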

 

8.0.0_smartmatch.png

 

Good

Entries in Good: URL SmartMatch List

 

Entry: mcafee.com

Why it's good: Using this entry, it would correctly match for all mcafee.com subdomains, including mcafee.com, www.mcafee.com, secure.mcafee.com, etc...

 

Entry: mcafee.com/us/products/

Why it's good: Using this entry would allow content from the 'mcafee.com' domain, which includes the path of '/us/products/'.

 

Entry: http://mcafee.com

Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains.

 

Entry: http://www.mcafee.com/

Why it's good: Using this entry will allow only HTTP access to 'www.mcafee.com'.

 

Entry: http://mcafee.com/us/products/

Why it's good: Using this entry would allow content from the 'mcafee.com' domain, which includes the path of '/us/products/'.

 

Entry: mcafee.com:80

Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains on port 80.

 

Entry: mcafee.com:80/

Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains on port 80.

 

Entry: http://mcafee.com:80/

Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains on port 80 if it's HTTP.

 

Entry: mcafee.com.

Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains.

 

8.0.1_smartmatch.png

 

Bad

Entries in Bad: URL SmartMatch List

 

Entry: /us/products/

Why it's bad: Using this entry could potentially match on other hosts which contain the path '/us/products/', example: http://maliciousdomain.mwginternal.com/us/products/.

 

Entry: http://www.mcafee.com:8080/

Why it's bad: This entry would not match because the request is for port 80, but the entry specifies port 8080.

 

Entry: http://download.mcafee.com/

Why it's bad: Subdomain is 'www' not 'download'.

 

Entry: *.mcafee.com

Why it's bad: Wildcards are not used in URL.SmartMatch entries.

 

Entry: *.mcafee.com/*

Why it's bad: Wildcards are not used in URL.SmartMatch entries.

 

Entry: .mcafee.com

Why it's bad: Leading period causes entry to not match.

 

8.0.2_smartmatch.png

 

 

 

 

Example URL Breakdown

 

Example URL

http://www.mcafee.com/us/products/web-gateway.aspx

 

The following shows examples of how the Example URL above could be whitelisted when using various properties.  Please notice the syntax flexibility of the URL.SmartMatch property.

 

URL.SmartMatch (example entries)

mcafee.com

http://mcafee.com

http://www.mcafee.com/

http://mcafee.com/us/products/

/us/products/

mcafee.com/us/products/

 

URL

http://www.mcafee.com/us/products/web-gateway.aspx

 

URL.Host

www.mcafee.com

 

URL.Domain (7.4+)

mcafee.com

 

URL.Host.BelongsToDomains (example entry)

mcafee.com

 

URL.Protocol

http

 

URL.Path

/us/products/web-gateway.aspx
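As a rough illustration, the same breakdown can be reproduced with Python's standard urllib.parse module. The dictionary keys simply reuse the property names from this guide. Taking the last two host labels as URL.Domain is a simplifying assumption that works for mcafee.com but not for suffixes such as co.uk.

```python
from urllib.parse import urlsplit


def breakdown(url):
    """Split a URL into values resembling the properties discussed here."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "URL": url,
        "URL.Protocol": parts.scheme,
        "URL.Host": host,
        # Assumption: the domain is the last two labels of the host.
        "URL.Domain": ".".join(host.split(".")[-2:]),
        "URL.Path": parts.path,
    }


for name, value in breakdown(
    "http://www.mcafee.com/us/products/web-gateway.aspx"
).items():
    print(f"{name}: {value}")
```

Seeing the parts side by side makes it easier to decide which property a given list entry should be compared against.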

 

 

Operator importance

 

is in list

Use of "is in list" implies an exact string match. Wildcard characters will be interpreted as literal characters.

 

matches in list

Use of "matches in list" allows for wildcard matches. Wildcard characters are accepted but not required: an entry without wildcards still matches its exact string.
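The difference between the two operators can be modeled in a few lines of Python. This is a sketch only: the helper names are hypothetical, and fnmatchcase stands in for the appliance's wildcard matching (it also treats ? and [...] as special, which may not mirror Web Gateway exactly). Entries of the form regex(...) are not modeled here.

```python
from fnmatch import fnmatchcase


def is_in_list(value, entries):
    # Exact string comparison: "*" is just another character.
    return value in entries


def matches_in_list(value, entries):
    # Wildcard comparison: "*" stands for any run of characters.
    return any(fnmatchcase(value, entry) for entry in entries)


entries = ["*.mcafee.com"]
print(is_in_list("www.mcafee.com", entries))       # False: no literal match
print(matches_in_list("www.mcafee.com", entries))  # True: wildcard expands
```

An entry with no wildcard, such as www.mcafee.com, behaves the same under both helpers.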

 

 

Good/Bad Examples by Property

The examples below are grouped by the property used in the rule, along with the corresponding operator.

 

URL using "is in list"

Using the property "URL" implies that you will create list entries which take into account the full URL. Using the operator "is in list" implies an exact string match.

 

2.0.0_url_isinlist.png

Good

Entries in "Good: URL String List"

 

Entry: http://www.mcafee.com/us/products/web-gateway.aspx

Why it's good: The full URL is used, which the "is in list" operator requires.

 

2.0.1_url_isinlist.png

 

Bad

Entries in "Bad: URL String List"

 

Entry: www.mcafee.com/us/products/web-gateway.aspx

Why it's bad: The entry doesn't include the protocol information (http://). The URL property evaluates the full URL, and the operator "is in list" implies an exact string match.

 

2.0.2_url_isinlist.png

 

 

URL using "matches in list"

Using the property "URL" implies that you will create list entries which take into account the full URL. Using the operator "matches in list" allows for wildcard matches.

 

2.1.0_url_matchesinlist.png

 

Good

Entries in "Good: URL Wildcard List"

 

Entry: http://www.mcafee.com/*

Why it's good: This entry contains a trailing wildcard which will allow any HTTP request to www.mcafee.com. However, it will not match on requests for http://mcafee.com/.

 

Entry: regex(^htt(p|ps):\/\/([\w.-]*\.|\.?)mcafee\.com(\/.*|\/?))

Why it's good: This entry is a bit more complex as it uses regular expressions. This entry will allow any request, HTTP or HTTPS, to mcafee.com and its subdomains.

 

Entry: regex(^htt(p|ps):\/\/([\w.-]*\.|\.?)mcafee\.(com|co\.uk)(\/.*|\/?))

Why it's good: This entry is the same as the previous entry but demonstrates how you can allow other top level domains, such as '.com' or '.co.uk'.

 

MOVED TO BAD (thanks to the commenter who pointed out the error)

Entry: regex(htt(p|ps)://(.*\.|\.?)mcafee.com(\/.*|\/?))

Why it was moved: This entry was originally listed as good. It does allow any request, HTTP or HTTPS, to mcafee.com and its subdomains, but it can also match unintended URLs, as explained under "Bad" below.

 

MOVED TO BAD (thanks to the commenter who pointed out the error)

Entry: regex(htt(p|ps)://(.*\.|\.?)mcafee.(com|co.uk)(\/.*|\/?))

Why it was moved: This entry is the same as the previous one but with additional top level domains ('.com' or '.co.uk'), and it has the same flaw.

 

2.1.1_url_matchesinlist.png

 

Bad

Entries in "Bad: URL Wildcard List"

 

Entry: *.mcafee.com*

Why it's bad: This entry could match on another string within the URL, for example: http://malicious-download-site.mwginternal.com/malicious-file.exe?url=www.mcafee.com

 

Entry: regex(htt(p|ps)://(.*\.|\.?)mcafee.com(\/.*|\/?))

Why it's bad: This entry is a bit more complex as it uses regular expressions. This entry will allow any request, HTTP or HTTPS, to mcafee.com and its subdomains. However, the entry could match on another string within the URL, for example: http://malicious-download-site.mwginternal.com/malicious-file.exe?url=www.mcafee.com

 

Entry: regex(htt(p|ps)://(.*\.|\.?)mcafee.(com|co.uk)(\/.*|\/?))

Why it's bad: The entry could match on another string within the URL, for example: http://malicious-download-site.mwginternal.com/malicious-file.exe?url=www.mcafee.com
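The good and bad URL entries above can be compared in Python. This is a sketch under stated assumptions: Python's re and fnmatch modules stand in for the appliance's matching engine, and re.fullmatch assumes an entry must match the whole URL. For these particular inputs, re.search gives the same four regex results.

```python
import re
from fnmatch import fnmatchcase

# Regex bodies copied from the Good and Bad entries (without the regex() wrapper).
GOOD = re.compile(r"^htt(p|ps):\/\/([\w.-]*\.|\.?)mcafee\.com(\/.*|\/?)")
BAD = re.compile(r"htt(p|ps)://(.*\.|\.?)mcafee.com(\/.*|\/?)")

LEGIT = "http://www.mcafee.com/us/products/web-gateway.aspx"
EVIL = ("http://malicious-download-site.mwginternal.com/"
        "malicious-file.exe?url=www.mcafee.com")

for name, pattern in (("good", GOOD), ("bad", BAD)):
    print(name,
          pattern.fullmatch(LEGIT) is not None,
          pattern.fullmatch(EVIL) is not None)

# Wildcard entries: a trailing * after the host is safe, while a leading
# and trailing * around the host lets the string appear anywhere in the URL.
print(fnmatchcase(LEGIT, "http://www.mcafee.com/*"))
print(fnmatchcase("http://mcafee.com/", "http://www.mcafee.com/*"))
print(fnmatchcase(EVIL, "*.mcafee.com*"))
```

The key difference is the character class: [\w.-]* cannot cross a "/", so the good regex keeps the mcafee.com part inside the host, whereas .* in the bad regex can run through the path and query string.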

 

2.1.2_url_matchesinlist.png

 

 

URL.Host using "is in list"

Using the property "URL.Host" implies that you will create list entries which take into account only the domain portion of the URL. Using the operator "is in list" implies an exact string match.

 

3.0.0_urlhost_isinlist.png

 

Good

Entry in "Good: URL.Host String List"

 

Entry: www.mcafee.com

Why it's good: The host of the requested URL is 'www.mcafee.com', which is an exact string match.

 

3.0.1_urlhost_isinlist.png

 

Bad

Entries in "Bad: URL.Host String List"

 

Entry: mcafee.com

Why it's bad: The entry value (mcafee.com) is incorrect; the actual property value is 'www.mcafee.com'.

 

Entry: *.mcafee.com

Why it's bad: The operator is "is in list", which implies an exact string match; wildcards will not match.

 

Entry: *.mcafee.com/us*

Why it's bad: The URL.Host property is limited to the domain portion of the URL and does not include the path (/us). In addition, the operator "is in list" implies an exact string match, so wildcards will not match.

 

3.0.2_urlhost_isinlist.png

 

URL.Host using "matches in list"

Using the property "URL.Host" implies that you will create list entries which take into account only the domain portion of the URL. Using the operator "matches in list" allows for wildcard matches.

 

3.1.0_urlhost_matchesinlist.png

 

Good

Entries in "Good: URL.Host Wildcard List"

 

Entry: mcafee.com

Why it's good: This entry will not match 'www.mcafee.com', but if you intend to allow access to mcafee.com (no www), you will need it unless you use regular expressions.

 

Entry: *.mcafee.com

Why it's good: This entry will match on any subdomain of mcafee.com (but not actually mcafee.com itself).

 

Entry: regex((.*\.|\.?)mcafee\.com)

Old (bad) entry: regex((.*\.|\.?)mcafee.com), in which the unescaped dot before "com" matches any character (thanks to the commenter who pointed out the error)

Why it's good: This single entry uses regular expressions and will allow both mcafee.com and any subdomains of mcafee.com.

 

3.1.1_urlhost_matchesinlist.png

 

Bad

Entries in "Bad: URL.Host Wildcard List"

 

Entry: *.mcafee.com*

Why it's bad: The trailing wildcard lets this entry match other hosts that merely begin with the string, for example: http://www.mcafee.com.malicious-download-site.mwginternal.com/

 

Entry: *.mcafee.com/us*

Why it's bad: The URL.Host property contains only the domain portion of the URL, not the path (/us), so this entry will never match.
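The URL.Host entries above can be tried against a few host names in Python. This is a sketch only: fnmatchcase approximates the wildcard matching, and re.fullmatch assumes a regex entry must match the entire host value, which is an assumption rather than documented behavior.

```python
import re
from fnmatch import fnmatchcase

# Regex bodies copied from the entries above (without the regex() wrapper).
ESCAPED = re.compile(r"(.*\.|\.?)mcafee\.com")
UNESCAPED = re.compile(r"(.*\.|\.?)mcafee.com")

hosts = [
    "mcafee.com",
    "www.mcafee.com",
    "mcafeexcom",
    "www.mcafee.com.malicious-download-site.mwginternal.com",
]

for host in hosts:
    print(
        host,
        fnmatchcase(host, "*.mcafee.com"),
        fnmatchcase(host, "*.mcafee.com*"),
        ESCAPED.fullmatch(host) is not None,
        UNESCAPED.fullmatch(host) is not None,
    )
```

In this model the escaped regex accepts only mcafee.com and its subdomains. The unescaped variant also accepts mcafeexcom, and the *.mcafee.com* wildcard accepts the look-alike host.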

 

3.1.2_urlhost_matchesinlist.png

 

URL.Domain vs. URL.Host.BelongsToDomains

The URL.Domain property was introduced in 7.4. It was designed to be more consistent with other URL related properties (URL.Host, URL, etc.). It acts nearly identically to URL.Host.BelongsToDomains but does not require a list as a setting. Instead, the list can be the operand.

 

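URL.Domain is a string property which contains the top level domain of the requested URL (e.g. "mcafee.com"). In this guide, "top level domain" means the domain without any subdomain, not the suffix such as "com".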

7.0.0_urldomain_isintlist.png

 

URL.Host.BelongsToDomains<ListName> is a boolean property which returns true if the URL's top level domain is in the list specified on the rule (ListName). If the domain of the URL is not in the list, the property returns false.

7.1.0_urlhost_belongs.png

 

URL.Domain using "is in list"

Using the property "URL.Domain" implies that you will create list entries which take into account just the top level domain of the URL. Using the operator "is in list" implies an exact string match.

 

6.0.0_urldomain_isintlist.png

 

Good

Entries in "Good: URL.Domain String List"

 

Entry: mcafee.com

Why it's good: URL.Domain will simply equal "mcafee.com".

 

6.0.1_urldomain_isintlist.png

 

 

Bad

Entries in "Bad: URL.Domain String List"

 

Entry: www.mcafee.com

Why it's bad: URL.Domain is "mcafee.com", not "www.mcafee.com". Use URL.Host instead.

 

Entry: *.mcafee.com

Why it's bad: URL.Domain equals "mcafee.com", so "*." would prevent matching. "is in list" implies a string, not a wildcard.

 

6.0.2_urldomain_isintlist.png

 

 

URL.Domain using "matches in list"

Using the property "URL.Domain" implies that you will create list entries which take into account just the top level domain of the URL. Using the operator "matches in list" allows for wildcard matches.

 

6.1.0_urldomain_matchesintlist.png

 

Good

Entries in "Good: URL.Domain Wildcard List"

 

Entry: regex(mcafee\.(com|co\.uk))

Old entry: regex(mcafee.(com|co.uk)), in which the unescaped dots match any character (thanks to the commenter who pointed out the error)

Why it's good: URL.Domain equals "mcafee.com" so it will match. "mcafee.co.uk" will also match.

 

6.1.1_urldomain_matchesintlist.png

 

 

Bad

Entries in "Bad: URL.Domain Wildcard List"

 

Entry: *.mcafee.com

Why it's bad: URL.Domain of "mcafee.com" will not match due to the "*.".

 

Entry: *mcafee.com

Why it's bad: It will match on "mcafee.com", BUT it could match on "maliciousdomainmcafee.com" too.
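The URL.Domain entries can be checked the same way in Python. As before, this is a sketch: fnmatchcase approximates wildcard matching, and the helper name and the use of re.fullmatch (a regex entry must match the whole property value) are assumptions.

```python
import re
from fnmatch import fnmatchcase

# Regex body copied from the Good entry (without the regex() wrapper).
DOMAIN_REGEX = re.compile(r"mcafee\.(com|co\.uk)")


def regex_entry_matches(domain):
    # Assumption: the regex has to match the entire URL.Domain value.
    return DOMAIN_REGEX.fullmatch(domain) is not None


for domain in ("mcafee.com", "mcafee.co.uk", "maliciousdomainmcafee.com"):
    print(
        domain,
        regex_entry_matches(domain),
        fnmatchcase(domain, "*.mcafee.com"),
        fnmatchcase(domain, "*mcafee.com"),
    )
```

If a regex engine instead searched for the pattern anywhere in the value, the unanchored regex would also hit maliciousdomainmcafee.com. Adding ^ and $ around the expression would remove that doubt.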

 

6.1.2_urldomain_matchesintlist.png

 

 

URL.Host.BelongsToDomains

The URL.Host.BelongsToDomains property was introduced in 7.2. It was designed to simplify the complexity of adding list entries. Using the property "URL.Host.BelongsToDomains" allows you to simply enter the domain of interest.

 

So if you wish to whitelist all mcafee.com sites (including subdomains), you can simply enter mcafee.com. There is no need to worry about wildcards.

 

4.0.0_urlhost_belongs.png

Good

Entries in "Good: Only Domain List"

 

Entry: mcafee.com

Why it's good: Using this entry, it would correctly match for all mcafee.com subdomains, including mcafee.com, www.mcafee.com, secure.mcafee.com, etc...

 

Entry: www.mcafee.com

Why it's good: Using this entry, it would correctly match only for www.mcafee.com subdomains. It would not allow other subdomains of the top domain 'mcafee.com'. This is useful in case you wanted to allow a subdomain, but not the entire domain.

 

4.0.1_urlhost_belongs.png

 

Bad

Entries in "Bad: Only Domain List"

 

Entry: *.mcafee.com

Why it's bad: URL.Host.BelongsToDomains does not need wildcards. The property requires an exact domain such as 'www.mcafee.com' or the top domain 'mcafee.com'.
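The behavior described for URL.Host.BelongsToDomains can be approximated with a short Python function. The function name and the exact comparison are assumptions made for illustration, not the product's implementation.

```python
def host_belongs_to_domains(host, domains):
    """Return True if host equals a listed domain or is a subdomain of one."""
    host = host.lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in domains
    )


print(host_belongs_to_domains("secure.mcafee.com", ["mcafee.com"]))        # True
print(host_belongs_to_domains("download.mcafee.com", ["www.mcafee.com"]))  # False
print(host_belongs_to_domains("www.mcafee.com", ["*.mcafee.com"]))         # False
```

Requiring a "." before the listed domain is what keeps a host such as maliciousdomainmcafee.com from matching an entry of mcafee.com.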

 

4.0.2_urlhost_belongs.png

 

 

 

Test Ruleset

You can use the test ruleset in your own environment to see how it works! The test ruleset will work in versions 7.4.1+.

 

 

Conclusion

From the examples, it should be clear that the cleanest and easiest way to create domain based whitelist entries is through the use of the "URL.SmartMatch" property. I hope this helps clarify use cases for the various URL related properties. Perhaps it will help with understanding other properties as well.

 

 

Changelog

2014-10-28 - URL related regex entries were invalid. Updated examples to be http:// instead of hxxp://. Added information regarding URL.SmartMatch property.

Comments

Great article as usual, Jon. My only question: what is the best recommendation for differentiating between what is whitelisted externally (external URLs, hosts, IP addresses, etc.) and what is whitelisted internally, especially if you are using NTLM or similar to authenticate internal users? There must be a better, more structured way.

At the moment we bunch all of this into several rulesets:

1. Global Whitelist (for all sites allowed inside the company)

a. Global domain allow

b. Global URL.Host

c. Global IP allow

2. Global Bypass NTLM (for internal website authentication)

a. Global No Need to authenticate Users (just what it implies)

b. Global No need to authenticate servers (just as it implies)

Hey Carlos,

This guide is mainly geared at showing you how to use the properties and how to correctly create list entries. What you do with them in the rules is up to you.

One could easily create authentication exemptions using the properties outlined above.

For questions about authentication and organizing rules I would recommend creating a discussion thread or checking out my authentication guide: https://community.mcafee.com/docs/DOC-4384

In that guide I discuss common exception examples for authentication.

Best,

Jon

This article is really helpful.  It should be added to your standard documentation.

Funny you say that! It actually was!

Web Gateway 7.3.2 Product Guide - https://kc.mcafee.com/corporate/index?page=content&id=PD24502 page 245

regex((.*\.|\.?)mcafee.com) 

- is not correct. It should be:

regex((.*\.|\.?)mcafee\.com)

Hi ,

I just noticed this comment!

I will be updating this document to fix the error you mentioned as well as two more. I will also be adding details regarding the new URL.SmartMatch property (added in 7.4.1).

Thanks for pointing this out, sorry I missed your comment...

Best Regards,

Jon

I have created a rule:

client.IP is in list AND URL is in list, where the URL list contains http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5091393

But the user gets the page without the images; it looks like only half of the page has loaded. Do we need to allow the category for that? If we do so, the user will be able to access other URLs that are part of that category.

Even the URL http://www.mcafee.com/us/products/web-gateway.aspx in your example above opens the same way.

Hi Haaris,

That is a correct observation. Whitelisting a single URL will in most cases not work as you expect. You will need to allow the domain to go with it; if you do not, then CSS, JavaScript, and other content will probably not load.

The very first example () is an actual use case for whitelisting a *full* URL.

The URL (http://www.mcafee.com/us/products/web-gateway.aspx) used in the document was merely for demonstration of correct use of the various URL properties.

Best Regards,

Jon

edit: "of correct use of the various URL properties"

But if I have to allow a specific URL, it will not open as expected. If we allow the domain instead, then what is the point of using URL.Path or URL for a specific URL, since allowing the domain allows every URL on it?

Please explain; your views might help me.

Great Document.

Fantastic, thank you for this! But I have a question about the regex when it comes to setting the .com, .net and so on.

is there anything wrong with doing it like this:

regex(^htt(p|ps):\/\/([\w.-]*\.|\.?)mcafee\.(co\.uk|[\w]*)(\/.*|\/?))

so instead of using (com|net|co\.uk) I placed this (co\.uk|[\w]*)

The only issue I see is that it will match something like http://www.mcafee.commmmmm and such, but not http://www.mcafee.maliciousdomain.com. Are there any security issues?

Hi Kozzy,

What is the benefit of your version over the example provided? The regex you created matches things like www.mcafee.org. I suggest you use a site like http://regexr.com/ to test your syntax.

Regards,

Tris

Hi, exactly, and that is the benefit. Let's say you want to make one for Google. Google has many domains (.com, .net, .se, .pl, .it), pretty much one for every country, so instead of listing each one I use (co\.uk|[\w]*)

Google is just an example. Many companies have sites in each country, and like you wrote, this example works for things like www.mcafee.org or www.mcafee.net and so on, without really needing to know all the domains McAfee has.

Primarily this is useful for companies that are global but use the same Web Gateway policy.

If you are just looking to do country code TLD's and other common domains then you could do something like this:

regex(^htt(p|ps):\/\/([\w]*\.)?google\.[\w]{2,3})

Looking at Wikipedia's "List of Internet top-level domains", all the domains listed are 2 or 3 characters long.

If you want to cover off co.uk then you could do something like this instead:

regex(^htt(p|ps):\/\/([\w]*\.)?google\.[\w]{2,3}(\.[\w]{2})?)

Which includes an optional match for a 2 letter domain on the end.

Regards,

Tris

This one does the trick for me : regex(^htt(p|ps):\/\/([\w.-]*\.|\.?)mcafee\.(co\.uk|[\w]{2,3})(\/.*|\/?))

Tested it and it covers all domains up to 3 letters including co.uk, matches input like http://test-site.mcafee.com/someplace

I have one question for allowing specific TLDs.

Let's allow *.gov.in

If I use regex I would use something like this:

URL.Domain "matches in list" regex(.*\.gov\.in)

I would prefer using URL.SmartMatch rather than regex if possible, so my question is: is it dangerous to use

URL.SmartMatch(.gov.in) equals True

My fear is that URL.SmartMatch would also allow:

http://www.myhackedsite.com/getowned.gov.in/badscript.jsp

Is there a way without regex to allow specific TLDs (which can be tricky)?

Is the property URL.DomainSuffix smart enough to know the TLD?

Hi BelVincent,

This is exactly what URL.DomainSuffix was created for, no need for regex either. Just URL.DomainSuffix equals 'gov.in'

I checked SmartMatch, and it seems to handle it correctly. Adding 'gov.in' to a SmartMatch list would *not* allow hxxp://myhackedsite.com/getowned.gov.in, nor would it allow hxxp://gov.in.getowned.com. The only thing it would allow is hxxp(s)://*.gov.in

Best Regards,

Jon

Thanks a lot for your answer Jon!

Please correct me if I am wrong, but I want to be sure I understood what the support said.

A string entered in the list used by a URL.SmartMatch rule is tested against the URL being called, based on URL, URL Host and URL Path.

So we have to avoid entries containing a '?' in the SmartMatch list as it is getting interpreted.

A simple example: you want to allow a specific YouTube video while blocking all other YouTube content (the YouTube API v3 can be used on local proxies, but sadly not on a SaaS policy, so we do not really have much of a choice here).

So you add this URL in the SmartMatch list

https://www.youtube.com/watch?v=EoTqx9mVsu4

What you are actually allowing are all videos on YouTube, because the URL.Path of all YouTube videos (https://www.youtube.com/watch) matches the URL.Path of this entry (https://www.youtube.com/watch).

So this URL will also be allowed for instance:

https://www.youtube.com/watch?v=dfRzLP_KrBQ

In this case you want to use something else than SmartMatch for filtering, because, if I may rewrite the entry in SmartMatch:

The URL.SmartMatch property will accept the list as "input", and return TRUE if the given URL, URL host, or URL path variation was found in the list or URL path matches any URL Path of any entry in the list.

Hi Vincent!

You can add full URLs to the list and they would work as expected (like https://www.youtube.com/watch?v=EoTqx9mVsu4 or https://www.youtube.com/watch?v=dfRzLP_KrBQ).

You are correct that all landing pages for videos would be allowed if you add "https://www.youtube.com/watch" to a global whitelist using SmartMatch.

Attempting to allow based on the URL parameters alone would not work (?v=EoTqx9mVsu4), nor would using the unique string in the parameter (EoTqx9mVsu4). For this you would need to use the URL.Path parameter paired with a regex list.

Best Regards,

Jon

Then I have some trouble understanding this in the rule tracing, can you explain a bit more?

This is my test ruleset with a single entry in the list for this example:

RuleSet.png

This is what I see in Rule Tracing when I watch another Video ID:

RuleTracing.png

As you see, SmartMatch stops at the ? parameters.

You could use a matches to get the parameter, but what happens when the v= is not right after the ?

https://www.youtube.com/watch?something=else&v=ABCDEFG

Then you would have to use a wildcard to get to:

URL matches

https://www.youtube.com/watch?*v=ABCDEFG*

Or you could look for the parameter directly using:

Application.Name equals YouTube AND

URL.Path equals "/watch" AND

URL.HasParameter("v") equals true AND

URL.GetParameter("v") is in list ListOfVideoIDs

Where ListOfVideoIDs is a list of just the IDs you want to whitelist, like:

dfRzLP_KrBQ

8lMxpDYA5Wg

D56wGhy6qkk

LnU0Xh5_nIQ
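For illustration only, the parameter-based check above can be mimicked in Python with urllib.parse. The function name is hypothetical, the Application.Name condition is left out, and the set reuses the sample IDs listed above.

```python
from urllib.parse import parse_qs, urlsplit

LIST_OF_VIDEO_IDS = {"dfRzLP_KrBQ", "8lMxpDYA5Wg", "D56wGhy6qkk", "LnU0Xh5_nIQ"}


def video_allowed(url):
    """Allow /watch URLs whose "v" parameter is in the ID list."""
    parts = urlsplit(url)
    if parts.path != "/watch":
        return False
    # parse_qs finds "v" wherever it appears in the query string.
    values = parse_qs(parts.query).get("v", [])
    return any(value in LIST_OF_VIDEO_IDS for value in values)


print(video_allowed("https://www.youtube.com/watch?something=else&v=dfRzLP_KrBQ"))
print(video_allowed("https://www.youtube.com/watch?v=EoTqx9mVsu4"))
```

Reading the parameter by name avoids the wildcard guesswork needed when v= is not the first parameter.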

Thanks Eric it is much more clear now.

We already use the Youtube API v3 to allow selected videos, but some admins thought they could use full URLs as well. I will send a communication on this.

Support pointed me here with questions about McAfee Client Proxy bypass entries. Is there a similar reference for the best way to set up bypasses in MCP?  The documentation doesn't say much about formats or wildcards.

Can we use the ? wildcard for changing numbers?

Version history: revision 2 of 2. Last update: 03-20-2018 01:10 PM.
