Hot-Rodding your Rulesets -- Optimizing your MWG7 Policy

Version 2

     

    FOREWORD

     

    One of the most beautiful things about McAfee Web Gateway 7 is its near-infinite flexibility. If you can dream up an idea for a rule, you can probably make it happen. This flexibility can be a double-edged sword, however: while it gives you nearly limitless potential, it also introduces the possibility of inefficiencies in your ruleset logic.

     

    My goal with this document is to give you the understanding and techniques you need to get the best performance possible out of your MWG appliance.

     

    CYCLES

     

    Before we can begin discussing rulesets, rules, and how to lay them out, we must first grasp the concept of cycles.

     

    McAfee Web Gateway has 4 different cycles. I will touch on all of them here briefly, but for the rest of the document, we’re only going to be concerning ourselves with the first three.

     

    • Request
    • Response
    • Embedded
    • Logging


    The Request Cycle works with anything available in the initial user request. This means things like URL, Client IP, User Name (if we’re authenticating), and Headers sent by the client’s browser will be available.


    The Response Cycle works with, as you’d guess, the response coming back from the webserver after we’ve completed the request cycle. This will have the actual data requested, as well as any server-side headers.


    The Embedded Cycle comes into play when we have an opener called in either the Request or Response Cycle. Openers allow your MWG to look more deeply into content of a given type. Currently, the two openers available are:

     

    • Composite Opener - used for looking inside of other files -- such as .zip, .exe, etc
    • HTML Opener - very rarely used, typically only in very advanced and specialized configs


    The Logging Cycle kicks off after the Request, Response and all Embedded cycles have completed -- allowing you to write to log files.


    Important note about the logging cycle

     

    If you have values in your access.log that you always want filled (for example: category information) -- you might have to create a rule at the very top of your rule sets to call the properties that fill these values, like so:

     

    url-categories-population.png

     

    The action on this kind of population rule would be 'Continue' - as we don't want to block the traffic, but we still want all the subsequent rules to apply.

    Without this kind of rule, some log fields may have blank entries if we are hitting a block action or a stop-cycle action before a specific point.

     

    Here is an example to illustrate better what the issue is:

    Your very first rule is your global whitelist, which contains youtube.com and the action is stop cycle. When a user goes to youtube.com, the request will be allowed, but your log files will not show the category information for youtube.com because the property url.categories was never called in the rules. To prevent this you can create an initialization rule above your global whitelist that uses the property url.categories.
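
    The effect described above can be sketched in a few lines of Python. This is a toy model, not MWG's rule engine: Transaction, lookup_categories and run_request_cycle are hypothetical names invented for illustration, the category data is made up, and the sketch assumes the log only sees values that a rule has already filled.

```python
# Toy model of lazily filled properties; not MWG's actual rule engine.
# All names here (Transaction, lookup_categories, ...) are hypothetical.

def lookup_categories(host):
    """Stand-in for the real category lookup (made-up data)."""
    return {"youtube.com": ["Streaming Media"]}.get(host, ["Uncategorized"])


class Transaction:
    def __init__(self, host):
        self.host = host
        self._filled = {}  # properties that some rule has already called

    def url_categories(self):
        # The first call fills the property; later calls reuse the value.
        if "url.categories" not in self._filled:
            self._filled["url.categories"] = lookup_categories(self.host)
        return self._filled["url.categories"]

    def log_line(self):
        # Assumption: the log only sees values that were filled earlier.
        categories = self._filled.get("url.categories", [])
        return f"{self.host} categories={','.join(categories)}"


GLOBAL_WHITELIST = {"youtube.com"}


def run_request_cycle(tx, populate_first):
    if populate_first:
        tx.url_categories()      # population rule, action: Continue
    if tx.host in GLOBAL_WHITELIST:
        return "stop cycle"      # later rules are skipped
    tx.url_categories()          # e.g. a URL filtering rule further down
    return "continue"


without_rule = Transaction("youtube.com")
run_request_cycle(without_rule, populate_first=False)
print(without_rule.log_line())   # category field stays empty

with_rule = Transaction("youtube.com")
run_request_cycle(with_rule, populate_first=True)
print(with_rule.log_line())      # category field is filled
```

    The only difference between the two runs is whether the population step happens before the whitelist stops the cycle.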

     

     

     


    Here is a graphical representation of how the request and response cycle work, when handling a request from a client.

    cycles.png

     

     

    As a general principle, you ideally want to get any traffic that is going to be ‘blocked’ out of the way as soon as possible (to limit the amount of work your MWG needs to do to block the traffic).


    For example, take URL Filtering. The URL is available in the request cycle, so we can perform URL filtering there (which is ideally where we’d want to do it). If we were to perform URL filtering in the response cycle, we would have already retrieved the page, only to find out that the request is to be blocked.
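
    As a rough illustration of the difference, here is a small Python sketch. It is a simplified model under stated assumptions: fetch_from_origin, BLOCKED_HOSTS and handle are hypothetical stand-ins, and a real transaction involves far more than this.

```python
# Toy model: where a URL filter runs decides whether the page is fetched.
# fetch_from_origin and BLOCKED_HOSTS are hypothetical stand-ins.

BLOCKED_HOSTS = {"blocked.example"}
fetches = []  # records every host we actually contacted


def fetch_from_origin(host):
    fetches.append(host)
    return f"<html>content of {host}</html>"


def handle(host, filter_cycle):
    """Process one transaction with URL filtering in the given cycle."""
    # Request cycle: only the request (URL, client IP, ...) is known.
    if filter_cycle == "request" and host in BLOCKED_HOSTS:
        return "blocked"
    body = fetch_from_origin(host)   # the request goes to the web server
    # Response cycle: the response has already been retrieved.
    if filter_cycle == "response" and host in BLOCKED_HOSTS:
        return "blocked"
    return body


result_request = handle("blocked.example", filter_cycle="request")
fetches_after_request = len(fetches)
result_response = handle("blocked.example", filter_cycle="response")
fetches_after_response = len(fetches)
print(result_request, fetches_after_request)
print(result_response, fetches_after_response)
```

    Both runs end in a block, but only the response-cycle variant contacts the web server first.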


    Information that we have right off the bat (URL, Client.IP, etc) should always be checked for in the request cycle if possible. If we’re checking the Client.IP or URL in the request cycle and allowing it, what’s the point of checking the same Client.IP/URL again in the response cycle? It’s simply additional work that doesn’t need to be done.
    As a rule of thumb, these are the kinds of things you’d want to be doing in each cycle:


    Request:

    • URL Filtering
    • Blacklisting
    • User Authentication
    • Rules based on browser-sent headers (User-Agent, etc)
    • Antimalware Scanning for Uploads

     

    Response:

    • Antimalware Scanning
    • Media Type Filtering
    • Rules based on website-sent headers (Content-Length, etc)

     

    Embedded:

    • Body filtering for specific content
    • Antimalware Scanning (if using Composite Opener to look into archives/etc)
    • Media Type Filtering


    There are some other things that we will sometimes want to occur in both request and response cycles - such as Whitelisting.

     

    CRITERIA


    With both rulesets and rules, we can specify criteria to limit when a particular rule or ruleset will trigger. This is useful as we generally will not want to apply the same rules to all users across the board.


    Some of the most common criteria used are:

     

    • URL / URL.Host (used for looking at URL or URL host)
    • Client.IP (IP address of client machine making request)
    • URL.Categories (Categories the requested URL falls into from the TrustedSource db)
    • Proxy.Port (The current proxy port being used - can be useful to differentiate between different clients)
    • System.Hostname (Useful if you want rules to only occur on a single MWG in a cluster)


    There are many more criteria available. You can see a full listing of them in the product guide, addendum A.


    One important thing to keep in mind regarding criteria is that calling a criterion that has not yet been filled will initiate whatever mechanism is required to fill it.
    For example, the first time you call the criteria of URL.Categories, MWG will, at that point in the processing, perform a URL lookup. Likewise, the first time you call Antimalware.Infected, MWG will start its antimalware scanning process.


    Because of this, it is very important for optimal performance that we structure our ruleset so that it checks as many ‘cheap’ criteria as it can first -- before we resort to ‘expensive’ criteria such as Antimalware scanning. This is true both in large-scale ruleset design and on a smaller scale when dealing with multiple criteria for a rule or ruleset.


    If we can block a website based on having undesirable URL categories, then your MWG will never have to scan it for viruses since the traffic will already have been blocked.
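
    To make the cost argument concrete, here is a minimal Python sketch. The cost numbers are invented for illustration (they are not measurements), and evaluate is a hypothetical helper that only mimics the "first call fills the property" behaviour described above.

```python
# Toy cost model; the numbers are made up for illustration only.
COST = {"url.categories": 5, "antimalware.infected": 100}


def evaluate(rules, facts):
    """Run rules in order; return (action, total cost of properties called)."""
    filled = set()
    spent = 0
    for prop, predicate, action in rules:
        if prop not in filled:       # first call triggers the lookup/scan
            filled.add(prop)
            spent += COST[prop]
        if predicate(facts[prop]):
            return action, spent     # a Block ends processing immediately
    return "allow", spent


category_rule = ("url.categories", lambda cats: "Gambling" in cats, "block")
malware_rule = ("antimalware.infected", lambda infected: infected, "block")

# A clean page in an unwanted category.
facts = {"url.categories": ["Gambling"], "antimalware.infected": False}

cheap_first = evaluate([category_rule, malware_rule], facts)
expensive_first = evaluate([malware_rule, category_rule], facts)
print(cheap_first)
print(expensive_first)
```

    Both orderings block the request, but with the category rule first the scan is never started.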


    Criteria by Cost/Weight

     

    Low                       Medium                  High
    Client.IP                 URL.Destination.IP*     Antimalware.Infected
    URL // URL.Host           Media.EnsuredTypes      DLP
    Proxy.IP // Proxy.Port    Authentication*         HTML Opener (Event)
    System.Hostname           URL.Categories*         Composite Opener (Event)
    HTTP Headers

    * Some of these properties rely on external services like Active Directory, DNS or cloud lookups that could introduce delays beyond the control of MWG
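
    If you want to sanity-check the order of criteria in a rule, the tiers can be captured in a small lookup. The Python sketch below is only an illustration: the numeric weights are assumptions, cheapest_first is a hypothetical helper, and only a handful of criteria are listed.

```python
# Illustrative cost tiers for a few criteria.
# The numeric weights are assumptions, not measurements.
TIER = {"low": 1, "medium": 2, "high": 3}

CRITERIA_COST = {
    "Client.IP": "low",
    "URL.Host": "low",
    "Proxy.Port": "low",
    "Media.EnsuredTypes": "medium",
    "Authentication": "medium",
    "Antimalware.Infected": "high",
}


def cheapest_first(criteria):
    """Return the criteria ordered from cheapest to most expensive tier."""
    # sorted() is stable, so criteria in the same tier keep their order.
    return sorted(criteria, key=lambda name: TIER[CRITERIA_COST[name]])


ordered = cheapest_first(["Antimalware.Infected", "Authentication", "Client.IP"])
print(ordered)
```

    Reordering like this is only safe when the criteria are all joined by the same operator (all AND or all OR).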


    LOGIC


    It is possible to combine multiple criteria together. With two criteria, the logical operators of AND and OR come into play. It’s important to note that with an AND statement, if the first criteria checked is false, MWG will not check the second -- and the same is true with an OR statement if the first criteria is true.


    As a general rule of thumb, you will want to use the least expensive of the two as your first criteria.


    AND rule (two variables):


    and-rule.png


    MWG will first check the Client’s IP address. If it does not match, the rule will not be applied and it would continue moving on in the ruleset.

    If the Client IP was 1.2.3.4, then and only then would MWG do the additional check of looking at the URL Host to see if it matched the wildcard of *abcd.com. If it did, the request would be blocked, if it didn’t, it will not apply the rule and the traffic will continue on.


    We want to check the client IP for a match first, because if the user has a different IP, then we will not have to check the wildcard match against the URL.Host. Using a ‘matches’ operator with wildcards/asterisks isn’t incredibly taxing, but it is more work than a direct comparison check against the Client.IP.
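
    The same short-circuit behaviour can be seen in Python, whose `and` operator also skips the second operand when the first is false. This is a sketch only: the request dictionary and the two helper functions are hypothetical, and fnmatch stands in for MWG's wildcard matching.

```python
from fnmatch import fnmatch

checks = []  # records which criteria were actually evaluated


def client_ip_equals(request, ip):
    checks.append("Client.IP")
    return request["client_ip"] == ip


def url_host_matches(request, pattern):
    checks.append("URL.Host")
    return fnmatch(request["url_host"], pattern)


def and_rule(request):
    # Python's `and` short-circuits, like the AND described above.
    if client_ip_equals(request, "1.2.3.4") and url_host_matches(request, "*abcd.com"):
        return "block"
    return "continue"


other_client = and_rule({"client_ip": "10.0.0.7", "url_host": "www.abcd.com"})
checks_for_other_client = list(checks)

checks.clear()
matching_client = and_rule({"client_ip": "1.2.3.4", "url_host": "www.abcd.com"})
checks_for_matching_client = list(checks)

print(other_client, checks_for_other_client)
print(matching_client, checks_for_matching_client)
```

    For the client with a different IP, the wildcard check never runs.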


    OR rule (two variables):


    or-rule.png


    With this rule, if the first parameter is true, we won’t bother to check the second since we will already have confirmed the criteria as true.


    So, we will first check to see if the URL.Host matches *xyz.com. If it does, we will stop the search and apply our action (Block).

    If it doesn’t match, we will proceed further and check the URL.Destination.IP (by performing a DNS lookup), and then check to see if it matches in the range. If it is in the range, we will block. If not, then the rule does not match either parameter and will not be applied.


    In this example, we’re making use of a URL.Host wildcard check first, because the work and latency introduced by a quick wildcard check against the URL.Host (which is a value we have right from the start) is much lower than asking your MWG to go out and do a DNS lookup -- which is what the URL.Destination.IP criteria has your MWG do.


    If we match *xyz.com -- there will be no need to check the second criteria because this is an OR statement.
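
    Here is the OR case as a Python sketch. The blocked range and the FAKE_DNS table are assumptions made up for this example (the actual range is not given in the text), and destination_ip is a hypothetical stand-in for the lookup behind URL.Destination.IP.

```python
from fnmatch import fnmatch
from ipaddress import ip_address, ip_network

# Hypothetical values: the range and the resolver table are assumptions.
BLOCKED_RANGE = ip_network("192.0.2.0/24")
FAKE_DNS = {
    "www.xyz.com": "198.51.100.10",
    "files.example": "192.0.2.44",
    "safe.example": "203.0.113.5",
}
dns_lookups = []  # hosts for which a lookup was actually performed


def destination_ip(host):
    """Stand-in for the DNS lookup behind URL.Destination.IP."""
    dns_lookups.append(host)
    return ip_address(FAKE_DNS[host])


def or_rule(host):
    # `or` short-circuits: the lookup only runs if the wildcard check fails.
    if fnmatch(host, "*xyz.com") or destination_ip(host) in BLOCKED_RANGE:
        return "block"
    return "continue"


results = {host: or_rule(host) for host in FAKE_DNS}
print(results)
print(dns_lookups)
```

    The host that matches the wildcard is blocked without any lookup; only the other two hosts cost a DNS query.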


    More than two criteria:


    When you get beyond 2 criteria, you can involve another level of complexity -- that of parentheses (). These work just like they did in algebra class -- meaning whatever is in them will be evaluated first.


    We generally suggest keeping your rules as straightforward as possible (ideally no more than 2 criteria per rule/ruleset) -- not because MWG cannot handle the complexity -- but because incredibly complex rules can be very difficult for you as the administrator to read later on.


    Compare the two sets of rules and see which is easier to understand logically.


    complex-multi.png

     

    simple-multi.png

     


    Both of these rulesets accomplish the same thing -- the first is all done as a single rule with 4 different criteria and a couple sets of parentheses.


    The second is accomplished by splitting the logic out to multiple rules, with no more than 2 criteria per rule. It’s also much easier to read and understand at first glance.


    If the URL.Host matches *testdomain.com, then we check the proxy port. Any proxy ports other than 9090 will result in a block page. After that, we check to see if the user is not an admin user (signified by the group membership and IP range), and if they are not, they will be blocked.
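
    When you split a complex rule like this, it is worth convincing yourself that both versions really behave the same. The Python sketch below does that by brute force over every combination of inputs. It rests on an assumed reading of the logic described above: each check is reduced to a boolean, and a user counts as an admin only if both the group and the IP range match.

```python
from itertools import product

# Assumed reading of the two rule sets; the group and IP-range checks are
# reduced to booleans because their actual values are not given in the text.


def single_rule(host_match, port_is_9090, in_admin_group, in_admin_range):
    """One rule with four criteria and nested parentheses."""
    is_admin = in_admin_group and in_admin_range
    if host_match and (not port_is_9090 or not is_admin):
        return "block"
    return "continue"


def split_rules(host_match, port_is_9090, in_admin_group, in_admin_range):
    """The same logic split into a ruleset with two simple rules."""
    if not host_match:                             # ruleset criteria
        return "continue"
    if not port_is_9090:                           # rule 1
        return "block"
    if not in_admin_group or not in_admin_range:   # rule 2
        return "block"
    return "continue"


# Compare both versions for all 16 input combinations.
mismatches = [combo for combo in product([False, True], repeat=4)
              if single_rule(*combo) != split_rules(*combo)]
print(len(mismatches))
```

    An empty mismatch list means the two layouts make the same decision for every input.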


    One last note about criteria - if you find yourself wanting to add more than 2 of any specific thing (Client.IPs, URL.Host checks, etc) -- or if you see yourself wanting to add to them in the future, you would be well served to create a list and then use the criteria of ‘is in list’ or ‘matches in list’.


    This will help to keep your criteria neat and easy to read, but allow you to have large lists of data in situations where it might be appropriate/necessary (such as whitelist/blacklists, group policy assignments, etc).


    Here’s an example whitelist ruleset that makes use of lists:


    lists-example.png
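
    The idea behind ‘is in list’ and ‘matches in list’ can be sketched in Python as follows. The list contents are hypothetical, fnmatch stands in for MWG's wildcard matching, and whitelist_rule is an illustrative helper rather than a reproduction of the ruleset pictured above.

```python
from fnmatch import fnmatch

# Hypothetical list contents, for illustration only.
WHITELIST_HOST_PATTERNS = ["*.example.com", "intranet.example.org"]
WHITELIST_CLIENT_IPS = {"10.1.2.3", "10.1.2.4"}


def host_matches_in_list(host, patterns):
    """Rough equivalent of 'matches in list' for wildcard entries."""
    return any(fnmatch(host, pattern) for pattern in patterns)


def whitelist_rule(host, client_ip):
    # One criterion per list, no matter how many entries the list holds.
    if host_matches_in_list(host, WHITELIST_HOST_PATTERNS):
        return "stop cycle"
    if client_ip in WHITELIST_CLIENT_IPS:   # 'is in list'
        return "stop cycle"
    return "continue"


print(whitelist_rule("www.example.com", "172.16.0.1"))
print(whitelist_rule("other.test", "10.1.2.3"))
print(whitelist_rule("other.test", "172.16.0.1"))
```

    Adding another host or client later means editing a list, not the rule.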

    RULESETS AND POLICY ARCHITECTURE


    Now that we have an understanding of Cycles, we can have a look into rulesets. Rulesets are the means by which we organize our rules and sub-rulesets, and make a configuration easier to understand and manage.


    Rulesets are also where we specify what cycle(s) the rules and sub-rulesets within will be configured to run in. Much like with our criteria, it’s important for maximum performance to structure your ruleset in a manner that progresses from ‘least expensive’ to ‘most expensive’.


    While no one specific layout is necessarily ‘correct’ -- from reviewing a number of configurations, a general rule of thumb would be a ruleset that looked a little something like this:

     

    • Whitelists/Blacklist
    • SSL Scanner
    • Authentication
    • URL Category filtering
    • Common rules (cache/progress indications/composite opener)
    • Media Type Filtering
    • Gateway Anti-Malware


    Obviously, all of these are optional - as you can pick and choose what rulesets you wish to use in your web gateway configuration.


    Most customers go with a fairly stock layout for the majority of the rulesets. By and large, the bulk of the customization comes by way of whitelist/blacklists, and applying URL Category Filtering based on criteria (Username/Group/IP/etc).

     

     

    USER-DEFINED PROPERTIES

     

    User-Defined Properties can help you reduce the number of checks that need to be done for rule evaluations.

    For example, group memberships in enterprise environments can get fairly complex. It is not uncommon to see users with several hundred group memberships in AD.

     

    On the web gateway side, you would have to check against that long list of groups every time you need the group membership for policy assignments. Instead, you could do the check once and write the resulting policy name into a user-defined property.

     

    user-defined-example.png

     

    Once you have your User-Defined property set, you can decide which rules to apply based on this simple string value instead of having to check the whole list of group memberships every time.

    The check of User-Defined.URLFilteringPolicy equals "Admins" is cheaper than the check for Authentication.UserGroups contains "Administrators".
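
    The pattern can be sketched in Python like this. It is a simplified model: the group names, the policy names other than "Admins", the mapping table and the map_policy helper are all hypothetical.

```python
# Toy model; group names, policy names and the mapping are hypothetical.
GROUP_TO_POLICY = [("Administrators", "Admins"), ("Staff", "Default")]

mapping_runs = 0  # how many times the group list had to be searched


def map_policy(user_groups):
    """Search the (possibly very long) group list and return a policy name."""
    global mapping_runs
    mapping_runs += 1
    for group, policy in GROUP_TO_POLICY:
        if group in user_groups:
            return policy
    return "Restricted"


# A user with a few hundred group memberships.
user_groups = [f"Group-{n}" for n in range(300)] + ["Administrators"]

# Stored once per transaction, like User-Defined.URLFilteringPolicy.
user_defined = {"URLFilteringPolicy": map_policy(user_groups)}

# Later rules compare one short string instead of searching the groups again.
applied = []
if user_defined["URLFilteringPolicy"] == "Admins":
    applied.append("admin URL filtering")
if user_defined["URLFilteringPolicy"] == "Admins":
    applied.append("admin media type filtering")

print(user_defined, mapping_runs, applied)
```

    However many rules consult the policy name afterwards, the group list is searched only once.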

     

    If you are interested in more information about policy mappings, please see this article:
    https://community.mcafee.com/docs/DOC-2210

     

    Keep in mind that this was just one example of the usage of User-Defined Properties. You can take advantage of this feature every time you need to temporarily store information for later use. User-Defined properties persist for the duration of a transaction (request + response + logging).

     

     

    CONCLUSION


    You should now have a better understanding of how MWG works with cycles, logic and criteria -- and can use this knowledge to help weed out the inefficiencies in your configuration.

    Takeaways:

     

      • No more than 2 criteria per rule (for easy administration!)
      • Remember cheap vs expensive criteria! Block as much ‘cheaply’ as you can.
      • Use appropriate cycles for your rules. There’s no need to run a URL-Category rule in the response cycle, since we could have blocked it in the request and saved the time and bandwidth!