Content Filtering in the MEG is a very powerful tool, although many customers don't use it or don't use it to its full potential.  MEG content filtering consists of two parts: the Content Filtering Policies and the Content Filtering Dictionaries.  Considering the power inherent in the dictionaries, I am going to discuss them here and leave discussion of the policies for another time.  Primarily, this post will detail the power available in score-based dictionaries and non-score-based simple-strings dictionaries.  It will also detail some best practices for content filtering where the dictionaries are directly involved.


Dictionaries are either score-based or non-score-based.  I will discuss each type below.  When creating a dictionary, it is necessary to choose whether the new dictionary will contain Simple Strings or Regular Expressions, but this is not set in stone.  As long as the dictionary is not score-based, new conditional term sets may be added, and each additional term set can be individually defined as either simple strings or regular expressions.  Some of the most powerful features of the content filtering tool are only available in non-score-based dictionaries.  This is because score-based dictionaries disable the AND/OR logical dictionary sections and also disable contextual matching on simple strings.


Score-based dictionaries


Score-based dictionaries are dictionaries where every term has a score that contributes towards a total score for the dictionary.  Once the dictionary score rises above a threshold set in the policy, the policy action triggers on the message.  Score-based dictionaries consider the whole message, including all attachments, when generating their score.  When creating a new dictionary, it is not possible to define it as score-based on the new dictionary screen, so the dictionary has to be created first and converted afterwards.  Should an admin wish to change an existing dictionary to be score-based, there are two ways to do it.  First, when looking at the dictionary in the list of dictionaries, click the paper and pencil icon below the Edit column.



This will open a new window that offers the option to make the dictionary score-based or not.




The other way to make a dictionary score-based is to click the Add link under the score column next to a term in an existing dictionary. 




When taking either of these steps, the admin will be asked to provide a default score for all terms in the dictionary.  Additionally, the window will advise about several changes which occur to the dictionary.  These changes are as follows:


  • Conditional (AND/OR) logic does not apply
  • Any existing complex terms will be removed from the dictionary
  • You will not be able to add complex terms to the dictionary
  • All remaining terms will be assigned the default score that you specify here
  • If you do not specify a score, all scores will be infinite.


Make sure to keep these changes in mind when making an existing dictionary score-based, as they could result in significant changes to the dictionary content.  It's also worth noting that an admin wishing to create a score-based dictionary should decide early whether it will hold regular expressions or simple strings.  A dictionary section cannot contain both, and because conditional logic does not apply, admins can no longer create additional sections to cover the other type of match.
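

To make the scoring idea concrete, here is a minimal sketch in Python.  It is not how the MEG implements scoring; it only illustrates the behaviour described above.  The function names, the sample terms and the assumption that each matching term counts once (rather than once per occurrence) are all mine.

```python
def dictionary_score(term_scores, message_parts):
    """Sum the scores of all terms found anywhere in the message.

    Assumption: a term contributes its score once, however often it appears.
    """
    # Score-based dictionaries look at the whole message, so join every part.
    text = "\n".join(message_parts).lower()
    return sum(score for term, score in term_scores.items()
               if term.lower() in text)


def policy_triggers(term_scores, message_parts, threshold):
    """The policy action fires once the score rises above the threshold."""
    return dictionary_score(term_scores, message_parts) > threshold


# Hypothetical dictionary and message (subject, body, attachment text).
terms = {"confidential": 10, "internal only": 15, "salary": 5}
parts = ["Subject: Q3 numbers",
         "This is CONFIDENTIAL.",
         "attachment text: salary bands"]

score = dictionary_score(terms, parts)
triggered = policy_triggers(terms, parts, threshold=12)
```

With these sample values, two of the three terms match across the body and the attachment, and their combined score is enough to pass a threshold of 12 even though neither term would pass it alone.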


Non-score-based dictionaries


While score-based dictionaries are not hard to create, all such dictionaries start out as non-score-based dictionaries.  Although there is a good bit of power in a score-based dictionary, non-score-based dictionaries have more raw power available to them.  That power does come at a price, however.  Unlike score-based dictionaries, which consider the whole message, non-score-based dictionaries apply only to the current scanning context.  As a result, only a non-score-based dictionary can act as an exclude dictionary for a non-score-based dictionary, and only a score-based dictionary can act as an exclude dictionary for a score-based dictionary.  It also means that, if a dictionary applies to multiple message parts, it is possible for a dictionary to trigger a rule in more than one part of the message, but only trigger an exception in one of those parts.  The result is Content Filtering taking an action even though the exception took effect on one part of the message.
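

The scanning-context caveat is easier to see with an example.  The Python below is only an illustration under my own assumptions (plain substring matching, each message part treated as one scanning context, made-up terms); it is not the MEG's actual logic.

```python
def part_matches(terms, text):
    """True if any term appears in this one message part."""
    text = text.lower()
    return any(term.lower() in text for term in terms)


def parts_triggering(rule_terms, exclude_terms, parts):
    """Return the parts where the rule fires and the exclusion does not.

    Each part is evaluated on its own, so an exclusion found in one part
    does nothing for a rule match in another part.
    """
    return [name for name, text in parts.items()
            if part_matches(rule_terms, text)
            and not part_matches(exclude_terms, text)]


# Hypothetical message: the rule term is in both parts,
# but the exclude term is only in the body.
parts = {
    "subject": "Project Falcon update",
    "body": "Project Falcon update. This is a public press release.",
}
hits = parts_triggering(["project falcon"], ["press release"], parts)
```

Here the exclusion suppresses the match in the body, but the subject still triggers, so an action would still be taken on the message.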




So then we get to the power of a non-score-based dictionary.  There are two different places where this power becomes apparent.  First, it appears in the conditional logic available.  In these dictionaries, users may specify conditional logic to take effect in different parts of the message.  To create a new condition, use the "Add OR condition" or "Add AND condition" buttons at the bottom.  "OR" conditions appear in the pull-down box at the top of the dictionary section, while "AND" conditions appear within the terms list.  Thus, it would be possible to create a dictionary where it was necessary for one of five terms to appear in the subject line *AND* one of three other terms to appear in the body in order for the dictionary to trigger.  It's also possible to create an "or" dictionary section where any of the terms in the first section *OR* any of the terms in the second section must appear for the dictionary to trigger an action.  Of course, it can get even more complicated than that with multiple "and" conditions inside the various "or" conditional sections.  Further, admins can specify to what part of the message the terms within a section apply.  To do this, click the link that says "Everything" next to "Applies To".  This will bring up a window with a list of the various parts of the message to which a term may be applied.
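

As a rough mental model of how the "or" sections and "and" conditions combine, consider the Python sketch below.  The class, the function names and the part names ("subject", "body", "everything") are my own illustration and assume simple substring matching; the MEG's internals may well differ.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    applies_to: str  # hypothetical part names: "subject", "body", "everything"
    terms: list      # the condition is met if ANY of these terms appears


def condition_met(cond, message):
    """Check one term list against the part of the message it applies to."""
    if cond.applies_to == "everything":
        text = " ".join(message.values())
    else:
        text = message.get(cond.applies_to, "")
    text = text.lower()
    return any(term.lower() in text for term in cond.terms)


def dictionary_triggers(or_sections, message):
    """Trigger if ANY "or" section has ALL of its "and" conditions met."""
    return any(all(condition_met(c, message) for c in section)
               for section in or_sections)


# Section 1: a subject term AND a body term.  Section 2: one term anywhere.
sections = [
    [Condition("subject", ["invoice", "payment"]),
     Condition("body", ["wire transfer", "bank details"])],
    [Condition("everything", ["urgent remittance"])],
]

both = {"subject": "Invoice 1234", "body": "Please send bank details today."}
subject_only = {"subject": "Invoice 1234", "body": "See attached."}
second_section = {"subject": "Hello", "body": "Urgent remittance needed"}

results = [dictionary_triggers(sections, m)
           for m in (both, subject_only, second_section)]
```

The first message satisfies both "and" conditions of section 1, the second satisfies only the subject condition and so does not trigger, and the third triggers through section 2 alone.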


Additionally, terms may be complex.  When creating a term, admins are given a place to enter the term to be applied.  To the right there is an edit icon used to edit the contents of the term.  Under the Term Details, users can specify (via checkboxes) whether the term is a wildcarded term, the start of a string, the end of a string, or case sensitive.  Once the admin moves to Contextual matching, the real power becomes available.




Using contextual matching, users can indicate that matching should occur only if one of these other terms is present, only if all of these other terms are present, or only if none of these other terms are present.  Additionally, near matching can be done, requiring the other term(s) to be present within a certain number of characters of the original term.  As before, admins can specify that the new terms are case sensitive, contain wildcards, and/or start or end a term.  This allows the admin to specify complicated conditions for when a term applies.  It's worth noting that only one condition can be set on the trigger at a time.  That is, it is not possible to trigger on the dictionary term only if alternate term A is present and alternate term B is not present.
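

For readers who think better in code, here is a small Python sketch of contextual and near matching.  It is an approximation under my own assumptions: matching is case-insensitive substring matching, distance is measured between the starting positions of the two terms, and the function and parameter names are made up for this post.

```python
def find_all(text, term):
    """Return the start position of every occurrence of term in text."""
    positions = []
    start = text.find(term)
    while start != -1:
        positions.append(start)
        start = text.find(term, start + 1)
    return positions


def other_present(text, other, pos, within):
    """Is the other term present, optionally within N characters of pos?"""
    hits = find_all(text, other)
    if within is None:
        return bool(hits)
    return any(abs(hit - pos) <= within for hit in hits)


def contextual_match(text, term, others, mode, within=None):
    """mode is exactly one of "any", "all" or "none", as in the dialog."""
    text = text.lower()
    term = term.lower()
    others = [other.lower() for other in others]
    for pos in find_all(text, term):
        flags = [other_present(text, other, pos, within) for other in others]
        if mode == "any" and any(flags):
            return True
        if mode == "all" and all(flags):
            return True
        if mode == "none" and not any(flags):
            return True
    return False


text = "Account number 12345 belongs to the savings account."
anywhere = contextual_match(text, "account number", ["savings"], "any")
too_far = contextual_match(text, "account number", ["savings"], "any", within=10)
near = contextual_match(text, "account number", ["savings"], "any", within=40)
```

Because only a single mode is accepted, the sketch has the same limitation as the product: there is no way to ask for "term A present and term B absent" in one term.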


I hope this helps to take some of the mystery out of dictionaries and their power and capabilities.