DLP in MWG consists of two filters:
- DLP using a predefined set of DLP classifications
- DLP using a user-defined set of words and patterns (regexes)
Each filter is controlled by its own Settings, so before using it you need to go to the Settings tab and create new settings (both kinds of settings share the common options described below):
- To use DLP classifications, create the corresponding Settings and select the DLP Classifications you need from the tree view
- To use DLP dictionaries, create the corresponding Settings and enter the words or patterns that you want to look for in the text
Common options in Settings:
- Tracking policy: with Maximum, the DLP filter searches for all terms even after the threshold has been reached; with Minimum, it stops searching as soon as the threshold is reached
- Reported content width: how much text around a match is shown in the results. The match itself appears in square brackets (for example, '[ test ]'), and the surrounding text helps you find where it occurs in the document
- Context list size: how many matches are reported back. If the text contains more matches than specified, the extra matches still count toward the threshold but are not shown
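To make the interplay of these three options more concrete, here is a minimal Python sketch that models the behaviour described above. It is not MWG code or MWG configuration syntax: the names DlpSettings and scan are hypothetical, and measuring the content width in characters on each side of the match is an assumption made only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DlpSettings:
    """Hypothetical model of the common options; not an MWG data structure."""
    terms: list = field(default_factory=list)  # words or regex patterns
    threshold: int = 1                          # matches needed to trigger
    tracking_maximum: bool = False              # True = Maximum, False = Minimum
    content_width: int = 10                     # assumed: characters per side
    context_list_size: int = 3                  # max number of reported matches


def scan(text, settings):
    """Return (match_count, reported_contexts) for the given text."""
    count = 0
    contexts = []
    for pattern in settings.terms:
        for m in re.finditer(pattern, text):
            count += 1
            # Only the first context_list_size matches are reported;
            # later ones are still counted.
            if len(contexts) < settings.context_list_size:
                start = max(0, m.start() - settings.content_width)
                end = min(len(text), m.end() + settings.content_width)
                contexts.append(
                    text[start:m.start()] + "[ " + m.group(0) + " ]" + text[m.end():end]
                )
            # Minimum tracking: stop as soon as the threshold is reached.
            if not settings.tracking_maximum and count >= settings.threshold:
                return count, contexts
    return count, contexts


sample = "secret plan, secret code, secret key, secret door"
maximum = DlpSettings(terms=["secret"], threshold=2, tracking_maximum=True, content_width=5)
minimum = DlpSettings(terms=["secret"], threshold=2, tracking_maximum=False, content_width=5)
print(scan(sample, maximum))
print(scan(sample, minimum))
```

With Maximum tracking the sketch counts all four occurrences but reports only three contexts; with Minimum tracking it stops at the second occurrence because the threshold of 2 has been reached.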
Both DLP filters are used in a similar way: you create a rule like 'DLP.Classification.BodyText.Matched equals True' or 'DLP.Dictionary.BodyText.Matched equals True' and specify which settings you want to use. When this rule evaluates to true, you can use properties such as DLP.Classification.BodyText.MatchedClassifications or DLP.Classification.BodyText.MatchedTerms to log the list of classifications and/or terms that were found in the text extracted from the current body (document, etc.). For dictionaries, the DLP.Dictionary.BodyText.MatchedTerms property returns information about the terms that were found in your text.
Besides the DLP properties that work with text extracted from the current body (they have 'BodyText' in their names), there is also another set of properties with 'AnyText' in the name. These properties have similar functionality but can be used with any text; for example, you can combine Body.Text with the values of some headers.
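The difference between the two property families can be illustrated with a small Python model. This is only a sketch of the idea, not MWG rule syntax: matched_terms and evaluate_rule are hypothetical helper names, and joining the body text with the header values by newlines is an assumption about how you might build the combined text.

```python
import re


def matched_terms(text, patterns):
    """Return the distinct terms found in text, in the order they are found."""
    found = []
    for pattern in patterns:
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            term = m.group(0)
            if term not in found:
                found.append(term)
    return found


def evaluate_rule(body_text, headers, patterns):
    """Model a BodyText-style check and an AnyText-style check side by side."""
    # BodyText-style: only the text extracted from the body is inspected.
    body_terms = matched_terms(body_text, patterns)
    # AnyText-style: the caller decides which text is inspected, here the
    # body text combined with the header values (an assumed combination).
    combined = body_text + "\n" + "\n".join(headers.values())
    any_terms = matched_terms(combined, patterns)
    return {
        "body_matched": bool(body_terms),
        "body_terms": body_terms,
        "any_matched": bool(any_terms),
        "any_terms": any_terms,
    }


result = evaluate_rule(
    body_text="Quarterly report",
    headers={"X-Note": "Confidential draft"},
    patterns=["confidential"],
)
# Log the matched terms only when the corresponding check matched.
if result["any_matched"]:
    print("AnyText-style match:", result["any_terms"])
if not result["body_matched"]:
    print("BodyText-style check found nothing")
```

In this example the term appears only in a header value, so the body-only check does not match while the combined-text check does.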
You can import example DLP rules from MWG's Rule Library, but if you want to check only uploaded documents, you need to disable DLP for the response cycle (the default rules consist of two rule sets, one for requests and one for responses).
I hope this helps. If something isn't clear, I'll try to answer your questions.
I ran some tests. The attached rule works for a text file, a Word file, and an Excel file,
but not for a PDF file.
Here is an example of the properties of the PDF file that I am trying to analyze:
Do you have any idea how that could work?