I want to block pages that contain words such as:
World of Warcraft, Lineage (the full list has more than 10 words)
Body.PositionOfPattern() does not meet my needs.
Another question: how do I block content in a code page other than UTF-8 (for example, windows-1251)? Do I have to convert the words to hex?
PS. MWG7 (22.214.171.124)
I only found a rule like this:
I need to comment on this rule because it is not a good example for the usage of the property Body.PositionOfPattern.
This rule has two problems:
1) If you are looking for a byte pattern given in hex digits, the pattern must be enclosed with double quotes. Otherwise the property will look for the hex notation of the pattern.
2) The property must start searching at position/offset 0 if it should start at the first byte of the body.
Look at the correct example:
As Alex already said, Body.PositionOfPattern is not a good choice here anyway, because it is meant for searching a byte pattern, not text.
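To illustrate why a byte pattern is a poor fit for text, here is a minimal Python sketch (not Web Gateway code). It shows that the same word produces different hex patterns in windows-1251 and UTF-8, so a pattern built for one encoding misses the other. The helper names `hex_pattern` and `find_hex_pattern` and the sample page are hypothetical, and the search only mimics a byte search from offset 0; it does not claim to reproduce the exact semantics of Body.PositionOfPattern.

```python
from typing import Optional


def hex_pattern(text: str, encoding: str) -> str:
    """Return the hex notation of `text` encoded in `encoding`."""
    return text.encode(encoding).hex()


def find_hex_pattern(body: bytes, pattern_hex: str,
                     offset: int = 0, count: Optional[int] = None) -> int:
    """Search `body` for the bytes given as hex digits.

    The search starts at `offset` (0 is the first byte) and looks at
    `count` bytes at most. Returns the match position, or -1 if the
    pattern is not found. Hypothetical helper, for illustration only.
    """
    needle = bytes.fromhex(pattern_hex)
    end = len(body) if count is None else offset + count
    return body.find(needle, offset, end)


word = "игр"
cp1251_hex = hex_pattern(word, "cp1251")  # one byte per Cyrillic letter
utf8_hex = hex_pattern(word, "utf-8")     # two bytes per Cyrillic letter

# A sample page body served in windows-1251 (made up for this sketch).
body = "<html>онлайн игры</html>".encode("cp1251")

print(cp1251_hex, find_hex_pattern(body, cp1251_hex, 0, 2000))
print(utf8_hex, find_hex_pattern(body, utf8_hex, 0, 2000))
```

The windows-1251 pattern is found, while the UTF-8 pattern for the same word is not, so a byte-pattern rule would need one pattern per word per encoding.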
Felix
Message was edited by: fschulte on 4/17/12 9:20:22 AM CDT
You are right, the page was not detected by media type. After I changed the rule set filter, everything works as expected:
But this page () is caught only by this rule:
Body.PositionOfPattern ("edebe0e9ed20e8e3f0", 1, 2000) greater than 0
Message was edited by: apellepa on 4/17/12 10:20:26 PM EEST
You need to check strings against the Body.Text property, because it extracts only the text and skips unnecessary markup. Another advantage is that it takes the encoding into account.
So your rule should look something like this:
IF (MediaType.EnsuredTypes contains 'text/html') AND (Body.Text matches in list "your list of words") THEN Block
It is better to save your list of words as a Wildcard list (you can put in only the words that you are interested in).
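As a rough illustration of the idea behind the rule above, the following Python sketch decodes the body with its declared charset, drops the markup, and checks the remaining text against a list of wildcard patterns. This is not how Web Gateway implements Body.Text or Wildcard lists; the names `body_text` and `matches_in_list`, the glob-style patterns, and the sample page are assumptions made for the example.

```python
import fnmatch
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring tags and script/style content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def body_text(raw: bytes, charset: str) -> str:
    """Decode the body with its charset and return the text without markup."""
    parser = _TextExtractor()
    parser.feed(raw.decode(charset, errors="replace"))
    parser.close()
    # Collapse runs of whitespace so patterns with single spaces match.
    return " ".join(" ".join(parser.parts).split())


def matches_in_list(text: str, patterns) -> bool:
    """Case-insensitive check of the text against glob-style patterns."""
    lowered = text.lower()
    return any(fnmatch.fnmatchcase(lowered, p.lower()) for p in patterns)


# Hypothetical word list and a sample page served in windows-1251.
blocked = ["*world of warcraft*", "*lineage*"]
page = "<html><body><p>Играть в World of  Warcraft</p></body></html>".encode("cp1251")

text = body_text(page, "cp1251")
print(text, matches_in_list(text, blocked))
```

Because the text is decoded before matching, the same word list works for windows-1251 and UTF-8 pages alike, which is the advantage over hex byte patterns.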
It looks like we have some issues with media type detection: the page contains disallowed characters at its start, so it is not detected as text/html...