apellepa
Level 8

How to block pages by its content

I want to block pages that contain words such as:

World of Warcraft, Lineage (the full list has more than 10 words)

Body.PositionOfPattern() does not meet my needs.

Another question: how do I block content in a code page other than UTF-8 (for example, windows-1251)? Do I have to convert the words to hex?

P.S. I am using MWG7 (7.1.6.1).

0 Kudos
8 Replies
apellepa
Level 8

Re: How to block pages by its content

I only found a rule like this:

OnlineGame.PNG

0 Kudos
fschulte
Level 10

Re: How to block pages by its content

apellepa wrote:

I only found a rule like this:

OnlineGame.PNG

I need to comment on this rule because it is not a good example for the usage of the property Body.PositionOfPattern.

This rule has two problems:

1) If you are looking for a byte pattern given in hex digits, the pattern must be enclosed in double quotes. Otherwise the property will look for the hex notation of the pattern.

2) The property must start searching at position/offset 0 if it should start at the first byte of the body.

Look at the correct example:

bodypositionofpattern1.png

As Alex already said, Body.PositionOfPattern is not a good choice here anyway, because it is meant for searching a byte pattern, not text.
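To illustrate point 2, here is a minimal Python sketch of why the start offset matters. The function position_of_pattern is a hypothetical stand-in written for this example, not the MWG implementation, and its return convention (-1 when nothing is found) is simply borrowed from Python's bytes.find. It assumes the second and third arguments are a start offset and a byte count, as in the rules shown in this thread.

```python
def position_of_pattern(body: bytes, hex_pattern: str, offset: int, count: int) -> int:
    """Hypothetical stand-in for a byte-pattern search.

    Looks for the bytes described by hex_pattern inside
    body[offset:offset + count] and returns the index of the first
    match, or -1 if there is none (Python's convention, assumed here).
    """
    needle = bytes.fromhex(hex_pattern)
    return body.find(needle, offset, offset + count)


# A body that begins directly with the keyword, encoded as windows-1251.
body = "онлайн игры".encode("cp1251")
pattern = "онлайн".encode("cp1251").hex()

hit_from_0 = position_of_pattern(body, pattern, 0, 2000)  # search starts at the first byte
hit_from_1 = position_of_pattern(body, pattern, 1, 2000)  # first byte is skipped

print(hit_from_0, hit_from_1)  # prints: 0 -1
```

Starting at offset 1 skips the first byte, so a pattern that begins at the very start of the body is never found.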

Ciao

Felix

Message was edited by: fschulte on 4/17/12 9:20:22 AM CDT
0 Kudos
apellepa
Level 8

Re: How to block pages by its content

Thanks alexott!

You are right, the page was not detected by its media type. After I changed the rule set filter, everything works as expected:

games.PNG

But this page () is caught only by this rule:

     Body.PositionOfPattern ("edebe0e9ed20e8e3f0", 1, 2000) greater than 0
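For reference, a hex pattern like the one in this rule can be produced and checked with a few lines of Python. This is only a helper sketch that runs outside MWG; the variable names are made up for the example. It decodes the pattern from the rule as windows-1251 (cp1251 in Python) and shows that the same text encoded as UTF-8 yields different bytes, which is why a byte pattern only matches pages in one code page.

```python
# The hex pattern copied from the rule above.
rule_pattern = "edebe0e9ed20e8e3f0"

# Decode the bytes as windows-1251 to see which text they stand for.
decoded = bytes.fromhex(rule_pattern).decode("cp1251")
print(decoded)  # prints: нлайн игр

# Going the other way: build the hex pattern from a keyword.
keyword = "нлайн игр"
cp1251_hex = keyword.encode("cp1251").hex()
utf8_hex = keyword.encode("utf-8").hex()

print(cp1251_hex == rule_pattern)  # prints: True
print(utf8_hex == rule_pattern)    # prints: False
```

A page served as UTF-8 would therefore need a second pattern, one per code page and keyword.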

Message was edited by: apellepa on 4/17/12 10:20:26 PM EEST
0 Kudos
stephaniec
Level 7

Re: How to block pages by its content

Hi

You need to create a string list containing the words, and then a rule that uses "matches in list" on Body.ToString (Number, Number).

0 Kudos
alexott
Level 11

Re: How to block pages by its content

Hi.

It's better not to use Body.ToString, as it has no information about encoding, etc.
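A short Python sketch of the general problem, not of MWG internals: the same keyword has different bytes in different encodings, so a comparison that ignores the page's encoding can miss a word that is plainly visible on the page. The sample page and variable names are invented for this example.

```python
keyword = "онлайн игры"

# An invented sample page, served as windows-1251.
page_bytes = "<html><body>Лучшие онлайн игры</body></html>".encode("cp1251")

# Encoding-unaware comparison: the keyword is assumed to be UTF-8,
# so its bytes never appear in the windows-1251 page.
naive_hit = keyword.encode("utf-8") in page_bytes

# Encoding-aware comparison: decode the page with its real charset first.
decoded_hit = keyword in page_bytes.decode("cp1251")

print(naive_hit, decoded_hit)  # prints: False True
```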

0 Kudos
alexott
Level 11

Re: How to block pages by its content

You need to check strings against the Body.Text property, as it extracts only the text and skips unnecessary markup. Another advantage is that it takes the encoding into account.

So your rule should look something like:

IF (MediaType.EnsuredTypes contains 'text/html') AND (Body.Text matches in list "your list of words") THEN Block

It is better to save your list of words as a Wildcard list (though you can include only the words that you're interested in).
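The logic of this rule can be sketched in Python as follows. This is only an illustration of the two conditions, not MWG rule syntax: should_block and its parameters are hypothetical names, Python's fnmatch stands in for the Wildcard list matching, and the case-insensitive comparison is an assumption of the sketch.

```python
from fnmatch import fnmatchcase


def should_block(ensured_types, body_text, wildcard_list):
    """Return True when the body is HTML and its text matches any wildcard.

    ensured_types stands in for MediaType.EnsuredTypes and body_text for
    Body.Text (already decoded text). Lower-casing both sides is an
    assumption made for this sketch.
    """
    if "text/html" not in ensured_types:
        return False
    text = body_text.lower()
    return any(fnmatchcase(text, pattern.lower()) for pattern in wildcard_list)


words = ["*world of warcraft*", "*lineage*", "*онлайн игры*"]

print(should_block(["text/html"], "Play World of Warcraft now", words))  # True
print(should_block(["text/html"], "Лучшие Онлайн Игры", words))          # True
print(should_block(["image/png"], "Play World of Warcraft now", words))  # False
print(should_block(["text/html"], "Weather forecast", words))            # False
```

Note that the first condition gates the second: if the media type is not recognized as text/html, the word list is never consulted.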

apellepa
Level 8

Re: How to block pages by its content

I tried this, but it does not work as expected:

The site http://www.playtime.co.ua/ should be blocked by the keyword "*онлайн игры*" ("online games"), but it is not blocked.

My rule:

BlockByKeyword.PNG

My lists:

GamesLists.PNG

0 Kudos
alexott
Level 11

Re: How to block pages by its content

Hello

It looks like we have an issue with media type detection: the page contains disallowed characters at its start, so it is not detected as text/html...
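As a rough illustration of how stray bytes at the start of a page can defeat content-based type detection, here is a toy Python check. It is not how MWG detects media types; looks_like_html and the sample bodies are invented for this sketch, which simply assumes a detector that inspects the first bytes of the body.

```python
def looks_like_html(body: bytes) -> bool:
    """Toy detector: HTML is recognized only by what the body starts with.

    Leading whitespace is tolerated, any other leading byte is not.
    """
    head = body[:512].lstrip(b" \t\r\n").lower()
    return head.startswith((b"<!doctype html", b"<html"))


clean = b"<!DOCTYPE html><html><body>games</body></html>"
dirty = b"\x00\x01" + clean  # same page with two stray bytes in front

print(looks_like_html(clean), looks_like_html(dirty))  # prints: True False
```

With a detector of this kind, a rule that depends on the type being text/html would never get to examine the body text of the second page.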

0 Kudos