Capturing Web Spider/Crawler

Hi Friends

We have been affected by the problem of web spiders/crawlers, as they continuously collect data from our website and reuse the content on their portals. Since my setup does not have a WAF deployed, I thought of building a correlation rule that might help detect this software. The website I am trying to protect is used by a very large number of users over the course of a day.

The thought/rule:

Since the characteristic behaviour of a web spider/crawler is to go through a website at very high speed over a short span of time, we can use this to create an alert.

For example:

group by: Source IP, Destination IP

command = GET, PUT

duration = 10 seconds, distinct values: 250

From this alert we can create a watchlist, which can then be used to monitor activity from these IP addresses,

such as:

data bytes sent

number of requests.
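To make the idea concrete, here is a minimal sketch of the detection logic in Python, assuming we can feed it web-server log events one at a time. The class name, field names, and demo IP are all illustrative, not real SIEM rule syntax: per source IP it counts distinct URLs inside a sliding 10-second window, and an IP that crosses the threshold goes onto a watchlist where its bytes sent and request count are then tracked.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 10
DISTINCT_URL_THRESHOLD = 250  # the "distinct values: 250" from the rule


class CrawlerDetector:
    """Hypothetical sketch of the correlation rule described above."""

    def __init__(self, window=WINDOW_SECONDS, threshold=DISTINCT_URL_THRESHOLD):
        self.window = window
        self.threshold = threshold
        self.events = defaultdict(deque)  # src_ip -> deque of (timestamp, url)
        self.watchlist = {}               # src_ip -> {"bytes": int, "requests": int}

    def record(self, ts, src_ip, method, url, resp_bytes=0):
        """Feed one log event; returns True if src_ip is on the watchlist."""
        if method not in ("GET", "PUT"):  # the commands from the rule
            return src_ip in self.watchlist
        q = self.events[src_ip]
        q.append((ts, url))
        # Slide the window: drop events older than `window` seconds.
        while q and ts - q[0][0] > self.window:
            q.popleft()
        distinct_urls = len({u for _, u in q})
        if distinct_urls >= self.threshold and src_ip not in self.watchlist:
            self.watchlist[src_ip] = {"bytes": 0, "requests": 0}
        # Once watchlisted, keep accumulating the two monitored metrics.
        if src_ip in self.watchlist:
            self.watchlist[src_ip]["bytes"] += resp_bytes
            self.watchlist[src_ip]["requests"] += 1
        return src_ip in self.watchlist


# Demo with a deliberately low threshold so the alert fires quickly.
det = CrawlerDetector(threshold=5)
for i in range(6):
    det.record(ts=i * 0.5, src_ip="203.0.113.9", method="GET",
               url=f"/page/{i}", resp_bytes=1000)
print(det.watchlist)
```

In production the threshold and window would need careful tuning, for exactly the NAT false-positive reason raised below: many legitimate users behind one ISP address could collectively look like a crawler.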

Though I am a little circumspect about whether the duration filter will work for this short a window. There is also a chance of false positives, since ISPs use NAT to provide internet access to users, and many legitimate users behind one IP may end up hitting the threshold we define.

Do tell me what you think of this and whether anything can be added to it.
