0 Replies Latest reply on Jul 11, 2015 7:04 AM by ravismallah

    Capturing Web Spider/Crawler

    ravismallah

      Hi Friends

       

      We have been affected by the problem of web spiders/crawlers: they continuously collect data from our website and reuse the content on their own portals. Since my setup does not have a WAF deployed, I thought of building a correlation rule that might help detect this software. The website I am trying to protect is used by a very large number of users over the day.

       

      The thought/rule:

       

      Since the characteristic of a web spider/crawler is to go through a website at a very rapid speed in a short span of time, we can use this to create an alert.

       

      For example:

       

      group by: Source IP, Destination IP

      command = GET, PUT

      duration = 10 seconds, distinct values: 250
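
Outside the SIEM itself, the same windowed rule can be sketched in plain Python against parsed access-log events. This is only a sketch of the logic, assuming per-request events carrying a timestamp, source/destination IP, HTTP method, and URL (the class and field names here are my own, not any real ESM API):

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 10   # duration filter from the rule above
THRESHOLD = 250       # distinct URLs allowed inside the window

class CrawlerDetector:
    """Flags a (source IP, destination IP) pair that fetches more than
    THRESHOLD distinct URLs via GET/PUT within WINDOW_SECONDS."""

    def __init__(self):
        # (src, dst) -> deque of (timestamp, url) events inside the window
        self.events = defaultdict(deque)

    def observe(self, ts, src_ip, dst_ip, method, url):
        """Feed one request; returns True when the pair crosses the threshold."""
        if method not in ("GET", "PUT"):
            return False
        window = self.events[(src_ip, dst_ip)]
        window.append((ts, url))
        # slide the window: drop events older than WINDOW_SECONDS
        while window and ts - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        distinct_urls = {u for _, u in window}
        return len(distinct_urls) > THRESHOLD
```

Any pair that trips observe() would then go onto the watch-list. One caveat with counting distinct values this way: state grows with the number of active IP pairs, so a real deployment would also need to expire idle keys.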

       

      From this alert we can create a watch-list, which can then be used to monitor activity by these IP addresses,

       

      such as:

      Data bytes sent

      Number of requests.
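
That monitoring step could be sketched like this, again in plain Python with hypothetical names rather than any real watch-list API, tallying bytes sent and request counts only for addresses already on the watch-list:

```python
from collections import defaultdict

class WatchlistMonitor:
    """Accumulates data bytes sent and request counts per watch-listed IP."""

    def __init__(self, watchlist):
        self.watchlist = set(watchlist)
        self.bytes_sent = defaultdict(int)     # ip -> total response bytes
        self.request_count = defaultdict(int)  # ip -> number of requests

    def observe(self, src_ip, response_bytes):
        """Feed one request parsed from the web-server log."""
        if src_ip not in self.watchlist:
            return  # ignore traffic from addresses we are not watching
        self.bytes_sent[src_ip] += response_bytes
        self.request_count[src_ip] += 1

    def report(self):
        """Per-IP (requests, bytes) summary for the watched addresses."""
        return {ip: (self.request_count[ip], self.bytes_sent[ip])
                for ip in self.watchlist}
```

The report could feed a second-stage alert, e.g. when a watched IP exceeds some daily byte budget.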

       

      Though I am a little circumspect about whether the duration filter will work for this small a duration or not. There is also a chance of false positives, since ISPs use NAT to provide internet access to their users, and many users behind a single IP may end up hitting the threshold we define.

       

      Do tell me what you think of this, and whether anything further can be added to it.