3 Replies Latest reply on Jun 1, 2010 2:52 AM by asabban

    Bing and Google cached pages

      I'm having some problems with Bing cached pages.  When I open a cached page returned from a Google search, Webwasher categorises the actual page correctly rather than treating the Google cache page as anonymising.  E.g. if I search for playboy.com and click a cached page link, it is blocked as pornography as per my policy, while cached pages of sites that my policy allows are allowed.  I think this is because the URL for the cached page still contains the original URL of the site whose cached link you're clicking.

       

      Bing, however, doesn't work the same way.  In Bing, if I search for playboy.com I can't enter the actual site, but I can open the cached page links.  Bing still tries to fill in some of the content, like images, from the actual (not cached) site, and these get blocked, but I don't want anyone to be able to access the cached page at all if the original site is blocked in the categories.

       

      Any ideas?

        • 1. Re: Bing and Google cached pages
          asabban

          Hello,

           

          I just gave this a quick try. When I search for "playboy.com" and copy the link to the cached page, the result is:

           

          http://......./cache.aspx?q=playboy.com&d=4594309435361004&mkt=de-DE&setlang=de-DE&w=eceddedb,55a312fc (I just removed the hostname to break the link.)

           

          If I look that up with Trusted Source the result is:

           

          Anonymizing Utilities, Nudity, Search Engines, Pornography

           

          This seems to be categorized correctly: it carries Anonymizing Utilities as well as the original playboy.com categories. It should therefore be blocked.
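
          As a rough illustration of why the lookup returns both sets of categories: the original site name travels in the `q` query parameter of the cached-page link. The Python sketch below only pulls that parameter out of the URL; it is not how Webwasher or Trusted Source work internally. The hostname `cache.example` is a placeholder (the real one was removed in the post above) and `cached_target` is a name made up for this example.

```python
from urllib.parse import urlsplit, parse_qs


def cached_target(url):
    """Return the value of the 'q' query parameter, or None if it is absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("q")
    return values[0] if values else None


# Placeholder hostname; the rest of the URL is taken from the post above.
example = (
    "http://cache.example/cache.aspx?q=playboy.com&d=4594309435361004"
    "&mkt=de-DE&setlang=de-DE&w=eceddedb,55a312fc"
)
print(cached_target(example))  # prints: playboy.com
```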

           

          Can you let me know if your link to the cached pages looks different? Can you give me an example?

           

          best,

          Andre

           

           

          Message edited by Andre Sabban on 31.05.10 03:12:04 CDT
          • 2. Re: Bing and Google cached pages

            Thanks Andre, I actually get the same results too, so that helps.  Now here is my bigger problem.  I want to allow access to just Google and Bing cached pages but not to all Anonymizing Utilities sites.  At first I thought about whitelisting the domains for these from URL categorisation, e.g. whitelisting cc.bingj.com, but then a cached page wouldn't get blocked even if it was also a pornographic site like playboy.com.

             

            How can I allow these cached pages without overriding the "sub" categorisation?

            • 3. Re: Bing and Google cached pages
              asabban

              Hello,

               

              Well, I don't think this is very easy to achieve in Webwasher 6. You can try "URL Filter -> Filter by Expression": put in the URLs

               

              webcache.googleusercontent.com

              cc.bingj.com

               

              and check "Allow matching URL". For me, that at least allows viewing a copy of the text of the website. Of course, all downloads of images or stylesheets that go to blocked URLs, or that contain a blocked URL in their name and do NOT point to the above URLs, will still be blocked.
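
              To make the intended outcome concrete, here is a minimal Python sketch of the decision being asked for: allow the two cache hosts, but still block when the request carries the name of a blocked site. This is only an illustration of the logic, not Webwasher configuration or its actual filter engine. `BLOCKED_SITES` is a hypothetical stand-in for a category lookup, and `decide` is a made-up name.

```python
from urllib.parse import urlsplit, parse_qs

# Cache hosts named in the reply above.
CACHE_HOSTS = {"webcache.googleusercontent.com", "cc.bingj.com"}

# Hypothetical stand-in for a category lookup; not real Webwasher data.
BLOCKED_SITES = {"playboy.com"}


def decide(url):
    """Return 'allow', 'block', or 'default' for a requested URL."""
    parts = urlsplit(url)
    if parts.hostname not in CACHE_HOSTS:
        return "default"  # leave the request to the normal category policy
    # Block if any query value mentions a site from the blocked list.
    for values in parse_qs(parts.query).values():
        for value in values:
            if any(site in value for site in BLOCKED_SITES):
                return "block"
    return "allow"
```

              With made-up request URLs, `decide("http://cc.bingj.com/cache.aspx?q=playboy.com&d=1")` gives "block", while the same link with `q=example.org` gives "allow". A plain substring match like this is crude and would need tightening before being relied on.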

               

              Furthermore, you may want to experiment with the "Exempt" action (see KB63919, which explains how to create an action that can "override" a block action under specific circumstances).

               


              Additionally, if you want Webwasher to be less restrictive with regard to URLs or keywords being part of the request, you may want to check what the following options do (URL Filter -> URL Filtering methods):

               

                    Skip searching the CGI parameters when categorizing URLs.
                    Skip searching for and categorizing embedded URLs.
                    Skip additional categorization of a search engine request, such as a query to Google, based on the keywords present in the search.

               

              Unfortunately, all of this advice is pretty generic. If you need additional assistance, it probably makes sense to have a service request created, upload a feedback with the configuration you are currently using, and name some examples of what you try, what the result is and what you would like to achieve. With these specific examples we/I may be able to help you in more detail.

               

              If you want to experiment on your own, please try the hints I mentioned above.

               

              Thanks,

              Andre