If you enable Cloud Lookups, all URLs that are not categorized are automatically received by McAfee. Unfortunately, there is no way to access the reported URLs or obtain any feedback unless you report them manually.
A different approach may be to write uncategorized URLs into a separate log file (logging only the URL.Host property) and use something like
cat log.txt | sort | uniq
to drop all duplicate entries. The result is a list that you can submit manually to TrustedSource.
An automated feedback mechanism, as was present in 6.x, is also planned and may be available in future versions.
How long does it take until the URL is categorized?
So, if a customer is coaching uncategorized websites, these websites will be categorized automatically? Is this right?
A customer is using Cloud Lookups, but it seems the uncategorized websites are not categorized automatically. Can this be?
Today I tested this command: sqlite3 /opt/mwg/lock/statistic/statistics.db "select * from childs where key like '%uncategorized%';"
This shows the entries from the dashboard under Charts and Tables -> URL Filter Statistics -> Sites not categorized by Hits.
All is fine, but when I remove the entries, the dashboard is not updated until the MWG services are restarted. Any idea? :-)
I am not really sure what exactly happens when cloud-looked-up URLs arrive at McAfee; they go into some process I don't know. I will try to find someone who can give me additional information about what happens to these URLs.
I assume the statistics.db is only read when MWG starts. During runtime MWG only writes to the file and keeps the content in memory. I will try to find out if there is a different command to re-read the database :-)
This would be perfect. In fact, that is exactly our goal:
- run a sqlite3 query to fetch the uncategorized URLs,
- write the result into a file,
- send the file to McAfee via a cron job,
- remove the entries from the database.
The cron job should run only every 2-3 days to prevent sending uncategorized URLs twice. :-)
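As a sketch only, the four steps could look roughly like this. The schema (table "childs", column "key") and the database path are taken from the sqlite3 query earlier in the thread and are assumptions, not a documented interface; the mail address is the placeholder from this thread.

```shell
#!/bin/sh
# Sketch of the four steps above. Table/column names, DB path and the
# mail address are assumptions taken from this thread, not supported APIs.

DB=/opt/mwg/lock/statistic/statistics.db
OUT=/tmp/URLs_to_Report.txt

# Steps 1+2: fetch the uncategorized entries and write them out, deduplicated.
export_uncategorized() {
    sqlite3 "$1" "select key from childs where key like '%uncategorized%';" \
        | sort -u > "$2"
}

# Step 4: remove the exported rows (unsupported write, at your own risk).
purge_uncategorized() {
    sqlite3 "$1" "delete from childs where key like '%uncategorized%';"
}

# Step 3 sits between the two calls, e.g. in a cron job run every 2-3 days:
#   export_uncategorized "$DB" "$OUT"
#   mail -s "Uncategorized URLs" firstname.lastname@example.org < "$OUT"
#   purge_uncategorized "$DB"
```

Note the caveat from below: the statistics database is not meant to be written externally, so the purge step in particular is at your own risk.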
I understand. Basically I have to note that the SQLite databases are not intended to be modified "externally". So while this may work fine, I cannot guarantee that there will be no side effects, and I would rather not run this on production environments; do it at your own risk :-) There may be dependencies that are not obvious to us, but which may cause MWG to behave strangely or crash if we write data that is not supposed to be written.
However, I will try to find out whether there is (technically) a way to reload the statistics without a restart; but either way, no warranties here.
For just getting a list of uncategorized URLs I would suggest writing a separate log file, as mentioned earlier. You could rotate that log using the internal log file manager and, once it is rotated, run a cron job that processes the log and removes all duplicates (should be a bash one-liner). Afterwards you can send it by e-mail or transfer it somewhere else.
I am not aware of a good way to batch-submit URLs to McAfee Labs. The portal and the e-mail address are, as far as I know, designed for manual submissions. I will try to find out a proper way to submit lists of uncategorized URLs.
I talked to my customer today. Cloud Lookups are active. We tested with an uncategorized website, and it has not been categorized for a few weeks now.
We tested with the URL www.pichler.at.
I talked to engineering and there is no way to reload the dashboard data after modifying the SQLite database that backs the uncategorized URL table. The file is only read when MWG starts; during runtime the statistics live in memory. The only option is to wipe the complete dashboard data, which can be done from the UI.
Uncategorized URLs that come in via Cloud Lookups actually go into a very big bucket of URLs and are categorized from there. Because so many URLs are reported, categorization can take an unpredictable amount of time (in other words, do not rely on a URL getting categorized automatically in time). This will improve in the future with additional resources for automatic categorization, but at the moment it works like this. To obtain/report uncategorized URLs I would recommend adding an event to the "Allow Uncategorized URLs" rule that writes URL.Host into a separate file. Once a day/week you could reduce the file by running
cat uncategorized_urls.log | sort | uniq > URLs_to_Report.txt
This will create a new text file which contains each uncategorized URL only once, to avoid duplicate reporting. You may also want to add some greps to filter out internal IP addresses and hostnames, such as
cat uncategorized_urls.log | grep -v 192.168.0 | sort | uniq > URLs_to_Report.txt
What is the idea behind reporting the URLs manually? To put more emphasis on those specific URLs?
Before putting work into this, we need to find out whether manually reported URLs have a higher priority.
Some more notes:
The web interface only accepts 100 (?) URLs per user per day.
Long lists of URLs could also be reported to firstname.lastname@example.org.
My customer has many uncategorized websites. Today, uncategorized websites are "coached". Submitting uncategorized websites to McAfee is not a technical problem.
BUT with up to 25 MWG appliances and 10,000 to 50,000 users at up to 9,000 req/sec, there are a lot of uncategorized websites. There are also different IT departments and different admins for the proxy systems.
Sending the websites to McAfee manually is too much work, so it should be possible to submit uncategorized websites to McAfee automatically.
- Websites should also not be reported twice.
Today, Cloud Lookups for uncategorized websites should be noticed by McAfee and the sites added to the URL filter database. We checked this with our customer, but after about three weeks the website is still not categorized.