Following on from this thread:
I'd like to implement this for 3 significant domains on our site. I've written the following regex to match youtube, bbc and google domains
and I'd like to supplement them with squashed.bbc.co.uk, squashed.youtube.com and squashed.google.com. My log sources are Webwasher 6; however, I seem to be unable to get it to implement the replacement.
I've tried setting this under log sources > user defined columns > populate this column,
setting it to look for URL, site, and also using the log header "req_line".
None of those have had any effect as far as I can see, and the documentation is a bit vague on this too.
Can anyone provide any pointers on doing this? I'm guessing this will have a negative impact on my log processing; however, I expect it will help the DB. For example, I have over 96K unique domains for google that would be dropped down to just 1. With less data to report on and a better understanding of usage of that site, this looks like quite an attractive way to go.
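To make the intended substitution concrete, here is a rough Python sketch of the mapping described above. It is only an illustration of the idea, not Web Reporter's rule syntax, and the pattern is a hypothetical stand-in for my actual regex. It assumes only the exact parent domains youtube.com, bbc.co.uk and google.com matter (so country variants such as google.co.uk are not covered).

```python
import re

# Hypothetical pattern for illustration only: any number of subdomain labels
# followed by one of the three parent domains, anchored at both ends so that
# look-alike hosts such as "notgoogle.com" are left alone.
SQUASH_RE = re.compile(
    r"^(?:[a-z0-9-]+\.)*(youtube\.com|bbc\.co\.uk|google\.com)$",
    re.IGNORECASE,
)


def squash_host(host: str) -> str:
    """Return 'squashed.<domain>' for matching hosts, else the host unchanged."""
    match = SQUASH_RE.match(host.strip())
    if match:
        return "squashed." + match.group(1).lower()
    return host


# Made-up sample hosts, not taken from real logs.
hosts = [
    "www.google.com",
    "mail.google.com",
    "news.bbc.co.uk",
    "youtube.com",
    "example.org",
    "notgoogle.com",
]

squashed = [squash_host(h) for h in hosts]
unique_after = sorted(set(squashed))
print(unique_after)
```

The two google hosts collapse into a single squashed.google.com value, which is the effect I'm hoping to get for the 96K google entries.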
If it's not possible to do this in Web Reporter the way I want, can I do this with my own SQL query that would run after log processing jobs but before reporting jobs?
Tris
Message was edited by: trishoar - missing link on 16/09/11 11:47:21 CDT
Sorry for not responding sooner, as I've been out of the office until today.
1) You will not be able to reduce the number of records inserted, since you are applying the ruleset to a new column and have no ability to modify the column where the real URL/Site name is stored. So you are adding a detail item (which might help make reports better reflect your needs), but it won't actually reduce the DB records.
2) In regard to performance, it's really hard to say. Since the user defined columns are only for detail data, you could actually have slower reports if the data you require could have been pulled from summary data (which is about 10x smaller and better indexed). It really depends on what you need and how exactly you are doing it.
3) I would expect very little log parsing performance impact unless the regex is really bad, because log parsing generally bottlenecks on the DB side and the regex would be done by Web Reporter.
4) For the user defined column config, make sure you are not using "req_line" in the free input text box. Any of the recognized columns need to be selected from the dropdown or the rule won't work. If possible, attach screenshots of the config for your rule-set and the user defined columns tab on the log source config.