I'm having difficulty with MWG version 22.214.171.124.0-35765 (or 10.2.2): I cannot find details in the access.log that would tell me things like:
1. Actual start time: when the scan started
2. Total time the scan took
3. When the response was sent back to the client after the scan
In our case, the ICAP server takes a long time to scan zip/jar archives and we cannot see the progress of the scan. Because the scan takes so long, the end client does not receive a response within the allowed 5-minute window and sees a timeout (504), as no response came back from the ICAP server.
Can anyone share the options available to fix this issue?
There are a couple of Timers and the Stopwatch Events which can be used to log the time. For example, Timer.TimeInTransaction contains the time for the complete transaction, while Timer.TimeInExternals contains the time required for calling AV and waiting for the reply.
With the Stopwatch Events you can start and stop a Stopwatch exactly when desired. For example, you can start the Stopwatch in the first Rule Set and stop it after GAM is done to see how long it took.
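The start/stop idea can be illustrated outside MWG. The following is a minimal Python sketch, not MWG rule syntax: the Stopwatch class, the watch names and fake_scan are hypothetical stand-ins used only to show how a "scan" measurement nests inside a "transaction" measurement.

```python
import time


class Stopwatch:
    """Hypothetical stand-in for named start/stop stopwatches (not MWG code)."""

    def __init__(self):
        self._started = {}
        self.elapsed_ms = {}

    def start(self, name):
        # Remember when this named measurement began.
        self._started[name] = time.monotonic()

    def stop(self, name):
        # Record the elapsed time in whole milliseconds and return it.
        began = self._started.pop(name)
        self.elapsed_ms[name] = int((time.monotonic() - began) * 1000)
        return self.elapsed_ms[name]


def fake_scan(seconds):
    # Placeholder for the real scanning work.
    time.sleep(seconds)


watch = Stopwatch()
watch.start("transaction")   # analogous to starting in the first rule set
watch.start("scan")
fake_scan(0.05)
scan_ms = watch.stop("scan")  # analogous to stopping after the scan is done
total_ms = watch.stop("transaction")
print(f"scan={scan_ms} ms, total={total_ms} ms")
```

The transaction value is always at least as large as the scan value, which is the same relationship to expect between a whole-transaction timer and a timer for the external scanning call.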
Generally it cannot be expected that MWG finishes filtering in 5 minutes. The ICAP client is required to keep the client busy by doing progress indication (data trickling, progress pages or something similar). Scanning a complex file can take up to 60 minutes.
Thank you for the kind reply.
I'm not able to see "Timer.TimeInTransaction". How do I enable it, and on which screen?
Basically, I'm trying to view the time taken by a session (during an upload) to get more information for an end client's upload, such as the time ICAP received the request, the scan time taken, and the time it sent the response back to the end client.
Secondly, I have enabled "data trickling" in the access log (Rule Sets > Log Handler). How does this work and how does it help?
In our case:
- After the end client uploads a zip or jar file of around 850 MB, ICAP takes more than 5 minutes to scan it, and by the time it responds the Akamai gateway connection has been dropped, as it is configured with a 5-minute window for a response back from ICAP to the end client.
For the above process, we are not able to get the timing details from the end client's upload to ICAP (start) until the response is sent back to the end client (end).
Looking forward to your kind help.
The timers are properties. You need to add them to the policy yourself, for example by adding a rule that sets a User-Defined property to the value of the timer:
Set User-Defined.My.Timer = Timer.TimeInTransaction
Now the User-Defined Property "My.Timer" contains the time and it can be logged.
The progress indication (Data Trickling or Progress Pages) must be performed by the ICAP client! If MWG is used as the ICAP server, adding the Data Trickling or Progress Page rule sets will not have a useful effect.
In our case MWG is used as an ICAP server (end clients invoke the web endpoints directly) on port 1344. So I understand that data trickling will not be helpful here. Is that right?
About setting up the custom user-defined property, I'm trying to add the rule as in the attached screenshot.
Please can you advise on how to define this rule with Set User-Defined.My.Timer = Timer.TimeInTransaction?
You are right: if MWG is the ICAP server, data trickling or progress pages are not required. These mechanisms have to be provided by the ICAP client (e.g. the machine the browser is talking to).
So if your browser is talking to a service and this service is connecting to MWG as an ICAP server, it is the service that needs to make sure the browser is kept "busy" while we wait for the scanning to complete. On Blue Coat this is called "Patience Page", on MWG it is "Progress Pages", and most products support "Data Trickling" (i.e. giving some bytes to the client while scanning is still going on).
To see some numbers, it might be easiest to add Number.ToString(Timer.TimeInTransaction) to the access.log.
By doing so you can see the result of the timer in the access.log once scanning is completed.
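To make "giving some bytes to the client while scanning is still going on" concrete, here is a minimal Python sketch of the general idea. It is not how any particular product implements it: the trickle function, its parameters and the threading.Event used as a "scan finished" signal are all assumptions made for illustration.

```python
import threading


def trickle(payload, scan_done, chunk_size=1, interval=0.01):
    """Yield small chunks while the scan runs, then the held-back remainder."""
    sent = 0
    # Hold back at least the last byte so the transfer cannot complete
    # before the scan result is known.
    limit = max(len(payload) - 1, 0)
    while not scan_done.is_set() and sent < limit:
        end = min(sent + chunk_size, limit)
        yield payload[sent:end]      # a few bytes keep the connection active
        sent = end
        scan_done.wait(interval)     # pause, but wake early if the scan ends
    scan_done.wait()                 # nothing left to trickle: wait for the scan
    yield payload[sent:]             # release the rest once scanning is done


payload = b"x" * 100
scan_done = threading.Event()
# Simulate a scan that finishes after 50 ms.
threading.Timer(0.05, scan_done.set).start()
chunks = list(trickle(payload, scan_done))
print(len(chunks), "chunks,", sum(len(c) for c in chunks), "bytes")
```

A real implementation would also have to decide what to do when the scan reports a problem after part of the data has already been sent; the sketch leaves that out.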
Thank you for the screenshot. I have now made the same change to the access log (handler); screenshot attached. Please can you check if this looks OK?
I did save it. Do I have to bounce the stack to see the changes?
Secondly, in my case the chain is (MWG as ICAP server) <==> Akamai <==> end client. Akamai closes the connection after 5 minutes, before the ICAP server finishes the scan, and hence the client sees the 504 (timeout).
1. How can I achieve something similar to "data trickling" while the ICAP server is still scanning, so as to keep the Akamai connection alive?
2. Is there any way to check the name of the file being uploaded from the logs?
Please can you let me know the possible options.
If you save the log handler, the next transaction that comes in will use the new policy and the modified log handler. So you should see the "Time in Transaction" for new requests in the access.log.
So it seems that "Akamai" is the ICAP client. The browser connects to Akamai and Akamai hands over the request to MWG for scanning?
1.) Akamai needs to take care of this. MWG cannot cause the ICAP client to keep the client busy.
2.) This question is very generic, as it depends on the upload. If you just make a "PUT http://www.mwginternal.com/upload/andre.zip" to upload "andre.zip" the file name will be in Body.Filename or can simply be obtained from the URL by using URL.Filename.
But usually when uploads are done with a form there is a POST request with a multipart/form-data object. In this case there must be a rule that reads the form field with the correct name and reads the Content-Disposition headers from it.
If there is such a rule that successfully reads the file name, it can be logged. The problem is that the "name" of this field can be different for every upload form. For example on dropbox.com it has a different name compared to box.com, so it can work for dropbox but not for box (unless you manually create a rule for each web site...).
This only works in a reverse proxy environment where the number of web sites MWG provides access to is limited and the web sites are known.
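The two upload styles described above can be illustrated with a short Python sketch. It only shows where the file name lives in each case and is not MWG rule syntax; the function names, the boundary string and the form field name "file" are hypothetical, and the multipart parsing relies on Python's standard email package.

```python
import posixpath
from email import message_from_bytes
from urllib.parse import unquote, urlsplit


def filename_from_url(url):
    """PUT-style upload: the file name is the last segment of the URL path."""
    return unquote(posixpath.basename(urlsplit(url).path))


def filenames_from_multipart(content_type, body):
    """Form upload: map each form field name to the file name it carries."""
    # Prepend the request's Content-Type so the parser knows the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    names = {}
    for part in message_from_bytes(raw).walk():
        if part.is_multipart():
            continue
        field = part.get_param("name", header="content-disposition")
        filename = part.get_filename()  # filename= from Content-Disposition
        if filename:
            names[field] = filename
    return names


url_name = filename_from_url("http://www.mwginternal.com/upload/andre.zip")

form_body = (
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="file"; filename="andre.zip"\r\n'
    b"Content-Type: application/zip\r\n"
    b"\r\n"
    b"PK\x03\x04\r\n"
    b"--XYZ--\r\n"
)
form_names = filenames_from_multipart("multipart/form-data; boundary=XYZ", form_body)
print(url_name, form_names)
```

The form field name is the key of the returned dictionary, which is exactly the part that differs from one upload form to the next.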
About the confirmation that "Time in Transaction" appears for new requests in the access.log:
I had a look in the access.log and could not see any time info at the end other than 0 (I added the timer field at the end). For example:
[26/Sep/2021:12:56:05 +0000] "" 192.XX.XX.YY 200 "GET http://applABC.testfusion.example.com/FA--UCM--eoji-testk5kex1qa-auxvm2.pod26.prd01nrt01pod03.oracle...HOT_JLSgKCTmLTP HTTP/1.1" "Business, Software/Hardware" "Unverified" "" 0 0 "" "" "0" ""0
I'm not sure whether I'm missing something above that is needed to view timers for an end-to-end transaction.
Secondly, a few other questions came up when discussing this with the Akamai folks, as they cannot relax the 5-minute window just because the ICAP scan takes longer. [For example: some uploads are a single file and at times it is a zip or jar file, which we don't know in advance.]
But what we do know is what the filename starts with, that it is a jar file, and that it contains a property file inside the jar:
So I wondered whether there is a way to identify a particular file (ABC_***.jar) when it gets uploaded. Inside that jar file there is always a property file (ABCChecksum.properties). If I can identify a file whose name starts with ABC, that is a jar file, and that contains the property file ABCChecksum.properties, then I need a rule that bypasses scanning for that condition only.
How can I achieve such a rule, if possible, so that no change is needed at the Akamai end and only that file is allowed through without a full scan?
Please can you let me know about the above (viewing the timer information and adding a rule specific to a single file that contains the property file mentioned).
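To check the appended field across many entries, a small script can pull the timestamp and the last value out of each line. This is a minimal Python sketch that assumes the timer was appended as the final, unquoted field, as in the line quoted above; the names ENTRY and parse_entry are hypothetical, and other log handler layouts would need a different pattern.

```python
import re
from datetime import datetime

# The sample entry quoted above, copied as-is (including the shortened URL).
LINE = (
    '[26/Sep/2021:12:56:05 +0000] "" 192.XX.XX.YY 200 '
    '"GET http://applABC.testfusion.example.com/FA--UCM--eoji-testk5kex1qa-'
    'auxvm2.pod26.prd01nrt01pod03.oracle...HOT_JLSgKCTmLTP HTTP/1.1" '
    '"Business, Software/Hardware" "Unverified" "" 0 0 "" "" "0" ""0'
)

# Timestamp in the leading brackets, and the run of digits that ends the line.
ENTRY = re.compile(r'^\[(?P<ts>[^\]]+)\].*?(?P<last>\d+)\s*$')


def parse_entry(line):
    """Return (timestamp, trailing numeric field), or None if the line differs."""
    match = ENTRY.match(line)
    if match is None:
        return None
    when = datetime.strptime(match["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return when, int(match["last"])


when, value = parse_entry(LINE)
print(when.isoformat(), value)
```

Filtering the parsed entries for non-zero values is a quick way to see whether any transaction logged a timer at all.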
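The condition described here can be stated precisely in a few lines of Python. This is only a sketch of the check itself, not an MWG rule and not a statement that MWG can evaluate it this way: the function names are hypothetical, and it assumes the whole upload is available as bytes. It reads the archive's directory listing via the standard zipfile module rather than extracting the members.

```python
import io
import zipfile


def matches_known_jar(filename, data, prefix="ABC",
                      marker="ABCChecksum.properties"):
    """True if the name starts with prefix, ends in .jar, and lists marker."""
    if not (filename.startswith(prefix) and filename.lower().endswith(".jar")):
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as jar:
            # Compare only the base name so the file may sit in a subfolder.
            return any(name.rsplit("/", 1)[-1] == marker
                       for name in jar.namelist())
    except zipfile.BadZipFile:
        return False


def build_jar(names):
    """Build a small in-memory archive with empty members, for the demo only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        for name in names:
            jar.writestr(name, "")
    return buffer.getvalue()


good = build_jar(["META-INF/MANIFEST.MF", "ABCChecksum.properties"])
other = build_jar(["META-INF/MANIFEST.MF"])
print(matches_known_jar("ABC_patch.jar", good))    # name and marker both match
print(matches_known_jar("ABC_patch.jar", other))   # marker file is missing
print(matches_known_jar("XYZ_patch.jar", good))    # name prefix does not match
```

Note that the check trusts the upload's own file name and directory listing, so it says nothing about whether the other files in the archive are safe.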
A rule engine trace could help with Timer.TimeInTransaction. To my knowledge it should not be 0, and the rule looks OK.
I cannot answer the second query without an example. As mentioned, if the file name is part of the URL it is easy to look into the file name, if the upload is done by a form and the file is just a field in the form it is not that easy (I outlined the details in the previous post).
If you have rule engine traces and maybe a connection trace, I can take a look. It is not really possible to bypass based on a file within the archive (ABCChecksum.properties). We need to extract the file to see the content, and once we have started extraction all the extracted files will be scanned. There is no way to say "do not scan anything" after scanning has started. It would be possible to bypass all remaining objects after finding the "ABCChecksum.properties" file, but no one can tell if this is enough to stay within the 5 minutes.
Generally the 5 minutes are a problem. Even if you whitelist the .jar files, if someone sends a Zip or a PDF which takes > 5 minutes you will run into the same problem again.
I am happy to take a look, but I think I need concrete details and data to work with. The query is too specific for a general answer.