I've got a handful of questions about the composite opener that I hope someone can help enlighten me on! Feel free to cherry-pick and answer any of these if you know them!
Here's why I ask:
An early McAfee consultant set up one of my client's rulesets. There are some top-level rule sets between Common Rules and the Gateway Anti-Malware and ReqMod rulesets. I've observed that, between the composite opener and the anti-malware rule, some large downloads can really take forever, so for common vendors they'd like to bypass these time-consuming rules when downloading content they largely trust by way of a vendor relationship.
In those cases, I'd like to bypass the opener as well as AV, but still run the rules (such as ICAP to the DLP appliances) that sit between Common Rules and Gateway Anti-Malware. Unfortunately, the Common Rules "Response Whitelist" that's built into that ruleset template does a Stop Cycle rather than a Stop Ruleset. This causes the client's ICAP DLP rule to never get evaluated, as it's currently positioned after Common Rules in the ruleset.
I'd feel more confident rolling my own here if I understood the composite opener's function and purpose better. It sure does seem to dim the lights on certain installers, with very long download and scan times.
Thanks so much for any insight!
1.) It calls the composite opener event. This event causes MWG to start extracting the current body, if possible. If you are going through the rule engine in the response cycle and have a Zip file in the body, calling the composite opener causes MWG to start extracting the Zip file, sending all archive members as individual "embedded cycles" through the rule engine.
2.) This will cause the composite opener not to be called if the archive has more than 5 levels. If you have an archive within an archive within an archive, each of the embedded archives goes through the rule engine (embedded cycle) and the composite opener rule is triggered again, causing the next level to be extracted. With this criterion, the composite opener will not be called and the current object will not be extracted further.
In my experience, long scan times are usually caused not by a high number of levels but by the complexity of an archive. Setting this to 5 may help in some cases; in others it won't.
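Shown schematically, the level criterion sits directly on the rule that fires the opener event (rule name is illustrative; Body.NestedArchiveLevel holds the current nesting depth):

```
Name:     Enable Composite Opener
Criteria: Body.NestedArchiveLevel < 5
Action:   Continue
Event:    Enable Composite Opener
```

Once Body.NestedArchiveLevel reaches 5, the criterion no longer matches, the event is not fired, and extraction stops at that depth.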
3.) Enabling the composite opener influences the number of rule engine runs. Each extracted object runs through the rule engine and through all rules which are enabled for the embedded cycle. For most rules this makes no difference. Rules such as Progress Pages or Call ReqMod are usually enabled for the request/response cycles only, not for the embedded cycle. For things like Media Type filter rules, you will have embedded cycle runs that check the extracted objects against the rules you have set up, e.g. block all executables or similar.
4.) A file needs to be downloaded completely before MWG can start extracting it. The extracting belongs to the scanning portion.
5.) The default rules should be fine. You should run the latest MWG version to avoid long scan times, and report files that take extremely long to Support so they can find out whether this is by design or whether there is room for improvement.
6.) There are some examples in the community of how to bypass by maximum file size. You can use these examples as criteria on the Enable Composite Opener rules and/or on AV. If there is a specific file that takes long, I would try bypassing the composite opener first and leave AV in place, which at least gives you a chance to detect something in the raw data.
To skip the composite opener and AV without interfering with the other rules, add whitelist rules (per URL or Body.Size) into the rule sets you would like to skip:
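For example, shown schematically (the list name and size threshold are placeholders; the rule goes at the top of each rule set you want to be able to skip):

```
Name:     Skip this rule set for trusted or huge downloads
Criteria: URL.Host matches in list "Trusted Download Hosts"
          OR Body.Size greater than 104857600   (100 MB - pick your own threshold)
Action:   Stop Rule Set
```

Because the action is Stop Rule Set rather than Stop Cycle, only the current rule set (e.g. the one enabling the opener, or the AV rule set) is skipped, and the rule sets that follow still run.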
MWG 7.x includes a number of "Openers" - special components that handle different file types, i.e. archives and documents (really, there is no difference between them from MWG's point of view). They extract embedded objects and text from them.
Without any additional handling, no objects or text are extracted. The "Enable Composite Openers" action is responsible for enabling data extraction for the current stage. You can control when data extraction should be performed - only in the request cycle, only up to the Nth level of an archive (via the Body.NestedArchiveLevel property), only for some file types, etc. This differs from MWG6, where you could only enable extraction for all archives or documents. So if you know when you want to skip the Composite Openers, you need to modify the corresponding rule so that the Composite Openers are enabled only when you won't do ICAP scanning. If you don't need to process embedded objects at all, you can remove this rule completely (although that may not be the best idea)...
When composite openers are enabled, then after processing a file MWG starts to extract embedded objects and submits them for filtering in the "embedded objects" cycle (this is the main reason for increased scanning times, as all embedded objects need to be processed as well). Composite openers must also be enabled if you're using properties like Body.Text, Body.IsCorrupted, etc., as they call the underlying file type handler that performs the actual analysis.
I need to say that AV has its own set of file type handlers, and this sometimes leads to double unpacking of data - first by MWG, then by AV. But you also need to take into account that the lists of supported file types in MWG and in AV aren't the same - MWG supports more file types, so more objects can be analyzed. A similar thing could happen with the DLP appliance - I don't have a list of the file formats it supports.
Regarding slowness when enabling composite openers - take into account that some archives, like .msi files, contain thousands or even tens of thousands of embedded objects, and each of them is scanned with AV, processed according to policy, etc. So you'll need to think about the data flow inside MWG and decide when to enable composite openers and when not to.
Regarding Body.Size as a condition for enabling composite openers - I'm not sure about it. Maybe it's better to use the Body.NumberOfChildren property, as most of the slowness is caused by a large number of calls to AV, not by the size of the file to scan.
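Shown schematically, that idea might look like the following (the threshold is a placeholder, and this is untested - whether Body.NumberOfChildren is already populated at the point the opener rule is evaluated would need to be verified in your environment):

```
Name:     Enable Composite Opener
Criteria: Body.NumberOfChildren less than 1000
Action:   Continue
Event:    Enable Composite Opener
```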
I hope this information helps you - if something is still unclear, I'll try to answer additional questions.
Thank you both for some excellent insights! I'll check out some of the community ruleset submissions.
I don't think Bluecoat has an analog to the composite opener... at least not in the ruleset I'm migrating from Bluecoat to McAfee, so that must indeed explain why performance on these much beefier McAfee boxes seems doggy on large downloads.
7) Do you find yourself being compelled to bypass the opener, AV or both for your trusted vendors' download sites?
8) For those who manage proxies for largish environments, what body size do you pick beyond which you skip AV or composite openers? What's been a livable number?
9) Stop Cycle vs. Stop Ruleset for the Response Whitelist to bypass the opener -- any hidden gotchas?
e.g. for a ruleset that looks like so:
Response Whitelist -> stop cycle
Enable Composite Opener
ReqMod Call ICAP for network DLP
Gateway anti malware
The library "Common Rules" chooses a Stop Cycle for the response whitelist of URL.Host patterns. In the context of the policy sample above, that means anything put into it would miss out on DLP as well as AV. Would you agree that I can safely change the Common Rules "Response Whitelist" rule from its default Stop Cycle to a Stop Ruleset without being too worried about messing up its functionality? I ask because Common Rules, I believe, came from a library, and I'd hate for the Stop Ruleset to have some consequences I hadn't anticipated.
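To be concrete, the one-line change I'm contemplating is (shown schematically; the criteria stay exactly as the library ships them):

```
Name:     Response Whitelist
Criteria: URL.Host matches in list "Response Whitelist"
Action:   Stop Ruleset   (changed from the library default of Stop Cycle)
```

With Stop Ruleset, a whitelisted host would skip only the remainder of Common Rules, so the ICAP DLP and anti-malware rule sets placed after it would still be evaluated.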
Thanks again - this discussion has been extremely helpful
7) I have heard this question/requirement a couple of times. Many people ask for a recommendation or an automatically updating list of download servers for "trusted vendors". With the McAfee Maintained Lists feature we would have a great chance to provide customers with lists of download URLs used by vendors such as Microsoft or Oracle, which are typical candidates for whitelisting.
The big challenge we have here (and this is why such lists do not exist at present): what is a trusted vendor? While talking to customers, I found that many want us to define vendors as "trusted", but everyone has a different understanding. One customer wants to whitelist Microsoft, Adobe, Sun, Oracle and a couple of Linux distributions; the next customer thinks that Adobe in particular should be filtered and does not want to have Linux within his company at all; the next one wants to filter Microsoft updates because he read that something may happen with the updates, etc.
So basically we would end up having several lists and tons of requests to change those lists. If I change a list for one customer, the next one will not accept the change, etc. Additionally, we could get into some trouble with this: if I (as a McAfee employee) tell you that I would whitelist xyz.com, and some of your users get infected by requests to those URLs, there may be fingers pointing at me. So for the meantime we have not provided such lists, but we give recommendations to customers so they can whitelist on their own whatever they think is trustworthy.
However I agree we should provide lists with example URLs that may help to find good whitelist entries - maybe in the future :-)
8) I do not manage proxies for large environments, but I personally would not bypass filtering based on file size. The size does not tell you anything about how long a file needs for filtering. Even a 10 MB file could take 15 minutes or longer to scan, depending on what it looks like inside.
Instead, what you can do is look at the Connection.RunTime property. You could add it, similar to the archive-level criterion, to the rule that triggers the composite opener. You can set it to 5 minutes, for example, and skip additional composite opener runs (i.e. stop extracting further) once the file has already been handled for 5 minutes.
Please note that this only applies to the NEXT level of archive depth. If something has been sent to the AV engine for filtering and that filtering takes 15 minutes, the proxy will not be able to abort the scan and deliver the file after 5 minutes. Likewise, if processing a single archive level takes 15 minutes, the timeout will not interrupt it. It will simply skip further embedded extraction cycles once the timeout is hit.
I used one of the HP drivers you posted in another thread as an example. Without the timeout it takes approx. 950 seconds to filter. So I added a timeout of 3 minutes and it started to extract, but stopped calling additional embedded cycles after 3 minutes. In total the download took 500 seconds then.
This is still a lot of time, but this was the time consumed to filter the first level of embedded objects. So you can certainly speed up things (depending on how the archive looks), but as mentioned you cannot have a "forward file after 5 minutes" rule.
9) If you want to place any "skip opener" rules on top of the rule that triggers "Enable Composite Opener", I think Stop Ruleset is fine. Note that AV etc. will then still run. In some cases this may be what you want; in others you'll want to skip AV as well, so the action should reflect what you would like to do :-)
I hope this helps.
Yes, that is helpful.
I like your idea of doing a time based serve-up of a file. Could you give me some more details of what the Connection.RunTime looks like and how the condition is applied?
Nah, I wouldn't want McAfee spending their time in the "maintaining a list of trusted vendors" business -- as you rightly note, that list is unique per customer. That question was aimed more at my fellow poor bastards who have to administer these in a customer environment and deal with the griping, all the way from execs to interns, when things from the web proxy take tons longer than they could even get them on their 4G smartphone. Specifically, I'm pondering how one might whitelist a vendor's Downloads section and the support-portal section of vendor websites (for support-case file uploads) to get past time-consuming scanning, while still keeping scanning and full protection for the more user-generated-content portions of the sites, such as forums and community uploads. Obviously there is some maintenance involved there, and I'm wondering if the community has examples of how they've tackled it for various vendors to balance risk, security, and productivity.
I imagine anyone in the shoes of a proxy administrator would quickly realize that people waiting 15 minutes for a file to scan that took less than a minute on the Bluecoat ProxyAV solution that was in place the prior week... would not find that acceptable. :-) 15 minutes is a LIFETIME to wait for a file to come down unless it's several hundred MB. *insert <classic "security vs. getting business done" discussion here>*
Thanks as always for the info interchange!
Here is a screenshot of my rule. It is very simple:
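In case the screenshot doesn't come through, the rule boils down to roughly the following (shown schematically; the 3-minute value is just what I used in the HP driver test, and the media-type criterion reflects the typical default opener rule, so adjust both to your own policy):

```
Name:     Enable Composite Opener (with runtime limit)
Criteria: MediaType.EnsuredTypes at least one in list "Composite Opener Media Types"
          AND Connection.RunTime less than 180   (seconds)
Action:   Continue
Event:    Enable Composite Opener
```

Once Connection.RunTime exceeds the limit, the event is no longer fired, so no further embedded cycles are started for deeper levels.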
As I mentioned, this is not a perfect solution. It would be cool if you could just stop after 5 minutes and forward the file, but there is no way to stop filtering that has already started. Not yet, at least. It is more a step in the direction you'd like to go than a full solution.