As of today, the VSE DAT file (version 7175) reports that it includes 668,629 detections.
However, when exported to a text file using CSSCAN.EXE -virlist, the full list only has 108,634 items (including variants).
Am I missing something?
CommonShell Command Line Scanner Lite (VSCORE.18.104.22.1686)
Engine Version : 5600.1067
Engine Load Time : 3760 milliseconds
AV DAT Version : 7175.0000 668629 detections Built 22 August 2013
Extra DAT : 0 detections
Please wait ... retrieving list of names from the Anti-Virus DAT
4ArcadePBar Unwanted Object
7AdPower Unwanted Object
It's not a list of viruses but a list of drivers, so the number isn't a good representation of how many pieces of malware the DATs cover. A single driver can have many detections.
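To put a rough number on that gap, here is a minimal Python sketch that reads saved -virlist output and compares the detection count in the header with the number of names listed. It assumes the output looks like the excerpt pasted above: a line starting with "AV DAT Version" that contains "<number> detections", and one name per line after the "Please wait" line. The function name summarize_virlist is made up for this post and is not part of any McAfee tool.

```python
import re

def summarize_virlist(text):
    """Return (detections, name_count, ratio) from saved -virlist output.

    Assumes the layout seen in the excerpt in this thread; other builds
    of the scanner may format their output differently.
    """
    detections = None
    names = []
    in_list = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if in_list:
            # Every non-empty line after the "Please wait" banner is one name.
            names.append(line)
        elif line.startswith("AV DAT Version"):
            match = re.search(r"(\d+)\s+detections", line)
            if match:
                detections = int(match.group(1))
        elif line.startswith("Please wait"):
            in_list = True
    ratio = detections / len(names) if detections and names else None
    return detections, len(names), ratio

# With the two totals quoted in the question:
print(round(668629 / 108634, 2))  # 6.15
```

So on average each listed name stands for about six detections, which fits the explanation that one driver covers many detections. It is only an average and says nothing about how the detections are spread across names.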
The list doesn't map one-to-one to viruses. We run the DATs against a zoo of samples that numbers in excess of 50 million, maybe even north of 100 million. I haven't seen accurate numbers lately (and they inflate by tens of thousands per day). The current DAT files are proof against that large corpus of viruses AND the fingerprints of stuff within GTI.
On a side note, trying to accurately measure the quantity of viruses is very troublesome. Do you measure files? Samples? Names? Categories? Types? Fingerprints? For example, we can write protection against something we don't have a sample for. We have samples, but they don't include all variations. And certainly a single virus can attach itself to multiple files, thereby creating new "samples". We had trouble figuring out these numbers back when we had 50K samples (circa 2005). Now we see that many in a single day.
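A toy Python sketch of why the unit of measurement matters. The records below are entirely made up for illustration (they are not real detection names or real data), and the field names are just assumptions about what one might track per sample.

```python
from collections import namedtuple

# Hypothetical record: one scanned file and what it was detected as.
Sample = namedtuple("Sample", "file_hash detection_name family")

# Made-up corpus: one virus infecting two host files yields two "samples".
corpus = [
    Sample("a1", "Example.A", "Example"),
    Sample("b2", "Example.A", "Example"),  # same virus, different host file
    Sample("c3", "Example.B", "Example"),  # a variant in the same family
    Sample("d4", "Other.A", "Other"),
]

def count_by(samples):
    """Count the same corpus by three different units."""
    return {
        "files": len({s.file_hash for s in samples}),
        "names": len({s.detection_name for s in samples}),
        "families": len({s.family for s in samples}),
    }

print(count_by(corpus))  # {'files': 4, 'names': 3, 'families': 2}
```

The same four files count as 4, 3, or 2 "viruses" depending on what you choose to count, and that is before fingerprints or generic detections enter the picture.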
You ask an academically interesting question. The answer isn't easy and requires a lot of explanation. Even if you get an answer I'm not sure how useful it is to compare to anything else.