I think you can ping away, but I would have thought a detailed answer would be classed as ‘commercial in confidence,’ so you wouldn’t get one, especially not in a public forum.
I also don’t think you’ll get a detailed answer - because it’s basically the “know how” behind the antivirus scanners (in addition to a team working hard on the detections).
Let’s try it another way:
How does ClamAV do its fast scanning for viruses in files?
This program is open source and can be analysed by everyone. I am just not technically savvy enough to understand the code.
Any new tries?
Well, good question…
But wouldn’t we get a better answer by asking the Clam team/developers? I think the Alwil team won’t have ‘time’ to look at that code… and if they have and discovered ‘anything’, they won’t tell, eh?
I really think there is a basic concept of how virus detection works that is common to all virus scanners.
I cannot imagine that every vendor reinvents the wheel completely…
My question is also not aimed specifically at Alwil, but at everyone who considers themselves an antivirus expert and really knows how it works behind the scenes.
Well, the simplest thing you can do is use hashes: hash the whole file and match the result against a database of known infected hashes (a binary search over the sorted database is quite fast).
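To make the hash idea concrete, here is a minimal sketch in Python. It is not taken from any real scanner: the function names, the choice of SHA-256, and the tiny in-memory "database" are all assumptions made for illustration. It only shows the principle of hashing the whole file and doing a binary search in a sorted list of known-bad hashes.

```python
import bisect
import hashlib


def file_digest(data: bytes) -> str:
    """Hash the whole file content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def build_database(known_bad_samples):
    """Hypothetical helper: a sorted list of hashes of known infected files."""
    return sorted(file_digest(sample) for sample in known_bad_samples)


def is_known_bad(data: bytes, sorted_db) -> bool:
    """Binary search for the file's hash in the sorted database."""
    digest = file_digest(data)
    i = bisect.bisect_left(sorted_db, digest)
    return i < len(sorted_db) and sorted_db[i] == digest


# Made-up byte strings standing in for infected files.
db = build_database([b"fake-virus-one", b"fake-virus-two"])
print(is_known_bad(b"fake-virus-one", db))
print(is_known_bad(b"harmless file", db))
```

The obvious weakness is that changing a single byte of the file changes the hash, which is why this only catches exact copies.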
For something more complex, you can use a multi-string search algorithm (e.g. http://en.wikipedia.org/wiki/Aho-Corasick_algorithm) to look for known patterns.
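For anyone who wants to see what such a multi-string search looks like, below is a small Aho-Corasick sketch in Python. It is a textbook-style illustration, not the code of any particular scanner; the names build_automaton and search and the sample patterns are made up for this post. The point is that the input is read once, byte by byte, no matter how many patterns are in the set.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie, failure links and output lists for byte patterns."""
    goto = [{}]   # goto[state][byte] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> indices of patterns ending at this state
    for idx, pat in enumerate(patterns):
        node = 0
        for b in pat:
            nxt = goto[node].get(b)
            if nxt is None:
                goto.append({})
                fail.append(0)
                out.append([])
                nxt = len(goto) - 1
                goto[node][b] = nxt
            node = nxt
        out[node].append(idx)
    # Breadth-first pass: failure links of depth-1 states stay at the root.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for b, child in goto[node].items():
            f = fail[node]
            while f and b not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(b, 0)
            # Inherit matches that end at the suffix state.
            out[child] += out[fail[child]]
            queue.append(child)
    return goto, fail, out


def search(data, automaton, patterns):
    """Return (start_offset, pattern_index) for every match in data."""
    goto, fail, out = automaton
    node = 0
    hits = []
    for pos, b in enumerate(data):
        while node and b not in goto[node]:
            node = fail[node]
        node = goto[node].get(b, 0)
        for idx in out[node]:
            hits.append((pos - len(patterns[idx]) + 1, idx))
    return hits


# Classic textbook pattern set, used here in place of real signatures.
patterns = [b"he", b"she", b"his", b"hers"]
automaton = build_automaton(patterns)
for start, idx in search(b"ushers", automaton, patterns):
    print(start, patterns[idx])
```

Building the automaton costs time up front, but it is done once per signature set and then reused for every file scanned.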
Then, you add some algorithmic detections…
…
Now we are heading in the right direction.
The Aho-Corasick hint was the missing link I was searching for.
Googling for the term with some other keywords really revealed what I was looking for.
It helped me understand better what is really going on.