Cloud Services - a gimmick?

Can someone explain the new Cloud Services to me? Based on the available information, they seem like a gimmick.

Take the Reputation Services. It supposedly compares a file on your system to the experience other Avast users had with it, or in the words of Ondrej Vlcek, CTO of AVAST Software “Once a new file has been opened a few hundred times, we have a good idea if it is malicious or not, then we communicate this information with all of our users”. But the experience of other Avast users will be based on the Avast definitions and heuristic analysis, which are already shared by everyone. Therefore no new or additional information is introduced in forming the Reputation Services assessment.

To put it simply, there’s no point in checking what the experience of other users is; being based on the same definitions and heuristics everyone uses, it will be the same as yours. Therefore the Reputation Services don’t really add an additional layer of protection; they simply reuse the one already shared and used by everyone.

As for the Streaming Updates, they seem like a worthwhile idea, but with avast already updating several times a day, I feel the sense of security they provide is more psychological than a drastic increase in actual safety.

I welcome other opinions on the matter!

I haven’t thought of it as a gimmick, but I have been trying to assess its benefits and what one may lose by disabling the cloud features. I still have a ways to go on that front, but here are a few thoughts.

One place to start would be to compare avast 6 reporting to avast 7 reporting in order to help identify what new/additional information avast 7 sends to avast. The avast 6 WebRep component sends some info to avast, but I can’t remember seeing or hearing of other information being sent to avast(?). The avast 7 WebRep may send the same or more, plus there is the FileRep component. Even just limiting the context to files, a considerable amount of information could be gathered about what is in the wild, where files were downloaded from, etc. Analysis of that body of collected data could reveal more information about how files spread, how common they are, what other files were present on users’ systems when file X was present, etc. Just how massive the data collection from users’ machines is, I don’t know, but I’m thinking avast is gaining vastly more information of that nature. Which theoretically should allow them to better identify some things, prepare related definitions, and so forth.

Then there is the potential to feed additional information to the avast 7 program so that it can come to a more refined analysis/conclusion, which in turn could and I think would be reported to avast and feed into the first part above. One question I keep coming back to is: are definitions 100% comprehensive and sufficient to describe all known malware files (just by themselves, or at least in combo with client-side analysis)? I suspect, but do not know for a fact, that is not true, due in part to polymorphic malware and constraints on the total size of the definitions. Point is, storage should not be a weak point of the cloud, and it can hold detailed information about a massively greater number of files than could or would ever be delivered to client machines. Real-time cloud lookups could be performed to acquire detailed information about a file before it is executed. Theoretically, that type of lookup could provide additional information which is used to adjust the sensitivity of the avast 7 program analysis engine and possibly also call for more information about the file to be uploaded to avast (if, for example, your machine is the first to see it).
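To make that concrete, here is a rough Python sketch of how such a pre-execution lookup might work. Everything in it (the cloud.lookup call, the field names, the thresholds) is my own invention for illustration, not anything avast has documented:

```python
import hashlib

def file_hash(path):
    """Identify the file by a hash of its contents (not its name)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def pre_execution_check(path, cloud):
    """Hypothetical pre-execution lookup: ask the cloud what it knows
    about this file, then tune the local analysis accordingly."""
    info = cloud.lookup(file_hash(path))  # invented API, for illustration
    if info is None:
        # Nobody has seen this file before: raise heuristic sensitivity
        # and consider uploading a sample for deeper server-side analysis.
        return {"sensitivity": "high", "upload_sample": True}
    if info["prevalence"] > 1000000 and info["age_days"] > 180:
        # Old, widely used file: relax scrutiny.
        return {"sensitivity": "low", "upload_sample": False}
    return {"sensitivity": "normal", "upload_sample": False}
```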

Malware need not, and I’m guessing often will not, behave exactly the same way on your system as on someone else’s. It should not be assumed that everyone will have the same experience even if they have the same definitions and any other data from avast. I think there are and always will be many “first ones”, as in those whose machine is the first to see the ugly side of a program. Unfortunately, I suspect a program could be in very widespread use for a long time, be signed by a well-known manufacturer, and have gone through all of avast’s cloud, program, etc. analysis steps and still represent a threat. I don’t know how many real sleepers are out there, but I think even cloud-assisted detection will be of limited use in protecting first ones.

So I’m thinking cloud features can theoretically be useful and more than a gimmick. However, the greater the reporting, the greater the potential privacy issues (and where the privacy issues can affect security, security issues).

I would also add that the assessment of the cloud related features must be done with an eye looking forward. Although avast 7 may now be a hybrid with adequate definitions to allow meaningful detection without the cloud features enabled… and avast 7 now has cloud services that work in a particular TBD way and offer TBD benefits… things will change and who is to say what they will be down the road.

In theory I agree with you: the more data collected by avast about various files in the wild, the greater the chance of correctly analyzing and classifying those files. But let’s look at what kind of data Avast takes into consideration. Here’s a link to an article describing the FileRep feature: http://www.avast.com/pr-avast-software-detection-is-faster-when-filerep-knows-all-the-clean-files

The article mentions that the attributes it takes into account are how recent/new the analyzed file is, how prevalent and quickly spreading it is, means of distribution, digital signature presence, and the source URL. The only thing there that I see as a valid indicator of infectiousness is the source URL, as it can be compared to a database of known malware domains. But this sort of protection is already widespread and implemented at many levels. As for the other attributes, a newly emerging, quickly spreading file can equally be an indicator of a valid program and of malware, so I don’t see how those indicators help. The presence or absence of a digital signature means little to nothing, as signatures can be counterfeited and aren’t commonly used. It all seems like a great recipe for a lot of false positives.
I suppose the means of distribution is a somewhat valid indicator, as exe files in emails are a red flag, but that’s hardly a revolutionary mechanism of prevention, or a substantial one, as this is not a prevalent mode of infection.

Another quote from the article says: “FileRep is designed to counter the growth in polymorphic malware where every user gets an “individualized” version which traditional signature and heuristic AV techniques have a hard time identifying.” So the file identifier in this case becomes its name? In which case, if the name is simply randomized along with the contents, the whole system becomes pointless, no?

It’s possible that the mechanisms of prevention might actually be more thorough than described in the article, and that my analysis of them could be limited. But based on my current understanding, the feature seems like a gimmick with little to no use.

Well, you are forgetting that if a file is pinged by FileRep, it is firstly stopped on the periphery of your system, and the user has two choices (unlike the web shield alert, where abort connection is the only option): one, the standard abort connection, and a second, to download anyway.

If the user selects download, it will then be downloaded and run in the sandbox to try to further determine if it is malicious.
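Roughly, as I understand the flow (the names in this little sketch are mine, not avast’s):

```python
def on_filerep_ping(verdict, user_choice):
    """Sketch of the flow described above; names are invented."""
    if verdict == "known_clean":
        return "allow"              # no interruption at all
    if user_choice == "abort":
        return "abort_connection"   # the file never reaches the system
    # The user chose to download anyway: the file is run in the sandbox,
    # where the standard definitions/heuristics get another look at it.
    return "run_in_sandbox"
```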

This really doesn’t address my concerns about the usefulness of the FileRep service. Yes, it does offer additional sandboxing scrutiny, but as far as I understand, sandboxing simply prevents a file’s access to the system and then analyzes the file using standard definitions and heuristics, which is an existing layer of protection. And the whole point of the FileRep service is to succeed where definitions and heuristics fail. If all FileRep does is submit additional files to definition and heuristic analysis, then it’s not really a new and additional layer of protection. Also, there’s little point in increased sandboxing if it’s based on faulty mechanisms of identification.

I edited my previous message to include an additional paragraph about “first ones” as I thought that kind of touched upon something stlolth was thinking.

Although that PR article does mention some details, I feel as though a more technical presentation or review of the protocol and its information fields would be helpful and possibly shed light on important aspects that may be glossed over in such a short blurb. The release mentions some (or all?) of the things which are tracked, which I take to mean higher-level things the cloud knows. It doesn’t, as you touch upon, spell out how a file is uniquely identified in all cases. What if I insert a USB drive and attempt to copy a file to my machine? I’m assuming FileRep or similar functionality would be applied(?). What exactly is sent to avast during various stages of the process of opening/executing that file? Just a hash? I think not. Hash plus a “digitally signed” boolean? Thinking not a boolean. Some fields from FileVersionInfo? That seems tempting to me, but I really don’t know.
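Just to make my speculation concrete, a lookup might carry something like the following; every field name here is my own guess, not anything avast has published:

```python
# Entirely hypothetical: my guess at what a FileRep lookup might contain.
lookup_request = {
    "sha256": "ab12...",                 # content hash (placeholder value)
    "signer": "Example Corp",            # publisher from the signature, if any
    "signature_valid": True,             # did the certificate chain verify?
    "source_url": "http://example.com/setup.exe",  # download origin
    "file_version_info": {               # perhaps some FileVersionInfo fields?
        "ProductName": "Example Tool",
        "FileVersion": "1.0.0.0",
    },
}
```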

I always thought that cloud service was a comparison of other files and websites collectively shared across different virus scanners and security platforms? Isn’t something like Virus Total cloud-based? Is Avast’s virus lab getting cloud data exclusively from other Avast users? Or can the cloud communicate outside of the lab and analyze data from other respected AV programs in addition to the excellence of Avast? Could Avast Cloud service use Virus Total data in addition to its own data to really get a heads-up on stopping malware now and in the future?

I don’t see anything gimmicky about cloud service. Real-time streaming protection is still going to be faster than the generally 2-4 updates a day from Avast. The multiple daily definition updates are great. But real-time streaming, I think, adds extra protection, because you don’t have to wait for a fixed time interval to get updates.
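Back-of-the-envelope (my own arithmetic, just to illustrate the difference):

```python
# Average delay before a newly released definition reaches a client
# that polls on a fixed schedule, versus a streamed (push) update.
updates_per_day = 4
avg_wait_hours = (24 / updates_per_day) / 2  # half the polling interval
print(avg_wait_hours)  # 3.0 hours on average with 4 updates/day
# A streamed update cuts that to roughly network latency.
```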

Jack

I’ve admitted that there could be more to the mechanism of action than described in the article, but based on what is described, it doesn’t seem very functional. Naturally, there will always be “first ones”, as there initially won’t be enough cloud data to point to infectiousness; the article says the system works only after a file has already been accessed by a number of users. But my point is that even then it will likely fail, as the analysis of the file isn’t based on sound and effective premises. As for file identification, I’m not sure how much any specific file info would be of use, as the article states that FileRep aims to target especially files for which such values are random and variable. Therefore, only the file name is left as the identifier, which is easily done away with by random file names.

I’m afraid that’s not so. It’s collected from other Avast users. It would be great if there were such co-operation between various security providers, but that’s not the case.

Could Avast Cloud service use Virus Total data in addition to its own data to really get a heads-up on stopping malware now and in the future?

I’m not sure what kind of deal Virus Total has that various competing security providers allow it access to their databases, but I’m sure it has much to do with the fact that it’s not a commercial product or a full-fledged security solution, and therefore not a real competitor, but rather serves as an advertisement/preview of each company’s detection ability.

I don’t see anything gimmicky about cloud service. Real-time streaming protection is still going to be faster than the generally 2-4 updates a day from Avast. The multiple daily definition updates are great. But real-time streaming, I think, adds extra protection, because you don’t have to wait for a fixed time interval to get updates.

The least of my qualms was with the streaming updates; it’s a useful feature, but I doubt its impact will be drastic, as the frequency of updates isn’t among the top causes of faulty protection today.

Your approach is too “binary”.
Known malware files, known malware domains… sure. But things are often not black or white - the file may be just a bit suspicious, the domain can also be suspicious, and the more information you have, the higher the chance that the decision will be correct (and not a false positive or negative).
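As a toy illustration only (the signals, weights and threshold below are made up for the example, not our real rules):

```python
# Toy example: several weak signals, none decisive alone, combined
# into one verdict. Each input is 0.0 (clean) .. 1.0 (bad).
def suspicion_score(file_rep, domain_rep, heuristic_score):
    return 0.4 * heuristic_score + 0.3 * file_rep + 0.3 * domain_rep

# A file that is "a bit suspicious" on every axis crosses a 0.5 line,
# while a single noisy signal does not:
print(suspicion_score(0.6, 0.6, 0.6))  # 0.60 -> flag for closer analysis
print(suspicion_score(0.9, 0.0, 0.0))  # 0.27 -> leave alone
```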

Really? I’d like to hear some more about all the points in that sentence :slight_smile:

No, there’s no connection to name whatsoever, why do you think so?
But apart from that, that’s just it. If you are the only one in the world who has that particular file, that makes it more suspicious than files used by millions of people in the world for many months; that’s the additional information the FileRep may supply (for example). So the randomization you are describing is exactly the way to draw attention, not to avoid it.

There’s no fixed mechanism… the rules are evolving, and certainly will be evolving much more in the next weeks/months; the potential is huge.

The file can only be a “bit” suspicious based on the attributes I’ve mentioned, and I’ve explained why I find them faulty. As for domains, I’m not sure if there’s a good mechanism for determining suspicion other than proof of infection, which is likely why nothing of the sort is mentioned in the article. Anyway, all this vague and slight suspicion is a great recipe for lots of false positives. Now, as DavidR mentioned, this is where sandboxing steps in, but as I’ve replied, sandboxing is dependent on definitions and heuristics, existing protections. Therefore it defeats the purpose of FileRep as a feature that succeeds where heuristics and definitions fail.

There has to be a connection with filenames, as the article says FileRep is dealing with files whose contents are randomized and changed on a per-user basis; therefore the only mutual identifier can be the file name.

Your point about randomization is interesting. Still, I find it contradictory to the theme of the article, which talks about first pooling data from various instances of a file into a unified identifier before singling it out as suspicious. Which is the opposite of identifying a single file based solely on its attributes. I get the feeling you don’t really know how the feature works, but are making assumptions based on no actual information.

As for the points in my sentence, I find them to be very clear; what’s much more vague is what you object to and would like to see elaborated.

Not saying it’s necessarily a “good” mechanism, but a Chinese multi-web-hosting site that contained a number of infections in the past, though possibly none known at the moment, is certainly more suspicious than microsoft.com.

Well, if you think all “ordinary” detections are strict, matching one exact file, you are very wrong. The bad guys are trying hard to bypass the detections, so the detections have to be fuzzy, too. So yes, there can be false positives, but the same holds for detections without reputations. The days when everything (i.e. both the files and the detections) was simply good or bad are unfortunately gone.

You seem to think that reputation is a standalone method / criterion. But that’s not the case - the information from the reputation is one of the inputs for the very heuristics and detection you are talking about. Especially the sandboxing part uses that info significantly.

Yes, randomized content - the content is used to get the reputation (which will be very low / none for the single randomized instance). Filenames are not used.

I’m not saying that the reputation is not built and used for files that are out there, sure it is - I was just reacting to your comment on the randomized names.
What I’m trying to say is that not only the presence, but also the absence of any reputation can be an important piece of information, useful for the heuristics.

Your feeling is wrong :slight_smile:
But yes, the system is kinda fuzzy - the data are being analyzed, looked at, various rules and criteria are being tried, so it’s not really possible to exactly explain what it does. And even if it were, it might be true today, but not tomorrow (the same can be said about the heuristic rules in the reputation-less detections). It’s simply an additional source of information - and the more information you have, the more you can possibly [try to] do to reveal the bad/suspicious stuff. How well you use this information, that’s a different question - but we’ll certainly try our best.

Well, first I’d like to know how exactly you would counterfeit a digital signature, it is not that simple.
Second, most programs from established manufacturers are signed (after all, since Vista the digital signatures are used a lot by Windows itself) - so “aren’t commonly used” is certainly wrong.
Third, “means little to nothing”… well, maybe for you it does, but generally the meaning is quite significant :slight_smile:

Anyone NOT familiar with “scoring”? That seems like it might be a useful term to use in some of this. Squeaky clean host = score 0, nothing but a malware distributing host = score 255.
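Something like this, for instance (toy numbers, not anything avast has published):

```python
# Toy host "score" in the 0..255 sense described above.
# The update rule is invented purely for illustration.
def update_host_score(score, served_malware):
    if served_malware:
        return min(255, score + 50)  # each confirmed infection pushes it up
    return max(0, score - 1)         # clean observations slowly rebuild trust

score = 0                            # squeaky clean host
for _ in range(3):
    score = update_host_score(score, served_malware=True)
print(score)                         # 150 - well on the way to "mostly malware"
```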

Ordinary detections can only be fuzzy based on heuristics; those based on definitions match a certain specific signature, so they can’t be fuzzy at all. They can be false, but that’s not being fuzzy, that’s simply a mistakenly marked signature. The definitions system is aimed at reducing “fuzziness”, whereas the FileRep system, as described in the article, seems like a breeding ground for it. I’ve made several clear points on why this is so, and you haven’t actually countered any of them directly with clear arguments. What you’ve done is point out a good mechanism of action not mentioned in the article, the lack of reputation, and try to dilute the argument by claiming all detections are fuzzy.

The point is that the FileRep mechanisms of action described in the article aren’t just fuzzy, they’re nonsensical. They are aimed at behaviors shared by both valid and malicious programs, rely on already-implemented methods of protection existing outside the Avast product, or depend on falsifiable methods of verification.

I’ve noted that the sandboxing uses FileRep, but it uses it to decide which file to sandbox. From there, the analysis is dependent on existing tools, heuristics and definitions. Based on FileRep’s mechanism of action described in the article, there’s hardly enough significant data for it to be used as an actual or conclusive heuristics engine; its aim is too wide. So it only makes sense that in the end FileRep depends on sandboxing and standard heuristics and definitions, but the problem is that it’s advertised to succeed where heuristics and definitions fail, and they’re equally fallible whether they’re tipped off by FileRep or standard scanning.

A file in the wild can be identified either by its content, through some form of the standard file descriptors already mentioned by FlyingRobot, or by its name. Now, if the file changes its content on a per-user basis, as mentioned in the article, the standard file descriptors (like a hash) fail, and it can only be identified by its name. Sure, I suppose the FileRep definition could be analyzing the complete file contents and focusing on one unchanged aspect of it, which would be the malicious code surrounded by obfuscating information, but I highly doubt FileRep is doing that, as it would be heavy on both system resources and data transfer.
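To illustrate what I mean by focusing on one unchanged aspect (a toy sketch, not what FileRep actually does):

```python
# Two per-user-randomized copies of the same malware with an invariant core.
user_a = b"\x90\x12RANDOM-A" + b"MALICIOUS-CORE" + b"\x7fPAD-A"
user_b = b"JUNK-B\x00" + b"MALICIOUS-CORE" + b"PAD-B\x55"

signature = b"MALICIOUS-CORE"  # the unchanged fragment
print(signature in user_a, signature in user_b)  # True True
# Whole-file hashes differ, but a content signature still matches both -
# at the cost of scanning full contents instead of comparing one hash.
```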

As for digital signatures, it’s beside the point whether I know how they’re counterfeited; the point is that it’s well documented that they are. And not only counterfeited but flat-out stolen. They are used by many large software companies, true, but the thing is, even though the most-used programs on my system are by large manufacturers, they’re outnumbered by various small utilities which aren’t digitally signed. Though large developers have the strongest presence, there’s a larger amount of “off-brand” software out there that’s not digitally signed. That’s why relying on digital signatures means little to nothing.

I suppose your definition of “fuzzy” is different from mine.
Sure, matching a specific signature is somehow “strict” - but if, for example, the signature is (on purpose) chosen so weak that it’s hard to predict what it’s actually gonna detect, I’d call it “fuzzy”.

I certainly can’t agree with that. FileRep means more info, more info means less uncertainty / fuzziness.

I certainly haven’t said that all detections are fuzzy. But many detections are - they are based on heuristics, evaluating common properties, possibly not related to malicious behavior… but they still work.

You think they are nonsensical; I would say that if anybody thinks about them, they actually make good sense, so we won’t agree on that.
But even if they were nonsensical - who cares? We have huge sets of malicious and clean files and can easily measure whether the method works. And if it does (meaning it detects malicious files with little or no false positives, that can possibly be excluded somehow), even if it works based on strange data… it just works. The same is true for reputation-less heuristic rules.

Again, I’m not saying FileRep is the only criterion for the decision, or that FileRep is a completely standalone thing that doesn’t interact with the rest of the antivirus - that wouldn’t make sense.
You seem to be fine with “heuristics” - which is a generic term that can cover mostly anything, but usually means something “fuzzy”, decision/guess based on incomplete input data, but you refuse to accept that FileRep can be a useful additional piece of information for that very heuristic.

So FileRep is not meant to “win over heuristics” - its purpose is to improve the heuristics, and to make it possible to include new heuristic rules.
If your definition of heuristics has some very strict rules that don’t include online queries… fine, then it’s a new “uberheuristics” - but that’s just terminology.

The file is always identified by its hash - identification by its name is hardly relevant, you can rename your files however you wish.
The file is changed/unique for every user, the hash will also be unique for every user, nobody in the world has a file with this hash ==> the file is suspicious.
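In other words, a trivial sketch (nothing avast-specific about it):

```python
import hashlib

# Two "copies" of the same polymorphic malware, randomized per victim:
# the hashes differ, so each instance reaches the cloud as a brand-new file.
a = hashlib.sha256(b"payload" + b"padding-for-user-one").hexdigest()
b = hashlib.sha256(b"payload" + b"padding-for-user-two").hexdigest()
print(a == b)  # False - every victim's copy looks unique

# And "seen nowhere else in the world" is itself a usable signal:
def rarity_signal(prevalence):
    return "suspicious" if prevalence == 0 else "judge by reputation"
```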

Stolen, sure. Counterfeited… well, let’s say we do our homework.
But again - various small utilities without a digital signature are somewhat more suspicious than those big programs from large manufacturers. And this “little to nothing”, even if it were so, is simply another little piece of information in the puzzle. And the puzzle (= the existing heuristic rules) is built on many similar pieces of “little to nothing”; that’s what heuristics is about.

Anyway, I think I’ve spent a bit too much time trying to explain something, and I probably haven’t gotten the point across, so I guess I’ll rather do something productive instead. The FileRep system is new, and it will certainly evolve in the near future. We’ll see how it goes - but so far, it’s been doing pretty well (in terms of helping to discover unknown malware).

Thanks Igor for your time, patience and knowledge sharing :slight_smile:

stloloth, I may not be up on current tech lingo about Avast Free Cloud, but I did a recent scan of my computer using Avast Cloud and it did not… I REPEAT, DID NOT… get rid of a very irritating Dame Ware Trojan that is affecting my PC. I scanned my computer both online and offline; it seems scanning works better offline. The Trojan in question is called “Mini Dump” Sierris and gets into your temporary files through Internet Explorer, even from your sign-up page on MSN.com and Ca.MSN (MSN Canada). What it does is infiltrate your computer over about 1 1/3 months, weakening all your computer systems and destroying your C drive, or that is what it did to my old PC that I had for 18 years. Once it infiltrates, Avast does not recognize it for what it is: a very nasty Dame Ware Mini Dump Trojan. When you start reformatting your computer, it seems to take more and more out of the C drive, leaving any backup drives intact but your computer ruined.

It costs personal computer users a lot of money to buy very good antivirus programs, and it costs still further money for updates every year, while hackers and crackers buy very similar programs, or even very dangerous remote access programs that are even more advanced than what computer users have. As a personal computer user and an internet user, I feel hung out to dry by the battle between antivirus companies and the companies creating programs supposedly for use by Internet Service Providers; those programs wind up in the hands of the hackers and crackers who should not have them but do, because they can afford to buy them.

Of course you’re not referring to avast! free antivirus…

I agree with Stloloth and Igor, and even with the Avast tech trying to help out… let’s see if I get “fuzziness” straight?! OK, if you get a suspicious email that sends you to a blank page, or one that you think is copied with info and viruses put into it, then your AV is less sure whether the information you get in there is legitimate and not hacked. From what I’ve seen of “fuzziness” at sign-in pages on MSN.com, these sorts of viruses and Trojans are allowed to invade the browsers; not only is MSN IE infected, but Google, Firefox and many others can be too, all at the same time. That is what makes it so scary that remote access programs can be so devastating, as it was for my own older computer. The fact that an antivirus program cannot detect these things makes it all the more important for antivirus companies to make sure their free online AVs are at least 75% as good as their store-bought antivirus programs. Avast techs, if you don’t want computer users’ machines to become “expensive doorstops”, then get your programmers together on updating Avast Cloud to recognize RA programs as threats.

:frowning: No, I had Panda Antivirus on my computer for nearly 2 years, and after 1 year they hinted I had to send them $20 in upgrade fees to have their store-bought antivirus program updated… so I sent them the fees, and for a year and 2 months my computer still caught internet viruses and trojans while I was a customer of theirs. I switched to Norton Antivirus, but it slowed my computer down so much with updates and upgrades that my computer was spending more time updating and upgrading than I was being online, so I had to uninstall Norton Antivirus. At least Avast Free Antivirus strikes a good balance between updates, protection and a person being online. If your in-store program is as good as the free online one was when it detected the Mini Dump virus of the Dame Ware company, in about 2 months I just may buy your Avast program at my local computer store. I also had Zone Alarm about 6 years ago; that was one very good program and let you know where the virus programs and trojans were coming from.

I don’t think you can buy avast! at a computer store. There are no boxed product versions.

I do not think the new cloud features are a gimmick at all. They are in their infancy, especially the FileRep part, so will not, and can’t be expected to, work as well as they will 6 months from now. The streaming updates are already working very well and can only decrease the possibilities of being infected. These features are things that a certain competitor, Symantec, has been using for years now and they have worked very well for them and have received nothing but positive reviews as far as I have seen. I’m sure they will significantly increase the effectiveness of the avast! product in a relatively small amount of time.