VirusTotal is a subsidiary of Alphabet Inc. (which is also the parent company of Google). The service offers static and dynamic artefact analysis through a combination of free and paid tiers of access, as well as access to broader intelligence harvested from submissions and its own honeypots.
The VirusTotal service is popular within the information security profession for quick analysis of artefacts. However, there are some drawbacks and other aspects to consider before implementing VirusTotal as part of your DFIR tool stack.
What are VirusTotal’s Capabilities?
VirusTotal (VT) uses a number of external services to analyze submitted artefacts and aggregates their results. These services range from URL reputation searches right through to dynamic malware sandboxes.
The reports from these services are aggregated into the VirusTotal service and presented to the user, with the SHA hash of the submission used to identify it. If more than one person uploads the same file, subsequent submitters are shown the report generated for the first upload.
Submissions can be made to the service through several mechanisms, including the desktop submitter, the API, the web interface, and browser extensions. The data returned to the submitter varies based on the platform they are using.
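Because reports are keyed on the hash of the file, an analyst can compute the hash locally and search for an existing report before deciding whether to upload anything. The following is a minimal sketch, assuming SHA-256 is the hash being looked up; the function name `sha256_of_file` is illustrative only.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so large samples are not loaded into memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Searching on the resulting digest does not send the file contents to the service, which matters for the confidentiality concerns discussed later.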
For example, the browser extension may just present a summarized score to the end-user, whereas the web report for the same sample would describe more information about the sample.
The reports provide the output of automated analysis, with extracted information categorized according to the external services which conducted the analysis.
The score for the sample (e.g. 67/70) shows how many of these external services have determined that the sample is malicious. Whilst we know and understand that a sample such as WannaCry is malicious, a result of 1/70 could also indicate a malicious sample, with the lone detection coming from a different technique employed by one external service to categorize the artefact.
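To make the arithmetic behind a score such as 67/70 concrete, here is a small sketch. The `verdicts` mapping and the engine names are made up for illustration and do not reflect the actual VirusTotal report format.

```python
def detection_ratio(verdicts):
    """Summarize per-engine verdicts as (flagged, total, "flagged/total")."""
    total = len(verdicts)
    flagged = sum(1 for is_malicious in verdicts.values() if is_malicious)
    return flagged, total, f"{flagged}/{total}"


# Hypothetical verdicts: True means the engine flagged the sample.
verdicts = {"engine_a": True, "engine_b": False, "engine_c": False}
flagged, total, score = detection_ratio(verdicts)
print(score)  # 1/3
if flagged > 0:
    # A single detection is not proof of a false positive.
    print("At least one engine flagged the sample; review manually.")
```

The point of the final check is that a low ratio is treated as a prompt for manual review rather than as a clean result.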
Strings and behaviors extracted from the sample are also associated with other submissions through force-linked graphs which would allow a paid account to pivot through the sample to other samples.
Now that we have described how VirusTotal operates (in a very small nutshell), we should identify the advantages and disadvantages of its use and put together some risk management guidance (yes, VT does have risks which should be managed).
VT’s usage of external services to provide a fair assessment of an artefact is a smart move, and allows for differentiation between the engines in determining what is a true-positive and what is a false-positive.
When a sample is submitted to VT, it is passed to a number of these engines either immediately or after being queued for analysis, with the results subsequently reported on the VT sample report page.
A short list of these engines includes:
- Crowdstrike Falcon
- Palo Alto Networks
For a malicious binary, these are just a small sample of the 70 external engines to which your submission will be provided. Any new detections are also rolled into these vendors’ platforms fairly quickly after they receive the sample, which helps protect their respective customers.
From experience, I have submitted a new Emotet sample to VirusTotal and had an updated signature incorporated into Microsoft’s Windows Defender within 3 hours. So there is value in submitting known malicious files to this service.
There are a number of great reasons why you would use VT. However, there are also some potential confidentiality issues. Whilst not many businesses would consider this, your submissions to VT are relatively public, with samples being downloadable by anyone who has access to the paid platform.
This means any submitted files which contain Personally Identifiable Information (PII), or which are considered commercial in confidence, will no longer remain confidential. These submissions could also violate privacy acts and other legislation, which may put the submitter and the business at risk of legal repercussions.
An example of how VT should not be used:
An email which appears to be spearphishing submitted to VT may contain the recipient information, and depending on the campaign may contain email correspondence as part of pre-texting the target (i.e. the Emotet tactic for email harvesting and redistribution).
The email submitted to VT would be available to anyone with access to the paid tier of the platform and, depending on your commercial arrangements (with or against the prescribed external analysis services), submitting it may also be a breach of confidentiality.
Managing VirusTotal Risk
An assessment of the risks of VT usage needs to consider the threat and the vulnerability which may be exploited.
The vulnerability is that information submitted to VT may be sensitive to a person or to the business; the threat is an unauthorized person accessing that information, or an inappropriate person being granted access to it.
The risk statement in this regard would read something along the lines of:
“Unauthorized disclosure of sensitive information submitted to external malware analysis services.”
Now the risk of VT usage can be measured and appropriately treated under your business’ risk management framework. If, however, you are not in an organization which has this in place, and you need some sound guidance on implementing appropriate controls, read on…
Risk Mitigations for VirusTotal Usage
In the example of submitting information outside of the organization to an external entity, the use of Data Loss Prevention (DLP) technologies may aid in the identification and interdiction of sensitive information being submitted externally.
This may also aid in preventing SOC analysts from erroneously submitting samples externally, and reduce the likelihood of a well-meaning employee making their own submissions to VT (outside of the control and visibility of the organization).
Should DLP not be deployed or available within the organization, a SOC analyst may also need to assess the content of the artefact being submitted and determine whether it contains privacy-affected information.
A classification schema should be used to better assess the information’s classification and its appropriateness for external distribution. This may take the form of an Enterprise Security Classification policy, or could even be the implementation of the Traffic Light Protocol where information needs to be categorized within security tooling.
There may be other controls which could be implemented; these are just some quick recommendations of what would be considered less technically complex controls to implement.
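A lightweight script can support that manual assessment by flagging obvious indicators before an artefact leaves the organization. The sketch below is an assumption-laden illustration: the two patterns (email addresses and common email headers) are examples only, and the names `PII_PATTERNS` and `find_pii_indicators` are hypothetical. It is not a substitute for a DLP product or for analyst judgement.

```python
import re

# Illustrative patterns only; a real check would need to cover far more.
PII_PATTERNS = {
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "email_header": re.compile(r"^(From|To|Cc|Bcc):", re.IGNORECASE | re.MULTILINE),
}


def find_pii_indicators(text):
    """Return the sorted names of the patterns that match `text`."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )


sample = "To: jane.doe@example.com\nSubject: Invoice\n\nPlease see attached."
print(find_pii_indicators(sample))  # ['email_address', 'email_header']
```

Any non-empty result would be a reason to pause and review the artefact before submitting it externally.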
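As a sketch of how a Traffic Light Protocol label could gate external submission within tooling, consider the following. The policy encoded here is an assumption for illustration: because VT samples are downloadable by paid users, only labels that permit public disclosure are allowed out. The names `EXTERNALLY_SHAREABLE` and `may_submit_externally` are hypothetical, and your organization's policy may differ.

```python
# Assumed policy: only labels that permit public disclosure may be
# submitted to an external service. TLP:WHITE is the older name for TLP:CLEAR.
EXTERNALLY_SHAREABLE = {"TLP:CLEAR", "TLP:WHITE"}


def may_submit_externally(tlp_label):
    """Return True if an artefact with this TLP label may leave the organization."""
    return tlp_label.strip().upper() in EXTERNALLY_SHAREABLE


for label in ("TLP:CLEAR", "TLP:GREEN", "tlp:amber", "TLP:RED"):
    decision = "allow" if may_submit_externally(label) else "block"
    print(f"{label}: {decision}")
```

Unknown or missing labels are blocked by default, which keeps the check conservative.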