The Umbrella Investigate integration with Cisco AMP Threat Grid shows samples from the Threat Grid database that are associated with a domain, IP, or URL you want to learn more about. Sample information is provided as the checksums associated with the host or IP you look up.
Once information about samples has been gathered, there is a second pivot where the Investigate API will look up checksums and reveal additional information about the individual malware samples, including behavior of the sample on the network. The information about what the sample does can then lead back to Investigate to find more clues about the network infrastructure used by the malware you’re researching. Additional research can be done in regard to the severity of the malware samples and the samples’ behaviors when they were analyzed.
The two primary endpoints for this API are /samples/ and /sample/. /samples/ gives all the sample information about a specific domain, IP, or URL and once that information has been obtained, the /sample/ endpoint can be used to dig into those samples further and reveal the artifacts, connections, and behaviors of those samples. Using information about the connections of the malware samples, you can continue to investigate related domains and IPs.
An alternate use case is to take a list of checksums from another threat intel source, such as an in-house SIEM or third-party data feed, and check those checksums against the /sample/ endpoint to find more information about the threats.
API USAGE AND LIMITATIONS
All requests to the API endpoints are GET requests and follow the standard authentication scheme for any other Investigate endpoint. For more about authentication, refer to the 'About the API' section of this guide.
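As a minimal sketch of what such a request looks like in code, the snippet below builds (but does not send) an authenticated GET request with Python's standard library. It assumes the same "Authorization: Bearer" header used by the curl examples in this guide; the helper name build_get_request and the placeholder token are hypothetical.

```python
import urllib.request

# Base URL taken from the query examples in this guide.
BASE_URL = "https://investigate.api.umbrella.com"


def build_get_request(path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request.

    `path` is an endpoint path such as "/samples/google.com".
    `token` is your Investigate API token (placeholder in this sketch).
    """
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_get_request("/samples/google.com", "YOUR_TOKEN")
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send the request; that step is left out here so the sketch runs without network access or a real token.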
It’s important to understand that some perfectly normal, safe domains, such as www.google.com, will have malicious samples associated with them because the malware uses that domain or IP to check internet connectivity, or even as a source to determine more data about the host on which it resides (such as the public IP of the infected host or network).
Not all data from Threat Grid is in Investigate. Some of the data has not been ported over as it doesn’t apply to the way Investigate should be used for security incidents. Some legacy information has also not been brought over, although if Threat Grid were to see this sample again, it would become part of the data set.
NOTE: Portions of the functionality described for this feature of Investigate are only available for customers of both Threat Grid and Investigate. If you are an existing Threat Grid customer and do not see the details outlined in the areas marked “Threat Grid Customers Only”, please contact Umbrella Support (umbrella-support@cisco.com).
If you would like to add Threat Grid to your existing license, contact your account representative.
Usage
The two primary endpoints covered in this section are "samples" and "sample":
- The /samples/ API endpoint to research a domain, IP or URL and obtain sample information
- The /sample/ API endpoint to research a sample in depth
Within the /sample/ endpoint the following information can be found:
- Artifacts and pagination of artifacts
- Samples of samples and pagination of sub-samples – Threat Grid Customers only
- Connections of a sample and pagination of connections
- Behaviors of a sample
Glossary of Terms
The following terms are used by the API (and the UI) and are defined here, in advance, so it’s clear what data is being represented. A full list of all terms used by Threat Grid can be found here (for Threat Grid customers only).
Sample
A sample is a file, or a file-like object such as a process running in memory, that has been submitted to and analyzed by Threat Grid.
Artifact
Artifacts are files that are created or modified during a sample analysis. Malware executables often download additional components and infect or modify system files, documents, and running processes. Artifacts are sometimes executables which can also be analyzed in Threat Grid and become a sample.
Behavioral indicators (Behaviors)
Key traits and behaviors that have been identified as indicators of malicious activity. Behavioral indicators include threat severity levels, HTTP Traffic, DNS Traffic, TCP/IP network sessions, processes, artifacts, registry activities, and more.
Connections
A network connection made by this particular sample to an IP or a domain, along with information about whether that IP or domain is known malicious.
The /samples/ API endpoint to research a domain, IP or URL
The /samples/ endpoint gives all samples associated with a specific domain, IP address or URL. This endpoint takes input in the form of a domain, an IP, or a URL using standard formats for each. This endpoint will return all samples associated with the domain; the default maximum number of responses is 10, but can be extended.
Typically an error is seen when the requested host (domain/IP/URL) is not in a valid format, when the requested host is not found in our database, or when there is no data available for the domain or IP you’ve requested. Please check your input to ensure the format is valid.
NOTE: CIDR subnets (e.g., 10.10.10.0/24) and pattern searches are not supported.
Sample error:
{
"error": "No data available for 'host.com'"
}
The basic query format is "https://investigate.api.umbrella.com/samples/[domain/ip/url]"
Sample query for a domain with a limit of 100 results, sorted by the threat score:
https://investigate.api.umbrella.com/samples/google.com?limit=100&sortby=score
Sample query for an IP address with a limit of 15 results, sorted by last seen date:
https://investigate.api.umbrella.com/samples/195.22.28.196?limit=15&sortby=last-seen
IMPORTANT NOTE: URL queries must be encoded and must include the protocol! The example below outlines a typical example of that.
Sample query for a URL with an executable included, with a limit of 10 results:
https://investigate.api.umbrella.com/samples/http%3A%2F%2Fwww.hoarafushionline.net%2Fhabeys.exe?limit=10
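The query formats above can also be assembled programmatically. The following is a minimal sketch, assuming the limit, offset, and sortby query parameters documented for this endpoint; the helper name samples_url is hypothetical. It percent-encodes the whole host value, so a URL input keeps its protocol and arrives encoded as required.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the query examples in this guide.
BASE_URL = "https://investigate.api.umbrella.com"
# Sort keys listed for the sortby parameter.
SORT_KEYS = {"first-seen", "last-seen", "score"}


def samples_url(host: str, limit: int = 10, offset: int = 0,
                sortby: str = "score") -> str:
    """Return a /samples/ query URL for a domain, IP, or URL."""
    if sortby not in SORT_KEYS:
        raise ValueError(f"sortby must be one of {sorted(SORT_KEYS)}")
    # safe="" also encodes ':' and '/', which a URL input requires.
    encoded_host = quote(host, safe="")
    params = urlencode({"limit": limit, "offset": offset, "sortby": sortby})
    return f"{BASE_URL}/samples/{encoded_host}?{params}"


print(samples_url("google.com", limit=100))
print(samples_url("http://www.hoarafushionline.net/habeys.exe"))
```

Domains and IP addresses pass through the encoder unchanged, so the same helper can be used for all three input types.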
limit
The maximum number of results to return. Defaults to 10; can be extended.
offset
Defaults to 0. Used to paginate between sets of data if the limit is exceeded.
sortby
Default is score. Choose from ["first-seen", "last-seen", "score"]. To use it, append the parameter to the end of the request, for example "sortby=first-seen". "first-seen" sorts the samples in date descending order from first to last.
"last-seen" sorts the samples in ascending order from last to first.
"score" sorts the samples by the ThreatScore.
Returned value for output if Success 200
We've broken the response down into two sections. All types of query will have the following return values:
query
string
What string was queried or seen by the API.
totalResults
integer
The number of results returned. Same as limit if limit is reached and moreDataAvailable is true.
moreDataAvailable
boolean
If more data is available. Extend the limit and/or offset to view.
limit
integer
Number of sample results.
offset
integer
The offset of the individual entities in the query’s response; used for pagination.
curl --include \
     --header "Authorization: Bearer %YourToken%" \
     "https://investigate.api.umbrella.com/samples/google.com?limit=100&sortby=score"
{
  "query": "google.com",
  "totalResults": 10,
  "moreDataAvailable": true,
  "limit": 10,
  "offset": 0,
In the second portion of the return from the /samples/ endpoint is information about the actual sample. This return here has been shortened for ease of reading.
  "samples": [
    {
      "sha256": "e9d3470c37dada28d5a32fb53a243c5b20def35bb01abf8f5403182cc2b91fdd",
      "sha1": "de182fdcc3c0d473b90a0df0ad14c2074d1e7c50",
      "md5": "282f80e8a2cf9e0e0dd72093787d99c6",
      "magicType": "PE32 executable (GUI) Intel 80386, for MS Windows",
      "threatScore": 100,
      "size": 192512,
      "firstSeen": 1460108539000,
      "lastSeen": 1460108539000,
      "visible": true,
      "avresults": [
        { "signature": "Win.Trojan.Ramnit", "product": "ClamAV" },
        { "signature": "Win.Trojan.Parite", "product": "ClamAV" }
      ]
    }
  ]
}
sha256
hash
The SHA256 checksum of the sample. This checksum is important if you’d like to find out more about this sample in the /sample/ endpoint.
sha1
hash
The SHA1 checksum of the sample. As above, can be searched in /sample/ endpoint.
md5
hash
The MD5 checksum of the sample, as above, can be searched in /sample/ endpoint.
magicType
string
A ‘magic type’ is better understood as a file type. Specifically, it is the output of the Linux “file” utility.
threatScore
integer
The score given to a particular sample based on the analysis performed by Threat Grid. A threatScore is a measure of the amount of system weakening, obfuscation, persistence, modification, data exfiltration, and other behaviors which may be a threat to the host system’s integrity. It is intended as an overall threat indicator that can be used as a guide to the likelihood that a submission is malicious. The Threat Score is not an authoritative classification of good and bad software.
size
integer
The size of the sample in bytes.
firstSeen
integer
The epoch timestamp for when this sample was first seen by Threat Grid.
lastSeen
integer
The epoch timestamp for when this sample was last seen by Threat Grid. The lastSeen and firstSeen will often be the same if the sample is more recent.
visible
boolean
Boolean, either true or false. For internal Umbrella use only, please ignore.
avresults
array
AntiVirus results according to ClamAV. A sample can have more than one signature if it is possibly detected under more than one family of malware. A sample may also have no signatures associated.
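The limit, offset, and moreDataAvailable fields described above can be combined to walk through every sample for a host. The sketch below is one way to do that, not a definitive client: fetch_page is a hypothetical callable that performs the HTTP request and returns the decoded JSON, and advancing the offset by the number of samples received is an assumption about how the offset is counted. A fake fetcher stands in for the API so the example runs offline.

```python
from typing import Callable, Dict, Iterator


def iter_samples(fetch_page: Callable[[int, int], Dict],
                 limit: int = 10) -> Iterator[Dict]:
    """Yield every sample, requesting further pages while more data exists.

    `fetch_page(limit, offset)` is a hypothetical callable returning the
    decoded JSON of one /samples/ response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        samples = page.get("samples", [])
        yield from samples
        # Stop when the API reports no more data (or returns nothing).
        if not page.get("moreDataAvailable") or not samples:
            break
        # Assumption: offset counts individual samples, not pages.
        offset += len(samples)


def fake_fetch_page(limit: int, offset: int) -> Dict:
    """Stand-in for the API: serves five made-up samples in pages."""
    data = [{"sha256": f"hash-{i}", "threatScore": 100 - i} for i in range(5)]
    chunk = data[offset:offset + limit]
    return {
        "totalResults": len(chunk),
        "moreDataAvailable": offset + limit < len(data),
        "limit": limit,
        "offset": offset,
        "samples": chunk,
    }


# Collect the checksums that can later be used with the /sample/ endpoint.
hashes = [sample["sha256"] for sample in iter_samples(fake_fetch_page, limit=2)]
print(hashes)
```

Keeping the HTTP call behind a callable makes the paging logic easy to test and leaves authentication and error handling to the caller.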
The /sample/ API endpoint to research a sample
Once you have gathered the information from the /samples/ endpoint, digging deeper requires that you pivot using the checksums of the samples revealed in your initial query. This pivot will reveal large chunks of new data about the malware being researched.
The /sample/ endpoint returns a variety of data as nested JSON arrays. The initial results array contains the information about the original sample. These results are described first and are, in effect, the samples of the sample.
Underneath, the following results are nested in JSON outlining additional information:
Artifacts
Other samples associated with this sample, but that are not given a threatScore. Note that Artifacts are only available for Threat Grid customers.
Connections
Information about network activity associated with this sample, such as connections to other domains or IPs
Behaviors
Information about specific actions or unique properties of this sample, especially local to your network or the computer the sample is run on.
The basic query format for the /sample/ endpoint is "https://investigate.api.umbrella.com/sample/{hash}"
The hash must be a valid MD5, SHA1 or SHA256.
Parameter for input
limit
integer
default of 10, can be extended for a larger data set.
offset
integer
the offset of the individual entities in the query’s response, used for pagination.
Three errors are typically seen: when a valid hash does not match any known sample, when the hash value is invalid, or when the hash length does not match any known hash type.
Sample errors:
{
"errorMessage": "hash The provided hash's length does not match known hash type"
}
{
"errorMessage": "hash The provided hash does not validate as a valid SHA256 value"
}
{
"error": "Could not find sample for '3ee3cbe0ca92d2470f50712adf60fb03e4ad327fd78e630e004571b89db47cea'"
}
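When checking a list of checksums from another source, two of the errors above can be avoided by validating each hash locally before querying. The following is a rough sketch, assuming hex-encoded digests of the standard lengths (32 characters for MD5, 40 for SHA1, 64 for SHA256); the function name classify_hash is hypothetical, and the server's own validation may differ.

```python
import string

# Hex digest lengths of the three hash types the endpoint accepts.
HASH_LENGTHS = {32: "MD5", 40: "SHA1", 64: "SHA256"}


def classify_hash(value: str) -> str:
    """Return "MD5", "SHA1", or "SHA256", or raise ValueError."""
    value = value.strip()
    hash_type = HASH_LENGTHS.get(len(value))
    if hash_type is None:
        raise ValueError("hash length does not match a known hash type")
    if any(ch not in string.hexdigits for ch in value):
        raise ValueError(f"hash is not a valid {hash_type} value")
    return hash_type


# Checksums taken from the sample responses in this guide.
print(classify_hash("282f80e8a2cf9e0e0dd72093787d99c6"))
print(classify_hash("de182fdcc3c0d473b90a0df0ad14c2074d1e7c50"))
```

A hash that passes this check can still produce the "Could not find sample" error, since that depends on the contents of the database.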
Sample query for a hash with the limit set to 100 and offset of 10:
https://investigate.api.umbrella.com/sample/e9d3470c37dada28d5a32fb53a243c5b20def35bb01abf8f5403182cc2b91fdd?limit=100&offset=10
Returned value for output if Success 200
All types of query will have the following return values. The first section of results will match those from the /samples/ API query earlier but are repeated here:
sha256
hash
The SHA256 checksum of the sample.
sha1
hash
The SHA1 checksum of the sample.
md5
hash
The MD5 checksum of the sample.
magicType
string
A ‘magic type’ is better understood as a file type based on established file format types that allow files to be associated by the operating system and executed or loaded into memory.
threatScore
integer
The score given to a particular sample based on the analysis performed by Threat Grid. A threatScore is a measure of the amount of system weakening, obfuscation, persistence, modification, data exfiltration, and other behaviors which may be a threat to the host system’s integrity. It is intended as an overall threat indicator that can be used as a guide to the likelihood that a submission is malicious. The Threat Score is not an authoritative classification of good and bad software.
size
integer
The size of the sample in bytes.
firstSeen
integer
The epoch timestamp for when this sample was first seen by Threat Grid.
lastSeen
integer
The epoch timestamp for when this sample was last seen by Threat Grid.
visible
boolean
For internal Umbrella use only, please ignore.
avresults
array
AntiVirus results according to ClamAV. A sample can have more than one signature if it is possibly detected under more than one family of malware. A sample may also have no signatures associated.
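The firstSeen and lastSeen values are easier to read once converted to dates. The sketch below assumes the timestamps are milliseconds since the Unix epoch, which is inferred from the 13-digit values in the sample responses rather than stated by the API; the helper name epoch_ms_to_utc is hypothetical.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(ms: int) -> datetime:
    """Convert an epoch timestamp in milliseconds to an aware UTC datetime.

    Assumption: the API reports milliseconds, as the sample values suggest.
    """
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# firstSeen value from the sample response in this guide.
first_seen = epoch_ms_to_utc(1460108539000)
print(first_seen.isoformat())
```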
curl --include \
     --header "Authorization: Bearer %YourToken%" \
     "https://investigate.api.umbrella.com/sample/{hash}?limit=100&offset=10"
{
  "sha256": "e9d3470c37dada28d5a32fb53a243c5b20def35bb01abf8f5403182cc2b91fdd",
  "sha1": "de182fdcc3c0d473b90a0df0ad14c2074d1e7c50",
  "md5": "282f80e8a2cf9e0e0dd72093787d99c6",
  "magicType": "PE32 executable (GUI) Intel 80386, for MS Windows",
  "threatScore": 100,
  "size": 192512,
  "firstSeen": 1460108539000,
  "lastSeen": 1460108539000,
  "visible": true,
  "avresults": [
    { "signature": "Win.Trojan.Ramnit", "product": "ClamAV" },
    { "signature": "Win.Trojan.Parite", "product": "ClamAV" }
  ],
NOTE:
The /artifacts/ section will only display for Threat Grid customers.
The next section of response is a nested JSON blob for artifacts. The /artifacts/ field under the /sample/ endpoint is split into two fields:
- 'artifacts', which holds artifacts without a threatScore
- 'samples', which holds artifacts with a threatScore
To paginate for additional artifacts, you can specify an extended limit if your totalResults were to exceed the limit by using this endpoint:
/sample/{hash}/artifacts/
Where {hash} is the MD5/SHA1/SHA256 of the sample.
Parameter for input for /artifacts/
limit
integer
Default to 10, but can be extended for a larger data set
offset
integer
Used to paginate between sets of data
To paginate for additional samples, you can specify an extended limit if your totalResults were to exceed the limit by using this endpoint:
/sample/{hash}/samples
Where {hash} is MD5/SHA1/SHA256 of the sample.
Parameter for input for /sample/{hash}/samples
limit
integer
Default to 10, but can be extended for a larger data set
offset
integer
Used to paginate between sets of data
Sample query for an artifact with a limit of 100:
https://investigate.api.umbrella.com/sample/de182fdcc3c0d473b90a0df0ad14c2074d1e7c50/artifacts?limit=100
The results have been shortened here for ease of reading. Descriptions match those for /sample/
curl --include \
     --header "Authorization: Bearer %YourToken%" \
     "https://investigate.api.umbrella.com/sample/{hash}/artifacts?limit=100"
"artifacts": {
  "totalResults": 10,
  "moreDataAvailable": true,
  "limit": 100,
  "offset": 0,
  "artifacts": [
    {
      "sha256": "fd6c69c345f1e32924f0a5bb7393e191b393a78d58e2c6413b03ced7482f2320",
      "sha1": "b4fa74a6f4dab3a7ba702b6c8c129f889db32ca6",
      "md5": "ff5e1f27193ce51eec318714ef038bef",
      "magicType": "PE32 executable (GUI) Intel 80386, for MS Windows, UPX compressed",
      "size": 56320,
      "firstSeen": 1460108539000,
      "lastSeen": 1460108539000,
      "visible": false,
      "avresults": []
    }
  ]
},
"samples": {
  "totalResults": 1,
  "moreDataAvailable": false,
  "limit": 2,
  "offset": 0,
  "samples": [
    {
      "sha256": "9e55cc29f8edb91e6d86530986f08528bb20429e8dce0fec0cfb74f189054db2",
      "sha1": "ea020f1adf052edcea33b03318b0ecb99a640448",
      "md5": "ea0008159914b7f20e2f82db77528171",
      "magicType": "",
      "threatScore": 56,
      "size": 16,
      "firstSeen": 1456268452000,
      "lastSeen": 1456268452000,
      "avresults": [],
      "Behaviors": []
    }
  ]
},
/sample/ for connections
The next nested array is “Connections”, which are IP or domain ‘call outs’ associated with this sample. This gives you additional pivot points from within the sample that you can then associate in Investigate to determine if these are related to this malware’s activity or simply a domain or IP that is being used by the malware to determine connectivity (as in the ‘google.com’ result in the example below).
To paginate for additional connections, you can specify an extended limit if your totalResults were to exceed the limit by using this endpoint:
/sample/{hash}/connections/
Where {hash} is the MD5/SHA1/SHA256 of the sample.
Parameter for input for /sample/{hash}/connections
limit
integer
Default to 10, but can be extended for a larger data set
offset
integer
Used to paginate between sets of data
Sample query for connections of a hash with a limit of 100:
https://investigate.api.umbrella.com/sample/de182fdcc3c0d473b90a0df0ad14c2074d1e7c50/connections?limit=100
name
string
The name of the connection, whether that's a domain or an IP address.
firstSeen
integer
The epoch timestamp for when this sample was first seen by Threat Grid.
lastSeen
integer
The epoch timestamp for when this sample was last seen by Threat Grid. The lastSeen and firstSeen will often be the same if the sample is more recent.
securityCategories
array
These categories are the ones that Umbrella assigns to a domain (malware, botnet, etc).
attacks
array
The name of attacks, if any, associated with this connection. An example might be a named botnet such as Kelihos.
type
string
Either HOST, if a domain, or IP if an IP.
ips
array
The list of the actual IP addresses associated with this connection, if any.
urls
array
The list of the domains or URLs associated with this connection, if any.
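The fields above can be used to decide which connections are worth pivoting on in Investigate. The sketch below separates connections that carry a security category or attack name from those that do not, such as a domain used only for a connectivity check. The function name split_connections is hypothetical, and the data mirrors the shape of the sample response for this endpoint.

```python
from typing import Dict, List, Tuple


def split_connections(connections: List[Dict]) -> Tuple[List[str], List[str]]:
    """Return (flagged, unflagged) connection names.

    A connection counts as flagged if it has at least one security
    category or attack associated with it.
    """
    flagged, unflagged = [], []
    for conn in connections:
        if conn.get("securityCategories") or conn.get("attacks"):
            flagged.append(conn["name"])
        else:
            unflagged.append(conn["name"])
    return flagged, unflagged


# Shortened from the sample response for /sample/{hash}/connections.
connections = [
    {"name": "google.com", "securityCategories": [], "attacks": [],
     "type": "HOST", "ips": ["172.217.1.78"]},
    {"name": "rtvwerjyuver.com", "securityCategories": ["Botnet", "Malware"],
     "attacks": [], "type": "HOST", "ips": []},
]

flagged, unflagged = split_connections(connections)
print("flagged:", flagged)
print("unflagged:", unflagged)
```

An unflagged connection is not necessarily safe; it only means that no category or attack is currently associated with that domain or IP.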
curl --include \
     --header "Authorization: Bearer %YourToken%" \
     "https://investigate.api.umbrella.com/sample/{hash}/connections?limit=100"
"connections": {
  "totalResults": 2,
  "moreDataAvailable": true,
  "limit": 2,
  "offset": 0,
  "connections": [
    {
      "name": "google.com",
      "firstSeen": 1456268452000,
      "lastSeen": 1456268452000,
      "securityCategories": [],
      "attacks": [],
      "threatTypes": [],
      "type": "HOST",
      "ips": [ "172.217.1.78" ],
      "urls": [ "http://goo.gl/PDIfV" ]
    },
    {
      "name": "rtvwerjyuver.com",
      "firstSeen": 1456268452000,
      "lastSeen": 1456268452000,
      "securityCategories": [ "Botnet", "Malware" ],
      "attacks": [],
      "threatTypes": [],
      "type": "HOST",
      "ips": [],
      "urls": []
    }
  ]
},
/sample/ for behaviors
The next nested JSON array is ‘behaviors’, which are key traits and behaviors that have been identified as indicators of malicious activity. Individual ‘behaviors’ are broken down within the results from a single checksum. This section will match what you see in the user interface for “behavioral indicators,” and for users of Threat Grid, a list of possible indicators can be found here: https://panacea.threatgrid.com/indicators
All behaviors for a sample are listed, so there should be no need to paginate for additional behaviors. All results are displayed with each query; adding a 'limit' to the request will not result in a syntax error, but it will not be honored.
To isolate to only show behaviors of samples use this endpoint:
/sample/{hash}/behaviors/
Where {hash} is the MD5/SHA1/SHA256 of the sample.
Sample query for the behaviors of a hash:
https://investigate.api.umbrella.com/sample/de182fdcc3c0d473b90a0df0ad14c2074d1e7c50/behaviors/
name
string
The name of the behavior as defined by Threat Grid.
title
string
The formal title of the behavior (human readable) as defined by Threat Grid.
hits
integer
The number of times this behavior was seen in this sample.
confidence
integer
The confidence score indicates the likelihood of a sample exhibiting the stated behavior, ranging from 0 to 100.
severity
integer
The severity score indicates the likely security risk posed by a given behavior, ranging from 0 to 100.
tags
array
Tags associated with this particular behavior; these are provided by ThreatGrid and associated with the type of behavior.
threat
integer
A score measuring the relative threat of this behavior, ranging from 0 to 100. This score is an aggregate of the confidence and severity scores (it is confidence and severity scores multiplied together and divided by 100).
category
array
A list of categories of behaviors, as provided by Threat Grid. The categories include malware, network, and file. The most common category is simply "attribute". For existing Threat Grid customers, a full list can be found in the categories column here: https://panacea.threatgrid.com/indicators
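The relationship between confidence, severity, and threat described above can be checked with a few lines of code. This is a small sketch of the stated formula (confidence multiplied by severity, divided by 100). The use of integer division is an assumption, since how the API rounds results that are not whole numbers is not documented here, and the function name threat_score is hypothetical.

```python
def threat_score(confidence: int, severity: int) -> int:
    """Combine confidence and severity (each 0 to 100) into a threat score."""
    for name, value in (("confidence", confidence), ("severity", severity)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    # Assumption: fractional results are truncated to an integer.
    return confidence * severity // 100


# Values from the sample response for this endpoint.
print(threat_score(30, 30))  # pe-packed-upx -> 9
print(threat_score(60, 5))   # pe-header-timestamp-null -> 3
```

Both results match the "threat" values shown in the sample response for those behaviors.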
curl --include \
     --header "Authorization: Bearer %YourToken%" \
     "https://investigate.api.umbrella.com/sample/{hash}/behaviors/"
{
  "totalResults": 2,
  "moreDataAvailable": true,
  "limit": 2,
  "offset": 0,
  "behaviors": [
    {
      "name": "pe-packed-upx",
      "title": "Executable Packed with UPX",
      "hits": 2,
      "confidence": 30,
      "severity": 30,
      "tags": [ "packer", "crypter", "encoding", "PE" ],
      "threat": 9,
      "category": [ "attribute" ]
    },
    {
      "name": "pe-header-timestamp-null",
      "title": "PE COFF Header Timestamp is Not Set",
      "hits": 2,
      "confidence": 60,
      "severity": 5,
      "tags": [ "file", "attributes", "anomaly", "PE" ],
      "threat": 3,
      "category": [ "attribute" ]
    }
  ]
}