Thread by @stvemillertime, A #dailyyara thread on collection of binaries by non-malicious (but threat dense) [...]

A #dailyyara thread on collection of binaries by non-malicious (but threat dense) equities: ELF SOCKS5 edition.

I'm an advocate for finding malware and intrusion sets based on "rare equities": toolmarks that make a file a bit more likely to be malware than not.

For example, I've mentioned that files with equities such as "pcap" or PDB paths are threat dense and can help you collect or detect lots of boutique and high-end malware, even though these toolmarks are not inherently malicious.

https://twitter.com/stvemillertime/status/1167554970020962304 https://twitter.com/stvemillertime/status/1265689130987651072

When we identify equities that are interesting, we can attempt to collect and measure the number of both malware and non-malware files to identify and label the prevalence of a given thing. We often do this with rule logic such as Yara.

You might think of a rule as a "signature," and signatures have a bad rap for being simple, brittle and dumb. I think that's a bit reductive.

I think that signatures can be complex, elegant, clever and extraordinarily useful.

Signatures *are* code. Signatures *are* automation.

What _is_ a signature, anyhow? In my opinion, it is composed of an intent, a data source, and a set of conditional logic, expressed within the structure and syntax of a specific rule technology. Sigma, OpenIOC, Snort, Yara, Drools, ClamAV. Sigs may manifest in many forms.

Sigs are automations! Analysts, researchers and engineers can create signatures to *store* anomalies and analytical conclusions. They represent an automation because they can be used by automated systems to smash against data in motion and at rest. Sigs take human ideas to scale.

Sigs (or signals) have multiple functions when matching against data. A weak signal can _collect_ events/objects/data that might be otherwise uncaptured. A stronger signal could build a haystack of anomalous activity for hunting. These stacks come in diff sizes and purposes.

What makes an ELF file more likely to be malware? And how do we express this in a signature? There are a thousand ways, but let's start with "rare equities." I think that there will not be many ELFs with "equities" for SOCKS5 capabilities.

Rare Equity: ELF with SOCKS5

An "equity" can be present in an ELF in many forms. Perhaps it is a string toolmark, perhaps a byte sequence, or a function symbol. But you can smash any of those into a #dailyyara rule and begin to survey your data for prevalence and threat density.

Here's a simple #dailyyara rule that will match on any ELF file with the string "socks5".

https://gist.github.com/stvemillertime/9938dec3893f33dec56ea78fc1b97dcb
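
For readers without a Yara setup handy, the same idea can be sketched in a few lines of Python. This is not the rule from the gist, just an approximation of the logic described above: check for the ELF magic bytes at offset 0, then look for the string "socks5". The case-insensitive match and the function names are my assumptions.

```python
ELF_MAGIC = b"\x7fELF"  # the first four bytes of every ELF file


def is_elf(data: bytes) -> bool:
    """True when the buffer starts with the ELF magic number."""
    return data[:4] == ELF_MAGIC


def has_socks5_equity(data: bytes) -> bool:
    """True for an ELF that contains the ASCII string "socks5" in any letter case."""
    # Assumption: ASCII match only; wide (UTF-16) strings are not considered.
    return is_elf(data) and b"socks5" in data.lower()


if __name__ == "__main__":
    import sys

    # Print every path given on the command line that carries the equity.
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            if has_socks5_equity(handle.read()):
                print(path)
```

Reading whole files into memory is fine for a quick survey; a real scanner would stream, and Yara does all of this far more efficiently.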

This might seem dumb at first, but let's look at the measurements. Just how much malware has this equity? And which actor developers did that?

In a quick survey of ELFs with SOCKS5 equity, I found about 2500 files, of which I knew that over 250 were already labelled malware in our threat intel graph.
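
As a rough worked example of "threat density," here is the arithmetic on those survey numbers. The function name is hypothetical, and the result is a floor rather than a true rate: the count is "over 250," and unlabelled files are not necessarily benign.

```python
def threat_density(matched: int, labelled_malware: int) -> float:
    """Fraction of matched files that are already labelled as malware."""
    if matched <= 0:
        raise ValueError("matched must be a positive count")
    if not 0 <= labelled_malware <= matched:
        raise ValueError("labelled_malware must be between 0 and matched")
    return labelled_malware / matched


# Survey numbers from the thread: about 2500 matches, over 250 known malware.
density = threat_density(matched=2500, labelled_malware=250)
print(f"at least {density:.0%} of matches are known malware")
```

So roughly one in ten matches was already known to be bad, before anyone looked at the remaining files.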

ELFs with SOCKS5 equity, by malware family, include but are not limited to:

mindgames
sageenvoy
vpnfilter
pinking
partybus
coldcall
sshspy
goblinbrute
shellgame
systembc
highnoon
mindgames.ssh
redflare.gorat
earthworm
godlua
merlin
sevenminus
gobrute
barfly
redsonja
sogu

& more

ELFs with SOCKS5 equity, by threat actor/group/cluster, include:

fin1
fin6
apt41
unc1100
unc1454
unc1196
unc1633
unc669
unc1857
unc1778
unc1853
unc2506
unc2507
unc609
unc1934
unc946
unc304
unc2237
unc944
unc828
unc2097
unc2330
unc1844
unc2338
unc832
unc961
unc1882
unc1559

& more

This rare equity for ELF with SOCKS5 is associated with many espionage and financial actors, from multiple geos, over several years, across many affected sectors. The developers know that SOCKS5 is not malicious, but they are probably not aware that its presence is notable/trackable.

When we find equities that we believe are rare enough to be useful in this way, we develop several rules of varying fidelities to help make the matching haystacks more useful. One will be basic, one will add some filesize filters, one will match only ELF executables, and one only ELF shared objects.
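
To make the tiers concrete, here is a minimal Python sketch of how one file could land in several haystacks at once. The tier names and the 5 MB cutoff are hypothetical, not taken from our actual rules. The ELF header fields are standard: `e_type` is a 16-bit value at offset 16, with 2 meaning executable (`ET_EXEC`) and 3 meaning shared object (`ET_DYN`). One caveat: position-independent executables also report `ET_DYN`, so the split is approximate.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ET_EXEC = 2  # e_type: executable file
ET_DYN = 3   # e_type: shared object (also used by position-independent executables)
MAX_SIZE = 5 * 1024 * 1024  # hypothetical filesize cutoff for the filtered tier


def elf_type(data: bytes):
    """Return the e_type header field, or None if the buffer is not a usable ELF."""
    if len(data) < 18 or data[:4] != ELF_MAGIC:
        return None
    # EI_DATA (offset 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    if data[5] == 1:
        order = "<"
    elif data[5] == 2:
        order = ">"
    else:
        return None
    # e_type sits at offset 16 in both 32-bit and 64-bit ELF headers.
    return struct.unpack_from(order + "H", data, 16)[0]


def matching_tiers(data: bytes) -> list:
    """Names of the hypothetical rule tiers this buffer would match."""
    kind = elf_type(data)
    if kind is None or b"socks5" not in data.lower():
        return []
    tiers = ["basic"]
    if len(data) < MAX_SIZE:
        tiers.append("filesize_filtered")
    if kind == ET_EXEC:
        tiers.append("elf_executable")
    elif kind == ET_DYN:
        tiers.append("elf_shared_object")
    return tiers
```

The broad tier is for collection; the narrower ones produce smaller stacks that are easier to hunt through.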

We love it when vendors call their next-gen products "engines" that are secretly just yarnballs of sigs and scripts, so we call our rulepack looking for rare equities "AscensionEngine." We group sigs w/ similar intents together into an *Engine for easier monitoring/maintenance.

We've got a lot of *Engines, looking for a lot of things. The sigs within these are hedges to capture data, label and provide context, build haystacks, and create alerts. They allow us to chase evil based on _how_ intrusions happen, in a way that transcends a single malware or threat actor.

If @bryceabdo is chasing an UNC with a new piece of ELF malware with SOCKS5 capability, he can immediately pivot to a huge haystack of similar data in both our threat intel graph and in our products.

If @BarryV returns to his hallowed quest of stomping out FIN7, he has a plethora of ConventionEngine and ExportEngine rules that commonly describe FIN7 development work, and his own CertEngine rules for X509 anomalies, that could help chase down the group across all data we have.

I firmly believe in studying raw data from intrusions and expressing what I learn in formats that last across old and new data alike. I think formats like Yara are some of the best ways to express threat understanding in a way that allows us to collect/label data at scale.

AscensionEngine #dailyyara looking for rare equities in ELFs and PEs of "pcap" https://gist.github.com/stvemillertime/200295215ef2270323508a2a683554e2

RareEquities_Pcap.yar

ExportEngine #dailyyara rule looking for anomalies or odd toolmarks in export DLL string name https://gist.github.com/stvemillertime/6abaab1146c9b71e486c24113cd47304

ExportEngine_xArch.yar

XOREngine #dailyyara rule looking for http:// URI string xored in a PE https://gist.github.com/stvemillertime/dcaa5435f70cd6e7db0d945db62994da

XOREngine_HTTP.yar
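
The gist holds the actual rule. As a rough illustration of the idea only, here is a Python sketch that brute-forces every single-byte XOR key against "http://" and reports where the encoded string shows up in a PE. It is similar in spirit to what Yara's `xor` string modifier does. The function names are hypothetical, and I am assuming a single-byte key, which is only one of many ways a developer might hide a URL.

```python
import struct

PLAINTEXT = b"http://"


def is_pe(data: bytes) -> bool:
    """True when the buffer has an MZ header that points at a PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_xored_http(data: bytes) -> list:
    """Return (key, offset) pairs where "http://" appears under a one-byte XOR key."""
    hits = []
    for key in range(1, 256):  # key 0 would just be the plaintext string
        needle = xor_bytes(PLAINTEXT, key)
        offset = data.find(needle)
        while offset != -1:
            hits.append((key, offset))
            offset = data.find(needle, offset + 1)
    return hits


def has_xored_http(data: bytes) -> bool:
    """True for a PE that contains at least one XOR-encoded "http://"."""
    return is_pe(data) and bool(find_xored_http(data))
```

Key 0 is skipped on purpose: plaintext URLs are everywhere in benign software, while an encoded one is the rarer toolmark worth collecting.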

AscensionEngine #dailyyara looking for the rare equity of KCP transport library in a PE https://gist.github.com/stvemillertime/11d431d244a67c31f4b50e648dd0d9f8

Methodology_AscensionEngine_KCP_Strings.yar

If you made it this far, thanks for coming to my coffee rant! Keep calm and write Yara rules.
