There are several ways to pursue this investigation, but I want to focus on the idea of prevalence analysis here. Analysts use prevalence analysis to identify what proportion of a population shares one or more characteristics.
There are a few use cases for prevalence analysis: determining the disposition of an event, assessing impact, or sequencing events on a timeline. At this early stage, we’re concerned primarily with the disposition since we don’t yet know what it is.
If every system of a certain type or role is performing the same POST, then there may be a benign explanation related to that type or role. If this is the only host making the POST, we can’t lean so far in that direction on our dispositional scale.
Dispositional prevalence analysis means we’re looking for how many other places an event or characteristic manifests. For example, has that POST been sent by any other hosts on the network? Have any other hosts on the network communicated with that external IP?
The general maxim is that rare things are more likely to be malicious. If a whole segment of hosts does the same thing, it’s probably benign behavior. Of course, that’s not always true, and it becomes less true as compromise advances, because an attacker who has spread to many hosts makes malicious activity look common.
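To make that concrete, here’s a minimal sketch of a network prevalence check in Python. It assumes you can export proxy or HTTP logs as simple records. The hostnames, IPs, field order, and the `network_prevalence` helper are all hypothetical and only there for illustration.

```python
# Hypothetical proxy log records: (source host, HTTP method, destination IP, URI path).
# Every value here is invented for illustration.
proxy_logs = [
    ("ws-014", "POST", "203.0.113.50", "/submit"),
    ("ws-014", "POST", "203.0.113.50", "/submit"),
    ("ws-022", "GET", "198.51.100.7", "/index.html"),
    ("ws-031", "GET", "198.51.100.7", "/index.html"),
    ("srv-002", "GET", "198.51.100.7", "/status"),
]


def network_prevalence(records, predicate):
    """Return (matching hosts, fraction of all observed hosts that match)."""
    all_hosts = {r[0] for r in records}
    matching = {r[0] for r in records if predicate(r)}
    fraction = len(matching) / len(all_hosts) if all_hosts else 0.0
    return matching, fraction


# Which hosts sent a POST to the external IP under investigation?
hosts, fraction = network_prevalence(
    proxy_logs,
    lambda r: r[1] == "POST" and r[2] == "203.0.113.50",
)
print(sorted(hosts), fraction)
```

In this made-up data, only one of the four observed hosts sent the POST, so the behavior is rare on this network. That doesn’t prove anything malicious, but it keeps us from leaning toward the benign end of the scale.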
With prevalence analysis, we can focus the breadth of our search on events on the host itself (local prevalence), our network (network prevalence), or the whole world (global prevalence). Where we focus usually depends on the artifact type.
Consider how we might use these forms of prevalence...
- Local: Do fragments of the potentially encrypted string appear elsewhere on disk?
- Network: How about elsewhere in HTTP traffic on our network?
- Global: How about on other networks (internet or threat intel search)?
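Here’s a rough sketch of the fragment search behind those questions, again in Python. It assumes you’ve already collected text to search (file contents for local, HTTP bodies for network, intel results for global) into a dictionary. The `fragments` and `sources_containing` helpers, the sample string, and the file names are hypothetical.

```python
def fragments(value, size=8):
    """Return every substring of `value` that is `size` characters long."""
    if len(value) <= size:
        return {value}
    return {value[i:i + size] for i in range(len(value) - size + 1)}


def sources_containing(value, corpus, size=8):
    """Return the names of corpus entries that contain any fragment of `value`."""
    frags = fragments(value, size)
    return sorted(
        name for name, text in corpus.items()
        if any(f in text for f in frags)
    )


# Invented stand-in for the suspicious string seen in the POST.
suspect = "Zm9vYmFyYmF6cXV4"

# Invented stand-in for content gathered from the host's disk (local scope).
local_files = {
    "config.dat": "key=Zm9vYmFyYmF6 other",
    "notes.txt": "nothing to see here",
}

matches = sources_containing(suspect, local_files)
print(matches)
```

The same function works at any scope; only the corpus changes. The fragment size is a judgment call: shorter fragments catch more partial matches but also produce more coincidental hits.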
This investigation scenario is rich in opportunities for prevalence searches to assess disposition. The great thing about this analysis is that it also inherently helps you assess impact if you do confirm malicious activity.
My response of the week goes to @ido_gat on Twitter: https://x.com/ido_gat/status/1772652020639433165?s=20. They mentioned some great strategies like identifying the source process, reputation research on the destination, and prevalence!
Speaking of prevalence, what are some of the most common artifacts you would expect to perform prevalence analysis on? Are the mechanisms to do that readily available in your environment? That’s something to think about… 🚀 #InvPath #DFIR #SOC