Log analysis has always struck me as one of those topics that gets plenty of superficial attention but not enough attention to detail. That is, we know that we need to do it, but we don’t talk about how to do it. At best, we talk about making sure we collect and archive logs. Analysis plays second fiddle, even though in reality logs without analysis provide almost no value to an organization. And you’ll find the greatest value in discovering the earliest stages of an incident, rather than in using logs after the fact to understand what went wrong. Unfortunately, less than 1% of data breach investigations in the 2011 Verizon DBIR started with log analysis and review!
The analysis ideas I present below don’t even begin to represent a comprehensive view. And of course every network is different, so you will need to think about your specific needs. But this may get you thinking in directions you hadn’t previously considered. Side benefits include analysts becoming more proficient with their tools, pushing the limits and gaps in their toolset, creating baselines of their environment, and even mentoring via shared hunting trips. These could serve as foundations for SIEM use cases, but here we’re talking about active exploratory usage by an analyst.
Hunting trips in DFIR involve actively looking for possible anomalies or indications of compromise on your network. Even if you don’t find anomalies, you’ll get a better understanding of your baselines. In this post, I’ll talk about hunting through your network traffic logs. Richard Bejtlich talks about hunting through systems as well, but I’ll save that particular discussion for another day. Further, if you do this by having a junior analyst “tag along” with a more experienced analyst (e.g. via screen sharing and chatting), you get the regular benefits of good analysis plus team-building and training.
First, and most importantly, always keep in mind that we’re only identifying anomalies, not automatically classifying “bad” traffic. Nothing here can positively and without question find evil with no false positives or false negatives. It should, however, increase your efficiency in finding things that violate your policies or possibly indicate a compromise.
Compromised systems may start sending out traffic that doesn’t look like the rest of your traffic. Perhaps an attacker is trying to exfiltrate data, or a bot may simply try to contact its C&C infrastructure. So look carefully at outbound traffic logs from your perimeter firewalls. Good protocol candidates include SSH, SMTP, and IRC (yes, even now in 2011). In fact, examine all non-HTTP traffic from user subnets with suspicion.
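As a rough starting point, here is a minimal Python sketch of that last idea: tallying non-web outbound connections from user subnets. It assumes you have already parsed your firewall logs into dictionaries; the field names (`src_ip`, `dst_ip`, `dst_port`), the subnet, the port list, and the sample records are all placeholders I made up for illustration, not the output of any particular product.

```python
import ipaddress
from collections import Counter

# Assumptions: adjust these to your own address plan and policy.
USER_SUBNETS = [ipaddress.ip_network("10.10.0.0/16")]
WEB_PORTS = {80, 443}


def from_user_subnet(ip):
    """Return True if the address falls inside one of the user subnets."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in USER_SUBNETS)


def non_web_outbound(records):
    """Count (source, destination port) pairs for non-web traffic from user hosts."""
    counts = Counter()
    for rec in records:
        if from_user_subnet(rec["src_ip"]) and rec["dst_port"] not in WEB_PORTS:
            counts[(rec["src_ip"], rec["dst_port"])] += 1
    return counts


# Made-up sample records standing in for parsed perimeter firewall logs.
records = [
    {"src_ip": "10.10.1.5", "dst_ip": "203.0.113.7", "dst_port": 443},
    {"src_ip": "10.10.1.5", "dst_ip": "198.51.100.9", "dst_port": 6667},
    {"src_ip": "10.20.0.3", "dst_ip": "198.51.100.9", "dst_port": 25},
    {"src_ip": "10.10.2.8", "dst_ip": "192.0.2.44", "dst_port": 22},
]

suspects = non_web_outbound(records)
for (src, port), hits in suspects.most_common():
    print(src, port, hits)
```

Nothing in the output is automatically bad. It is simply a shorter list of hosts and ports to ask questions about.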
Also look for protocol-port mismatches. Do you have HTTP traffic on high ports, or maybe even something like SSH on TCP 80? Attackers often like to overload TCP 80 to slip through loosely secured perimeter networks.
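If your sensors record which application protocol they actually observed on a connection, a mismatch check can be sketched in a few lines of Python. The `app_proto` and `dst_port` field names, the expected-port table, and the sample records below are assumptions for illustration only; your logs may not carry a detected-protocol field at all.

```python
# Assumption: a table of the ports on which you expect each protocol to appear.
EXPECTED_PORTS = {
    "http": {80, 8080},
    "tls": {443},
    "ssh": {22},
    "smtp": {25, 587},
}


def find_mismatches(records):
    """Return records whose detected protocol is running on an unexpected port."""
    mismatches = []
    for rec in records:
        expected = EXPECTED_PORTS.get(rec["app_proto"])
        # Protocols with no entry in the table are skipped, not flagged.
        if expected is not None and rec["dst_port"] not in expected:
            mismatches.append(rec)
    return mismatches


# Made-up sample records with a hypothetical detected-protocol field.
records = [
    {"src_ip": "10.10.1.5", "app_proto": "http", "dst_port": 80},
    {"src_ip": "10.10.1.9", "app_proto": "ssh", "dst_port": 80},
    {"src_ip": "10.10.3.2", "app_proto": "http", "dst_port": 31337},
    {"src_ip": "10.10.3.2", "app_proto": "dns", "dst_port": 53},
]

mismatches = find_mismatches(records)
for rec in mismatches:
    print(rec["src_ip"], rec["app_proto"], rec["dst_port"])
```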
Web traffic has some unique problems. Not only does it involve a constantly changing set of endpoints, protocol evolution means that HTTP isn’t really the top-level protocol in the stack anymore. Development has rapidly left behind simple GETs and PUTs, and things like WebSockets overload ports beyond what you may realize. Still, try to analyze this traffic because so much malicious activity uses this channel.
For outbound surfing, look at your User-Agent strings: lots of spyware browser extensions will show up here. Some malware tries (poorly) to look like regular browsers and you can sometimes find it through misspellings or anomalies like default languages. A good proxy may already flag some of this, but mining the data yourself can uncover new threats. Look at the domains that users hit as well. Check URLs against external APIs but beware. If you get the chance and it fits your network or organization, look at destination geolocation. You may identify suspicious traffic by its destination country – if you sell widgets to farmers in Iowa, then outbound traffic to Eastern Europe or the Asia-Pac region is worth a second look. For both of these areas, applying the principle of Least Frequency of Occurrence can greatly reduce the dataset you actually need to review.
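Least Frequency of Occurrence is easy to try with a few lines of Python: count every value and review only the rare ones. This is a minimal sketch, assuming you have already extracted the User-Agent (or domain, or country) field from your proxy logs into a list; the `max_count` cutoff and the sample strings are made up for illustration.

```python
from collections import Counter


def least_frequent(values, max_count=2):
    """Return (value, count) pairs seen at most max_count times, rarest first."""
    counts = Counter(values)
    rare = [(value, n) for value, n in counts.items() if n <= max_count]
    return sorted(rare, key=lambda item: (item[1], item[0]))


# Made-up sample data: one common browser string and two rare oddities.
user_agents = (
    ["Mozilla/5.0 (Windows NT 6.1) Firefox/5.0"] * 50
    + ["Mozila/4.0 (compatible; MSIE 6.0)"]
    + ["ExampleToolbar/1.2"] * 2
)

for agent, n in least_frequent(user_agents):
    print(n, agent)
```

The same function works unchanged on a list of destination domains or countries. The cutoff is a judgment call: on a large network you might look at the bottom few percent rather than a fixed count.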
Inbound traffic to your web servers should get a close look too, using analysis methods similar to those we discussed for outbound web traffic. In addition, take a close look at your URI query strings to find people attempting SQL injection or other forms of attack (hint: look for really long payloads). You may wish to review user agents here as well, though your mileage may vary if you run a popular web site or one with lots of global exposure. This tends to be particularly effective when analyzing traffic to API servers.
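The "really long payloads" hint can be turned into a quick filter. The Python sketch below assumes you have pulled the request URIs out of your web server access logs into a list; the 200-character threshold and the sample requests are arbitrary choices of mine, so tune them against what your own applications normally generate.

```python
from urllib.parse import urlsplit


def long_queries(uris, min_length=200):
    """Return (query length, URI) pairs for unusually long query strings, longest first."""
    hits = []
    for uri in uris:
        query = urlsplit(uri).query
        if len(query) >= min_length:
            hits.append((len(query), uri))
    return sorted(hits, reverse=True)


# Made-up sample requests: one ordinary search and one padded injection attempt.
injection_attempt = "/item?id=1" + "%20UNION%20SELECT%20NULL" * 10
uris = ["/search?q=widgets", injection_attempt, "/index.html"]

hits = long_queries(uris)
for length, uri in hits:
    print(length, uri[:60])
```

Length alone will miss short attacks and will flag some legitimate requests, so treat the result as a review queue rather than a verdict.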
Consider looking at source geolocation as well, though as before, don’t fall into traps. In some organizations, working with your marketing or web analytics team can help you understand things and clarify your assumptions here.
The effectiveness of this part of the review may vary according to your threat model and overall security posture. For example, if you don’t have a good application security program, or if you have few users on your network, this area will matter more than egress traffic. Conversely, if you have very few exposed services, this may not deserve as much effort.
Create some network flow baselines. You can’t know what’s anomalous until you know what’s normal. A word of caution here, though: don’t assume your baselines are already secure. You might have an existing but previously-unknown compromise. So spend time with your system administrators to identify traffic flows that don’t have an immediately obvious purpose.
What does traffic in and out of your desktop networks look like? These will necessarily differ significantly from your server networks, which need the same sort of attention. What systems usually talk to each other? Do they contact a particular set of authorized external hosts (e.g. for updates and such), especially with a defined frequency? What’s the traffic distribution across various ports? Does this vary with time of day, or day of the week?
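One way to start answering the port and time-of-day questions is to bucket flows by hour and destination port, then check later traffic against that picture. The following Python sketch assumes flow records with an ISO 8601 `start` timestamp and a `dst_port` field; both names and all of the sample flows are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


def port_distribution_by_hour(flows):
    """Build {hour of day: Counter of destination ports} from flow records."""
    baseline = defaultdict(Counter)
    for flow in flows:
        hour = datetime.fromisoformat(flow["start"]).hour
        baseline[hour][flow["dst_port"]] += 1
    return baseline


def unseen_in_baseline(baseline, flows):
    """Return sorted (hour, port) pairs that never appeared in the baseline."""
    unseen = set()
    for flow in flows:
        hour = datetime.fromisoformat(flow["start"]).hour
        if flow["dst_port"] not in baseline.get(hour, {}):
            unseen.add((hour, flow["dst_port"]))
    return sorted(unseen)


# Made-up sample flows: a baseline period and a later day to compare against it.
history = [
    {"start": "2011-07-11T09:15:00", "dst_port": 443},
    {"start": "2011-07-11T09:40:00", "dst_port": 443},
    {"start": "2011-07-11T09:45:00", "dst_port": 53},
]
today = [
    {"start": "2011-07-12T09:05:00", "dst_port": 443},
    {"start": "2011-07-12T03:12:00", "dst_port": 22},
]

baseline = port_distribution_by_hour(history)
unseen = unseen_in_baseline(baseline, today)
print(unseen)
```

A real baseline needs far more than one day of history, and anything it contains still deserves the scrutiny described above before you treat it as known good.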
You’ll start to build a framework of known good traffic to exclude from future analyses. As the US military teaches, the more you sweat in preparation, the less you may bleed in battle.
Log management matters, but log analysis matters more. Even if you have a relatively limited dataset available, start with what you have. Like tugging on the proverbial sweater thread, you will find that a little effort at the beginning can quickly unravel more than you initially might have guessed.
In the future, I’ll talk about hunting trips through your systems and other types of security data. But at any time, I welcome your thoughts and suggestions!