Richard Stiennon recently wrote a post titled “If you feel you need big data for security, you are doing something wrong.” It’s worth reading, and I agree with his recommendation to adopt threat intelligence data and techniques. But his key thesis lacks support, largely because he analyzes technologies in isolation and criticizes each one for not providing a complete security package by itself.
First, after reviewing the (very real) issues with managing intrusion detection systems (IDS), he states that he “declared IDS dead as a functioning security solution.” A lot of practitioners, including this one, would disagree strongly. While an IDS can certainly fire on far too many signatures and produce far too many alerts, good management practice winnows these down fairly quickly. For example, an alert for every teardrop attack in 2012 is useless cruft. No MSSP or clueful analyst ever needs to investigate millions or billions of alerts a day. Instead, you apply good analysis techniques and find the ones worth your time. I’ll grant that network security monitoring, including full packet capture, provides far more value, but even then the security infrastructure should include a properly managed IDS as one component.
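As a rough illustration of that winnowing, here is a minimal sketch that drops alerts for signatures irrelevant to the environment and collapses the remainder into counts per signature and target. The `Alert` fields, the `winnow` function, and the suppression list are all hypothetical and do not reflect any particular IDS product's format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical, simplified IDS alert record."""
    signature: str
    src_ip: str
    dst_ip: str


# Assumption: signatures that cannot affect any system in this environment.
SUPPRESSED_SIGNATURES = frozenset({"teardrop"})


def winnow(alerts, suppressed=SUPPRESSED_SIGNATURES):
    """Drop suppressed signatures, then count alerts per (signature, target)."""
    kept = (a for a in alerts if a.signature not in suppressed)
    return Counter((a.signature, a.dst_ip) for a in kept)
```

Calling `winnow(alerts).most_common(10)` would then hand an analyst the ten busiest signature and target pairs instead of the raw alert stream.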
He goes from there to a closely connected idea: using a security information & event management system (SIEM) to manage and correlate these data with other logs and asset models (vulnerability data, for example). Stiennon criticizes this technology because “these solutions addressed the data overload issue but did little to address security. They failed to curtail the rise of targeted attacks that are now wreaking havoc upon businesses and critical infrastructure operators.” This argument has two fundamental problems. First, SIEMs provide analytic and detective capabilities, not preventive ones. While an organization’s processes can take their output and feed it back into preventive controls (e.g. firewalls), a SIEM’s primary purpose is to help you understand what is happening in your environment. Second, SIEMs provide critical abilities to investigate the very attacks he cites. Log analysis has not itself detected nearly enough attacks for a number of reasons (in my view, largely due to poorly trained analysts). However, without a SIEM, having to gather logs from literally hundreds of different sources after a compromise has come to light via some other method (e.g. third-party notification) creates a huge roadblock in the investigation. If you assume that you are already compromised and will be again, then you need a SIEM to close the loop as quickly as possible.
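To make the correlation idea concrete, the following sketch raises the priority of an alert when the targeted host is recorded as vulnerable to the issue the alert's signature covers. The field names, the `prioritize` function, and the shape of the vulnerability map are assumptions made for illustration; a real SIEM has its own schema and rule language.

```python
def prioritize(alerts, vulnerabilities):
    """Tag each alert as high or low priority and list high-priority ones first.

    alerts: list of dicts with hypothetical keys 'dst_ip' and 'cve'.
    vulnerabilities: dict mapping a host IP to the set of CVE IDs that a
    vulnerability scan reported for it (the asset model).
    """
    tagged = []
    for alert in alerts:
        exposed = alert["cve"] in vulnerabilities.get(alert["dst_ip"], set())
        tagged.append({**alert, "priority": "high" if exposed else "low"})
    # sorted() is stable, so alerts keep their original order within each group.
    return sorted(tagged, key=lambda a: a["priority"] != "high")
```

The design choice here is to rank rather than discard: a low-priority alert is still retained for later investigation, which matches the detective role described above.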
Stiennon finishes up his criticisms by attacking the idea of Big Data. Applying these techniques in infosec represents an evolution in SIEM usage, as opposed to a revolution. If you already collect full packet captures, all system logs, and event logs from every device in your network (including the IDS), then you’ve rapidly entered the world of Big Data:
“Data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures.”
We still have a lot of work to do in order to understand how to analyze this data set, but we improve significantly every day due to the hard work of a lot of very smart people applying themselves to the problem.
So when Stiennon recommends security intelligence, he’s right to do so. I completely agree that organizations should bring in threat data on attack sources and understand their adversaries in greater detail. But you need to do something with that intelligence. In large part, that means correlating it with your existing data to find the attacks in your environment so that you can contain, investigate, and eradicate the intrusion. Of course, you should never assume that threat intelligence is the final piece of the puzzle, either, because that will lead to the same problems as the ones Stiennon identifies with IDS, SIEM, and even Big Data tech.
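One simple form of that correlation is matching threat-intelligence indicators against source addresses already in your logs. The sketch below assumes a hypothetical log event with a `src_ip` field and indicators expressed as IP addresses or CIDR ranges; the `correlate` function name is also an assumption, and real feeds carry many other indicator types.

```python
import ipaddress


def correlate(log_events, indicators):
    """Return the log events whose source IP matches a threat-intel indicator.

    log_events: list of dicts with a hypothetical 'src_ip' key.
    indicators: iterable of IP addresses or CIDR ranges, given as strings.
    """
    networks = [ipaddress.ip_network(indicator) for indicator in indicators]
    hits = []
    for event in log_events:
        address = ipaddress.ip_address(event["src_ip"])
        if any(address in network for network in networks):
            hits.append(event)
    return hits
```

The matching events are only a starting point for containment and investigation; they identify where to look in the data you already collect.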
Security intelligence matters, and I’m personally committed to expanding our abilities to gather, analyze, and react to these data. But it works in concert with more fundamental systems; it doesn’t replace them.