The infosec community this week has buzzed with news of an embarrassing non-incident at an Illinois water plant. You can get the whole story at Threat Level, but the gist goes something like this: a water treatment plant experienced a failure in one of its pumps. The staff initially treated it as another common mechanical failure, but at some point someone saw logs that indicated a remote login from a Russian IP address and decided that one event caused the other. Subsequent investigation by a government terrorism and intelligence fusion center revealed that the two events had no relation and we can all flush our toilets relatively free of anxiety. The story has much more detail, all of which will cause moments of extreme facepalm.
Others have addressed the ongoing questions around SCADA security and vendor FUD. But I want to discuss some common failures in threat analysis and incident response that this particular case has highlighted.
Someone at some point jumped to the conclusion that a security incident had occurred. Before I even read the article, the phrase “out of an abundance of caution” seemed like the cliché of the moment, and sure enough, a water district trustee had used those precise words. Apparently, the sight of a .ru address led to fantasies of a “digital Red Dawn” scenario and thus escalation to the intelligence and counter-terrorism community. But by any measure, equipment failures occur many orders of magnitude more often than SCADA intrusions. As doctors know, when you hear hooves, look for a horse, not a zebra.
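To put a rough number on that base-rate point, here is a small sketch. Every probability in it is an invented assumption for illustration only, not a figure from this incident or any real dataset; the idea is simply that even a strong-looking clue should not overturn a prior that favors mundane failure by orders of magnitude.

```python
# Purely illustrative base-rate sketch: these probabilities are invented
# assumptions, not figures from the incident or any real dataset.

p_failure = 1e-2      # chance a given pump suffers an ordinary mechanical failure in a month
p_intrusion = 1e-6    # chance a SCADA intrusion breaks a pump at a given plant in a month

# Suppose a suspicious-looking foreign login is then observed.
p_login_given_intrusion = 0.9   # intrusions usually leave a remote-login trail
p_login_given_failure = 0.01    # but contractors also log in remotely from odd places

# Posterior odds (intrusion vs. ordinary failure) after seeing the login:
prior_odds = p_intrusion / p_failure
likelihood_ratio = p_login_given_intrusion / p_login_given_failure
posterior_odds = prior_odds * likelihood_ratio

print(f"Prior odds of intrusion vs. failure:    {prior_odds:.2e}")
print(f"Posterior odds after the foreign login: {posterior_odds:.2e}")
# Even with a 90x likelihood ratio, the ordinary failure remains roughly
# a hundred times more probable under these assumed numbers.
```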
Additionally, no one validated the login with the user. In this case, a contractor had logged in from an unusual location, and investigating that could make sense for a small water treatment plant in the Midwestern United States. But an anomalous event may have a perfectly good explanation; a quick phone call or email to the user would have straightened this out.
No other corroborating evidence appears to have existed. If an attacker had logged in with stolen credentials five months earlier and somehow caused one pump to fail, additional artifacts should have existed: perhaps some exploratory probing inside the network, other logins, or (one might think) greater damage. In other words: if an investigator has a hypothesis to explain one data point, he should seek other data that could confirm it, and the absence of that data should cause him to re-examine the hypothesis. In particular, a five-month gap between the login and the failure should alone have given pause; critical thinking is the core of good troubleshooting.
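To make that concrete, here is a minimal sketch of the two checks an analyst could run before escalating: does the timeline support the hypothesis, and does any corroborating evidence sit between the two events? Everything in it (the timestamps, log fields, and event types) is hypothetical and for illustration only; it is not the plant's actual data or tooling.

```python
from datetime import datetime

# Hypothetical, simplified log records -- illustrative only.
events = [
    {"time": datetime(2011, 6, 10, 14, 3), "type": "remote_login",
     "source": "203.0.113.7", "user": "contractor01"},
    {"time": datetime(2011, 11, 8, 9, 41), "type": "pump_failure",
     "asset": "pump-3"},
]

login = next(e for e in events if e["type"] == "remote_login")
failure = next(e for e in events if e["type"] == "pump_failure")

# Check 1: does the timeline even make sense? A gap of roughly five months
# should immediately weaken the "login caused the failure" hypothesis.
gap_days = (failure["time"] - login["time"]).days
print(f"Gap between login and failure: {gap_days} days")

# Check 2: look for corroborating artifacts between the two events --
# other logins, internal probing, configuration changes, and so on.
corroborating = [
    e for e in events
    if login["time"] < e["time"] < failure["time"]
    and e["type"] not in ("remote_login", "pump_failure")
]
print(f"Corroborating events found: {len(corroborating)}")

# An implausible timeline and no corroborating data: re-examine the
# hypothesis before escalating.
```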
I have no particular knowledge of this specific incident, so while I will talk about possible root causes of the analysis failure, I can only base that discussion on my experience in similar situations across a number of organizations.
- Undertrained analysts: I don’t mean the tech who originally noted the address, although as noted above, he should have thought about the timeline. But the analysts at the fusion center clearly lacked the training, judgment, and experience to handle even this simple scenario.
- Poor validation workflow: Once the center received the report and, I assume, a first-level analyst looked at it, more senior analysts should have validated the finding. Either those senior analysts never saw the analysis before it went out to the public, or they are even less qualified for their roles than the front-line analysts.
- Institutional culture: In many such centers (whether private or public sector), the culture rewards analysts who find “things”. After all, if the center receives enough data, many organizations will assume the data must contain lots of evil. This can happen, for example, when management does not construct its metrics with care. But human nature also plays a part: finding evil is fun and sexy; finding banality is not.
I’d draw two core lessons here. First, train analysts in analysis, not just technology. Many organizations spend huge amounts on systems and, often, data feeds, trusting that “smart people” who understand the tech will know how to analyze what comes out of them. Certainly, analysts must have domain-specific technical qualifications, but the mental toolbox matters every bit as much. As an example, one of the best texts ever written for analysts of any stripe is Turning Numbers into Knowledge by Dr. Jonathan G. Koomey. The book doesn’t focus on statistics or any particular methodology; instead, it discusses the mindset of an analyst and the sorts of thinking required to do the job well. Organizations need to invest in this kind of training so that analysts can sift through data and find the useful information.
Second, government doesn’t have special powers to find and understand threats. That problem has clearly existed for a long time, but it bears repeating at a time when the drive to suck up ever greater amounts of data for mining and analysis threatens to infringe on core civil liberties. Seeking out evil, whether online or on the streets, doesn’t (necessarily) mean collecting lots of data. It means getting relevant data and knowing what to do with it. And clearly, we still have a long way to go.