Tag Archives: OpenIOC

Thoughts on STIX

Carrot Bomb

Threats should be properly understood.

At Black Hat, I saw a presentation by Sean Barnum from MITRE on Structured Threat Information eXpression (STIX). This builds on their work with Cyber Observable eXpression (CybOX), another standard that more or less exists at the same level in the “stack” as OpenIOC. I don’t think I can say much right now about the differences and possible advantages of CybOX or OpenIOC over the other, mostly because I don’t think I understand them quite to that level of detail.

But STIX tries to organize that IOC / observable information and give it some context. For example, using STIX/CybOX, you could collect together a set of observables, link them together into an indicator, and then associate that indicator with known TTPs. That TTP would possibly link to other indicators to help with confirmation and attribution to a threat actor or campaign.
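
To make that linking concrete, here is a minimal sketch in Python. The class names, field names, and sample values are all made up for illustration and do not follow the real STIX or CybOX schemas; the point is only the shape of the relationships (observables roll up into an indicator, and indicators point at TTPs).

```python
from dataclasses import dataclass, field


@dataclass
class Observable:
    kind: str   # hypothetical type label, e.g. "ipv4" or "file-md5"
    value: str


@dataclass
class TTP:
    name: str


@dataclass
class Indicator:
    title: str
    observables: list = field(default_factory=list)
    ttps: list = field(default_factory=list)


def indicators_for(ttp, indicators):
    """Return every indicator that has been associated with the given TTP."""
    return [ind for ind in indicators if ttp in ind.ttps]


# Illustrative values only (192.0.2.0/24 is a documentation address range).
ssh_bruteforce = TTP("SSH password guessing")
scanner = Indicator(
    "Inbound SSH scanner",
    observables=[Observable("ipv4", "192.0.2.10")],
    ttps=[ssh_bruteforce],
)
dropper = Indicator(
    "Dropper on disk",
    observables=[Observable("file-name", "dropper.exe")],
)

related = indicators_for(ssh_bruteforce, [scanner, dropper])
```

Starting from a TTP and walking back to every indicator that references it is the sort of pivot that would help with confirmation and attribution.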

I’d like at some point soon to build up a “link database”, something like a triplestore (graph database) that represents all its data in a Subject-Predicate-Object form.

$hash BELONGSTO $malware
$twitterhandle BELONGSTO $person
$ipaddress ISA $sshscanner

(This probably doesn’t look anything like the eventual ontology, and my syntax may or may not fit the RDF standard, but it should illustrate the idea.)

As we build up these links, now we start to have a database of objects (possibly using RDF or similar) that link together in various ways, allowing you to perform appropriate analyses to identify clusters of knowledge. This alternate structure probably isn’t usable for STIX per se, but that doesn’t mean a STIX document couldn’t be an object. For example, if we have a STIX document explaining the details of a piece of malware, you could represent that as

$malware ISDESCRIBEDBY $stixdocument

And then once we get to that $malware node in your graph DB, now we know where to go for the full low-down. This would work the same if we have a document (dossier) describing a particular person.
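
As a rough sketch of the idea, assuming nothing fancier than an in-memory set of tuples, a toy triplestore in Python might look like the following. The `TripleStore` class, the identifier prefixes, and the sample values are all hypothetical; a real system would more likely sit on an RDF library or a graph database.

```python
from collections import defaultdict


class TripleStore:
    """Toy Subject-Predicate-Object store with wildcard matching."""

    def __init__(self):
        self._triples = set()
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        self._triples.add(triple)
        self._by_subject[subject].add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        if subject is not None:
            candidates = self._by_subject.get(subject, set())
        else:
            candidates = self._triples
        return sorted(
            t for t in candidates
            if (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("hash:abc123", "BELONGSTO", "malware:examplebot")
store.add("ip:192.0.2.10", "ISA", "sshscanner")
store.add("malware:examplebot", "ISDESCRIBEDBY", "stix:examplebot-report")

# Once we land on the malware node, ask where the full write-up lives.
docs = [o for _, _, o in store.match(subject="malware:examplebot",
                                     predicate="ISDESCRIBEDBY")]
```

The wildcard `match` is what makes the structure useful: the same call answers "what do we know about this hash?" and "which nodes are SSH scanners?".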

To visualize it, think of something like a Maltego or Casefile graph, only much larger than you can reasonably manipulate with that tool. Which is not a criticism; I love Maltego and use it almost every day, but it is a tool for a particular set of use cases. Ideally, you could export a subgraph to a Maltego file, or of course the inverse: use Maltego to do a bunch of research on something, then import all that data into your DB for archival and possible linking to other nuggets of information.
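
Pulling out a subgraph small enough to hand to another tool could be as simple as a bounded breadth-first walk over the triples. This sketch only selects the triples; it makes no attempt at writing an actual Maltego file, and the function name, the `max_hops` parameter, and the sample data are assumptions for illustration.

```python
from collections import deque


def subgraph(triples, start, max_hops=2):
    """Collect triples reachable from `start` within `max_hops` links,
    following links in either direction."""
    neighbors = {}
    for s, p, o in triples:
        neighbors.setdefault(s, []).append((s, p, o, o))
        neighbors.setdefault(o, []).append((s, p, o, s))

    seen = {start}
    result = set()
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for s, p, o, other in neighbors.get(node, []):
            result.add((s, p, o))
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return result


triples = [
    ("hash:abc123", "BELONGSTO", "malware:examplebot"),
    ("malware:examplebot", "ISDESCRIBEDBY", "stix:examplebot-report"),
    ("handle:@example", "BELONGSTO", "person:jdoe"),
]
nearby = subgraph(triples, "hash:abc123", max_hops=2)
```

Here `nearby` holds the two triples connected to the hash and leaves out the unrelated Twitter handle. Importing in the other direction would just be a loop of inserts into the store.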

I’ve done some very exploratory work on this, mostly just prototype code related to working with Maltego transforms. Of course my involvement with CIF also has provided some great lessons of all sorts. But I think that, as a question of knowledge management, we can do better than just tracking very basic data (like we do with CIF) or spending a lot of time on enumeration and detailed XSDs. Both of those matter a great deal, and I work on them frequently in my day job, but they just scratch the surface of what we can do with modern data and knowledge management techniques.

Sharing threat intelligence

Sharing threat intelligence in another context

I really enjoyed reading the recent article on sharing threat intelligence by Conrad Constantine of Alienvault. He clearly has spent a lot of time thinking about this issue and working with it, and he has some really fascinating perspectives. While I discuss the article below, you should read it first for context (and because it has additional thoughts not explored here).

As he notes, this controversy has (at least) two sides, and I lean toward the first. We should continue to explore how to share threat intelligence, though we have a lot to figure out. As a side note: OSVDB, by definition, does not contain threat intel but vulnerability intel. The other two sites listed (Malware Domain List and Shadowserver) do provide good examples. Organizations will always have to “construct their own processes and technology to consume it effectively”, particularly processes. As an example, most practitioners would agree that current-generation antivirus / antimalware has a terrible intelligence process, driving us to look to entirely different technologies.

Intelligence sharing within an industry vertical works okay. My experience primarily centers around the Financial Services ISAC, which has some fairly basic but useful methods for data sharing. Even then, many organizations often find themselves mired in fear of their own staff. And not for nothing, but classification affects more than just the defense industrial base. We live in an environment in which all critical industries find themselves semi-nationalized, and thus national loyalty matters even for corporate and economic matters. I’ll leave discussion of whether that makes sense out of this post, but the fact remains that industries far removed from “national security” use this process as well.

“The progress we haven’t made in security data sharing isn’t because of limitations in technology or legal implications (both of which can be overcome with little effort). People don’t want to share because of those old faithful standbys still gnawing at the human mind: fear and greed. Fear of how whatever we share may be used against us, greed for anything we can get for free, or better yet, monetize the transaction.” (emphasis original)

The article also seems to imply some problems with commercializing this segment. In my view, vendors may charge, but they (should) also provide important components like validation & data cleaning so that those consuming the intelligence can trust it and put it directly into use. He prefers that people rely on ‘enlightened self interest’, but this looks like an extension of the classic prisoner’s dilemma (diner’s dilemma). We should look at existing thought surrounding concepts like the tragedy of the commons to find possible solutions here. Clearly, in any workable arrangement, you must receive more than you give – but I suggest that you must also give in order to receive, or the whole arrangement will fall apart. Economics and game theory have lots of history and application that can help us here.

Some data doesn’t require too much anonymization. In general, higher-level threat intelligence presents less operational risk than lower-level data. Statistical incident summaries or even data using tools like the VERIS Framework will rarely if ever endanger organizational security. This becomes more true when aggregated by a trusted third party or held until investigators have completed the confidential portions of their work. We can delve into greater detail with tools like IODEF or even OpenIOC, but these may necessitate more scrubbing in some cases to protect against disclosing internal confidential data.

The real issue comes when an ongoing investigation becomes compromised due to sharing this intelligence. The article gives the example of an attacker who uses a host only to attack a specific target and then realizes he’s been “made” once that host appears in a threat intel feed. Similar issues have arisen in the past with malware samples uploaded to AV vendors or even VirusTotal. This directly drives the concerns many organizations have with sharing information. In my view, the risk depends on the type of attacker, or at least his sophistication level, and whether you suspect a targeted attack.

Perhaps it would help organizations to determine in advance how far out to share certain types of data (and at what point in an investigation). If you’ve established good rapport with your counterparts in organizations much like yours (e.g. competitors), then you can start by doing that fairly early in an investigation. Once you’ve disclosed an incident, you can disseminate more widely. Automated sharing should only occur in the case of automatically-detected attacks (e.g. “IP addresses attempting web application exploits caught by WAF”). While OPSEC considerations still apply here, keep in mind that you should still receive more useful intel than you share. And don’t forget that attackers frequently share target data, including in underground markets.

We need data, partly to create better models and partly to help speed up our own OODA loops. The status quo only helps attackers, and the ostrich approach will not do anything to secure our networks.

For background on the image, see Badger Guns on Wikipedia and the reference links, or search for “Badger Guns Milwaukee racism”.

MIR training class

"School" by Jim Potter

Last week, I took the MIR class from Mandiant. Primarily consisting of product training (as expected and desired), this turned out to be one of the better vendor classes I’ve taken in my career. While I’ve used MIR for close to six months now (and its free predecessor for considerably longer), I still got plenty out of it.

The class runs four full days and starts off with the expected topics like installation, deployment, using the admittedly difficult UI, and related tasks. From there, we delved into responding to simulated intrusions. While I learned a few investigative tips, in general this mostly highlighted the platform’s strengths. The class also briefly covered counter-forensics and malware analysis, but at a very high level[1]. The art of writing IOCs and sweeping your enterprise took an entire day and included lots of detail and practice.

I appreciated the instructors’ background: professional IR types with good teaching skills rather than career trainers who pretend to know something about what we really do every day. Slide reading just didn’t occur, and the hands-on exercises take up at least half of the class.

More than anything else, I liked the collection of students in the class. We had about eight “outside students” and four to six Mandiant employees on any given day. But unlike some classes that never engage during the “lecture” portions and go their own way during breaks and lunch, we had lots of great back-and-forth during class, informative lunches, and I like to think that I made several solid professional connections that week.

A few things could improve, some of which have more to do with the product than the course. The room felt a little cramped, for example, and we probably could have used even more time dedicated to searching, filtering, and writing IOCs.

In general, I found the class really valuable and will send more of our staff to the class in 2012. Mandiant doesn’t like it when we talk about when they might offer the class again, so keep an eye on their Twitter feed and web site if this seems like something you could use.

1: I have taken the Black Hat edition of their malware analysis crash course and it’s worthwhile for responders who need to understand the basics and have some background.

Overview of incident and threat reporting standards

"..." by Pom²I’ve spent a lot of time looking into standards for sharing information about incidents as well as detailed threat data lately. As it turns out (and as one would expect), lots of smart people have built some useful tools for sharing this information. So I thought I’d talk a little about what I’ve found and how various standards can work together in a stack.

Lately, the new OpenIOC standard has gotten some discussion. This is an XML schema that one can use to describe specific threat signatures: MD5 hashes, mutexes, registry keys, and the like. If an organization wants to share information categorizing a particular piece of malware, say, or other ways to identify a system that has been compromised by a particular threat, then IOC does that well. It’s the sort of thing that ThreatExpert could use to provide signatures for the malware it analyzes, or an investigator could use to describe artifacts left by a particular attack. I don’t know of other standards that hit this particular pain point, though I’d love for someone to point them out to me.

Now some of us have asked how this compares to IODEF, an IETF standard that describes an entire incident. CIRTs could exchange IODEF information about a particular attack: attacker identities, targeted assets, vulnerabilities and exploits, impact on the affected assets, contact information, etc. In fact, I believe that IOC could fit into IODEF to describe the indicators that can characterize a particular incident, but IODEF includes much more. To use a networking analogy, IOC is to IODEF as HTTP is to TCP. Or to take a law-enforcement approach, IODEF represents the police report for an incident and IOC represents the fingerprints found on the scene.

Some readers will also know VERIS, an information-sharing framework originally developed by Verizon. Unlike the other two standards, however, VERIS tries to organize the data into high-level metrics: demographics of the victim (e.g. organization type, industry, staff size), A4 incident classification (agent, action, asset, attribute), and that sort of thing. This doesn’t yield actionable intelligence, but it does help us analyze trends in the overall threat landscape. To carry on the previous analogies, VERIS corresponds more to traffic flow statistics or to the FBI Uniform Crime Reports.

All of these standards, and others like them, have a role to play in helping defenders share useful information and collaborate appropriately. In a future post, I’ll talk about some relevant tools that use these standards.

Threat intel sharing with OpenIOC

Indicator of Compromise by Kool-Aid Man

Mandiant recently announced OpenIOC, “an extensible XML schema that enables you to describe the technical characteristics that identify a known threat, an attacker’s methodology, or other evidence of compromise.” For example, you might have an IOC listing something as simple as a set of MD5 hashes and file names, or as complex as descriptors of the structure of a particular executable (PE file). The schema includes terms for network indicators as well, like URIs, IP addresses, and strings in network traffic.
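
To give a feel for what a tool consuming such a file has to do, here is a small Python sketch. The XML below is a simplified, hypothetical document: its element names are loosely modeled on OpenIOC, but it omits the namespace and other details and has not been validated against the real schema, and the hash and file name are placeholders.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical IOC-style document (not schema-valid OpenIOC).
SAMPLE_IOC = """\
<ioc>
  <short_description>Example dropper (hypothetical)</short_description>
  <definition>
    <Indicator operator="OR">
      <IndicatorItem condition="is">
        <Context document="FileItem" search="FileItem/Md5sum"/>
        <Content type="md5">0123456789abcdef0123456789abcdef</Content>
      </IndicatorItem>
      <IndicatorItem condition="contains">
        <Context document="FileItem" search="FileItem/FileName"/>
        <Content type="string">dropper.exe</Content>
      </IndicatorItem>
    </Indicator>
  </definition>
</ioc>
"""


def indicator_items(xml_text):
    """Return (search term, condition, value) for each IndicatorItem."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("IndicatorItem"):
        context = item.find("Context")
        content = item.find("Content")
        items.append((context.get("search"), item.get("condition"), content.text))
    return items


items = indicator_items(SAMPLE_IOC)
```

Each tuple in `items` is something a scanner could test against a host: "MD5 is this value" or "file name contains this string".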

Those of us who react to threats every day already know we need to get better at sharing threat intel and acting on it quickly. A number of industry and other organizations exist that help get these data out to folks who can use them, but often the intel comes in human-written form. This means that systems can’t parse the data easily, and in fact the communication sometimes carries significant ambiguity. When systems and tools can’t parse the data, not only does that introduce delays into the detection process, it also makes validation difficult. So sometimes we get notified of malware with the MD5 sum “d41d8cd98f00b204e9800998ecf8427e” (the hash of the empty, zero-length string), or of “http://google.com?webhp&hl=en”. Both of these have happened to me in the last few months, and while that’s simple human error, allowing tools to do some basic sanity checks would help with this.
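
The kind of basic sanity check I have in mind could look something like this sketch. The function names and the tiny benign-host list are assumptions for illustration; a real deployment would want a much better list and more checks.

```python
import hashlib
import re
from urllib.parse import urlsplit

EMPTY_MD5 = hashlib.md5(b"").hexdigest()   # MD5 of zero bytes of input
MD5_RE = re.compile(r"[0-9a-fA-F]{32}")
# Hypothetical list of hosts that should never arrive as indicators.
BENIGN_HOSTS = {"google.com", "www.google.com"}


def check_md5(value):
    """Return a list of problems found with an MD5 indicator."""
    problems = []
    if not MD5_RE.fullmatch(value):
        problems.append("not a 32-character hex string")
    elif value.lower() == EMPTY_MD5:
        problems.append("MD5 of empty input")
    return problems


def check_url(value):
    """Return a list of problems found with a URL indicator."""
    problems = []
    host = (urlsplit(value).hostname or "").lower()
    if not host:
        problems.append("no host")
    elif host in BENIGN_HOSTS:
        problems.append("host is on the benign list")
    return problems


md5_problems = check_md5("d41d8cd98f00b204e9800998ecf8427e")
url_problems = check_url("http://google.com?webhp&hl=en")
```

Both of the examples above get flagged before they ever reach a blocklist, which is the whole point of having the data in a machine-readable form.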

This, of course, shows up the weakness with OpenIOC: a classic chicken and egg problem. The XML files don’t serve much purpose until tools can read them, but at the moment the only tools that can read them come from Mandiant: their commercial enterprise product MIR and the no-cost IOC Finder. (Note that, while OpenIOC is released under the Apache 2 license and therefore qualifies as ‘free software’, the same does not hold true for IOC Finder.)

For OpenIOC to work well, we need more tools and responders to support it. That could start with truly free tools like Splunk, Sleuth Kit, and Snort, but I’d like to see large commercial tools like ArcSight, EnCase, and Sourcefire incorporate it as well. This applies as much to producing IOCs as it does to consuming them, by the way: if FireEye’s malware detection and analysis tools could export an IOC, detection across the network would become much more straightforward. But Mandiant, as much as I love many of the people who work there, has sort of a NIH problem: they like to blaze new trails and do cool new stuff, but working with other vendors has always seemed to stymie them as far as I can tell. Hopefully Doug Wilson, the new point man on OpenIOC, can turn that around.

OpenIOC can solve a key problem, but we will see whether anybody actually uses it to do so.