Introduction to the Collective Intelligence Framework

CIRTs and related organizations often handle incident detection as well as response. Both of these roles produce and consume threat intelligence in different ways. For example, we often want to correlate our network traffic with OSINT indicators (known bad IP addresses and URLs, MD5 hashes of suspicious files, etc.). I’ve started looking at the Collective Intelligence Framework as a way to fulfill these needs. CIF development is sponsored by the REN-ISAC and National Science Foundation, with most of the coding (and everything else!) handled by Wes Young. Everything is open source for those of us who like – or need – to hack directly on the code.

In this article, I’ll explain CIF, give some usage examples, and discuss test deployment scenarios.

Understanding CIF

From the perspective of a user, CIF allows you to run queries against many data sources at once. If you have other private data sources available, particularly via XML (RSS), JSON, or in a file (e.g. CSV), you can incorporate those, as well as additional OSINT sources. CIF also comes preconfigured for a number of public sources.

Use cases include manually querying the database for specific indicators (e.g. “do we have any records for this IP address?”) as well as pulling feeds of various sorts for use by security systems (e.g. “what URLs should we block at the proxy?”). CIF includes concepts of severity and confidence as well as privilege. This allows you to provide feeds of high-confidence public data to some systems while still allowing investigators to query private, unconfirmed data.
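To make the severity, confidence, and privilege idea concrete, here is a minimal sketch of how a feed could be filtered. It does not use CIF's actual data model: the `Record` fields, the three-level severity scale, and the `build_feed` helper are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical record shape for illustration only; real CIF records differ.
@dataclass
class Record:
    indicator: str
    confidence: int   # assumed scale: 0-100
    severity: str     # assumed values: "low", "medium", "high"
    restriction: str  # assumed values: "public" or "private"

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}

def build_feed(records, min_confidence, min_severity, allow_private=False):
    """Return indicators that meet the thresholds and the caller's privilege."""
    floor = SEVERITY_RANK[min_severity]
    return [
        r.indicator
        for r in records
        if r.confidence >= min_confidence
        and SEVERITY_RANK[r.severity] >= floor
        and (allow_private or r.restriction == "public")
    ]

records = [
    Record("192.0.2.10", 85, "high", "public"),
    Record("198.51.100.7", 40, "medium", "private"),
    Record("203.0.113.5", 95, "low", "public"),
]

# A blocking system only gets high-confidence public data...
proxy_feed = build_feed(records, 75, "medium")
# ...while an investigator can see everything, including private records.
investigator_view = build_feed(records, 0, "low", allow_private=True)
```

The same record set serves both consumers; only the thresholds and the privilege flag change.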

Essentially, CIF ingests data – typically on an hourly or daily basis, depending on the source – indexes it on the fly for performance reasons, performs correlation analytics (e.g. so that a URL also turns into domain and IP address information), and then makes it available in feeds via various output plugins. These plugins include tables and HTML for viewing by a user, but also iptables rules, Snort rules, JSON, and CSV for processing by other security systems.
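As a rough illustration of the correlation step, the sketch below derives a domain or IP indicator from a URL using only the Python standard library. This is not CIF's implementation; the `derive_indicators` name and the tuple output format are assumptions, and it skips DNS resolution entirely.

```python
import ipaddress
from urllib.parse import urlsplit

def derive_indicators(url):
    """Return the URL plus the domain or IP address embedded in it."""
    indicators = [("url", url)]
    host = urlsplit(url).hostname  # lowercased host, or None if no netloc
    if host is None:
        return indicators
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat the host as a domain name.
        indicators.append(("domain", host))
    else:
        indicators.append(("ip", host))
    return indicators

by_name = derive_indicators("http://malware.example.com/bad.exe")
by_ip = derive_indicators("http://192.0.2.1:8080/x")
```

A fuller version would also resolve the domain to its current addresses, which is where the extra IP address information would come from.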

Usage examples

Everything below comes from the Perl client. I haven’t yet dealt with the Python client, much less hacked on it, but that’s coming Soon™.

cif -q infrastructure/malware -c 50 -s medium


gives a fairly large list of IP addresses associated with malware. (I used medium severity and 50% confidence in these examples.)

Even if you don’t use a proxy server, you might find CIF useful for checking suspicious URLs:

cif -q url -c 50 -s medium -p snort


You now have a list of Snort rules to pull into your IDS.

Or if you have your own list of IP addresses to check, such as when an ongoing case has new indicators, you can put them in a file and query each of them.

for f in $(cat hostlist.txt); do cif -q "$f" >> specific-ip.txt; done


This yields another list. You might see a few lines in that example with a “private” restriction and impact as “search”. This happens because, by default, CIF will log every query for a specific indicator. A number of searches, such as from other investigators, may have significance apart from any data. However, if you don’t want CIF to log a query, just use the “-n” parameter.
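Since every logged query shows up for other investigators, it can help to clean a host list before looping over it. The following is a minimal sketch, assuming one IP address per line with optional blank lines and `#` comments; the `clean_host_list` helper is hypothetical and not part of CIF.

```python
import ipaddress

def clean_host_list(lines):
    """Drop blanks, comments, invalid entries, and duplicates, keeping order."""
    seen = set()
    cleaned = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        try:
            ip = str(ipaddress.ip_address(entry))
        except ValueError:
            continue  # skip anything that is not a valid IPv4/IPv6 address
        if ip not in seen:
            seen.add(ip)
            cleaned.append(ip)
    return cleaned

sample = ["192.0.2.1\n", "", "# case notes", "192.0.2.1", "not-an-ip", "2001:db8::1"]
hosts = clean_host_list(sample)
```

Writing the cleaned list back to a file keeps duplicate and malformed entries out of the query log.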

If you’d like to play with it some more, contact me for an API key and the address of my semi-public CIF server. Twitter or email both work fine.

Appendix: CIF on the Amazon cloud

Amazon Web Services provide a decent platform for testing CIF or running a public instance like mine. The following assumes some familiarity with Linux administration and at least a basic understanding of the Elastic Compute Cloud (EC2).

You can start with a small instance for the installation, but you’ll quickly want to move to a medium instance at least. I run a large instance using the Ubuntu Cloud Guest server image. In general, follow the server install instructions for CIF. You’ll also want to note the specifics for Ubuntu as they contain a few workarounds you will need. Allocate an Elastic IP and register it in DNS someplace, such as with Amazon Route 53. For the Security Group, only add HTTPS and SSH. You won’t need anything else, and I recommend leaving it at this minimal state for security purposes. You’ll also need an Elastic Block Store. While you can start with 10GB, expect that to grow a few GB per week, so you’ll need to resize from time to time or create a larger volume at the beginning. While not required for CIF installation, I can’t recommend enough that you use git to manage config files. Srsly.
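For volume sizing, a quick back-of-the-envelope calculation goes a long way. The sketch below assumes linear growth; the 3 GB per week figure and the 4 GB starting usage are illustrative assumptions standing in for "a few GB per week", not measurements from my server.

```python
def weeks_until_full(volume_gb, used_gb, growth_gb_per_week):
    """Estimate how many weeks remain before the volume fills up."""
    if growth_gb_per_week <= 0:
        raise ValueError("growth_gb_per_week must be positive")
    return max(0.0, (volume_gb - used_gb) / growth_gb_per_week)

def volume_needed(used_gb, growth_gb_per_week, weeks):
    """Estimate the volume size needed to last the given number of weeks."""
    return used_gb + growth_gb_per_week * weeks

# Assumed numbers: 10 GB volume, 4 GB already used, 3 GB of growth per week.
remaining = weeks_until_full(10, 4, 3)
one_year = volume_needed(4, 3, 52)
```

Under those assumptions a 10GB volume lasts only a couple of weeks, which is why creating a larger volume up front saves some resizing.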

When installing Postgres, note that “peer” may appear in the original file (pg_hba.conf) instead of “ident sameuser”. Also, I did not use the values in the CIF doc, as Postgres didn’t like them. I left everything at the defaults except:

work_mem = 512MB
checkpoint_segments = 32


When setting up BIND9, first check /etc/resolv.conf for the IP addresses you should use as forwarders.

NAISG DFW talk: Evolution of an IRT

Last Tuesday, I gave a talk at the DFW chapter of NAISG on “Evolution of an IRT”. Apparently I disappointed the organizers, as my talk didn’t actually have anything to do with Ice Road Truckers.

Caught in a fleeting "hands-in-my-pockets" moment by Joseph Sokoly

Note that I presented how I would build an IRT now, not necessarily how I did it last time. I’d do some things the same, but over the last 2.5 years I’ve learned a lot that would change how I’d do it in the future.

While the slides are available, they don’t really work outside of the context of a live presentation: mostly funny Internet pictures to illustrate a point and keep the audience slightly entertained. The outline will make much more sense, I hope. Really, I work from this first, and then riff on it based on what seems to get a reaction and elicit questions, which I happily accept throughout the talk. I don’t think we have a recording, but perhaps I’ll get someone to record a future version of the talk or even do a web-focused one.

DFIR Learning Curve

The CIRT gets a call from a concerned sysadmin who sees some ssh connections from an Eastern European country to a DMZ web server. As the investigation kicks off and the CIRT staff starts asking questions, they want to get up to speed as quickly as possible on any background they don’t already have. What does the server do? Does the sysadmin or someone else have historical logs? What network controls already exist? From there, they’ll start piecing together a timeline, finding anomalies, and generally trying to get as complete an accounting of the incident as possible.

At its core, incident response is a learning process: the responders need to learn as much as possible, starting with “known unknowns” and right into the “unknown unknowns”. And in successive incidents, the team will want to speed up that process. I put together a (naïve) diagram showing what we should attempt to achieve over time:

“Steep learning curves” really are ideal in many situations, including this one. We want to climb that line as quickly as possible. Once we pass a threshold, we can begin to contain and eradicate the intrusion. This also helps us provide the appropriate information to the organization’s leadership for the larger questions of response and future changes. But when we take longer than anticipated, or even fall behind an evolving incident – remember, the enemy has a vote – then the gap between the curves starts to incur additional costs to the organization.

Curve 2 deliberately shows a slower start than Curve 1. As we start the process of improving our tool set, workflow, and controls, a few initial stumbles will occur. Maybe you didn’t fully account for some of the deployment complexity, or perhaps the incident occurs in an area of the organization that has minimal instrumentation and management. (In fact, this latter scenario occurs with great frequency for obvious reasons.) But over time, we keep pushing that curve left, getting faster with each iteration. As we do that, we can reduce the impact to the organization, perhaps even moving further back in the kill chain.

I really like this model, but it needs evolution. What’s missing from it?

Analysis of DNI annual Worldwide Threat Assessment

The US Director of National Intelligence, James Clapper, provided his annual Worldwide Threat Assessment to the Senate yesterday (followed by a classified session with, we can surmise, greater detail).

The unclassified portion discusses cybersecurity several times. In fact, the introduction states:

Counterterrorism, counterproliferation, cybersecurity, and counterintelligence are at the immediate forefront of our security concerns.

Notwithstanding the idea that we should consider cybersecurity as a domain and not only a specific activity, I found it useful to see where the policymakers within the US intelligence community see specific concerns. The entire document runs about thirty pages, but over two-thirds of it addresses specific region-by-region and country-by-country concerns. Two pages cover cyber threats and counterintelligence, which for our purposes cover largely similar ground.

The assessment correctly notes that “neither the public nor private sector has been successful at fully implementing best practices.” I’d go a step further, because best practices evolve on both the attack and defense fronts. We don’t even fully implement standard practices: the things we know how to do efficiently and relatively easily. Standard practices, in my mind, constitute a reasonable bar to clear: if practitioners in a given area generally all accept some technology or process as “the way it’s done”, then we shouldn’t excuse anyone doing less than that.

Interestingly, the document first singles out China and Russia as state actors, but then refers to the 2011 NCIX report to specifically blame “entities within these countries”. This means that, although the DNI does not provide specific reasons for attribution in the unclassified report, he does claim that the entities have state sponsorship. The NCIX only said on page 5 of his report that the intelligence community has “not been able to attribute many of these private sector data breaches to a state sponsor.”

The DNI report also notes that governments cannot keep up with tech development and illustrates this by “failed efforts at censoring social media” in the Arab Spring. This should provide an object lesson to US policymakers, though the recent controversies over SOPA, PIPA, and now ACTA indicate that they might not have fully connected the dots.

As a community, we’ve talked for years about addressing the vulnerability problems (including across the entire supply chain), but the DNI also talks about threat in the context of problems regarding warning, detection, and attribution. He recommends greater “US Government engagement” with the private sector. This presents other challenges, though, because we have concerns about transparency versus legitimate secrecy needs (just for starters).

In the section on counterintelligence, the report also links cybersecurity to foreign intelligence service activity. I physically laughed out loud at the assessment that “many intrusions into US networks are not being detected”: understatement of the year. The report here adds Iran to the list of countries undertaking cybersecurity operations against the US. The private sector infosec community, outside of the defense industrial base and Stuxnet, hasn’t really paid much attention to Iran. That could change in 2012, particularly if geopolitical tensions continue to increase there.

I didn’t expect any specific data in this document, given its purpose and classification level. But it could point the way to at least some of the areas that could involve many of us in the next few years, and it certainly is useful in validating the idea that we need to improve our abilities in sharing threat intelligence and incident detection & response.

Two Things: SIEM and DFIR edition

Thanks to Hacker News, I ran across the charming and thought-provoking concept of Two Things:

“You know, the Two Things. For every subject, there are really only two things you really need to know. Everything else is the application of those two things, or just not important.”

You also might think of these things as first principles, though these might represent something even more basic. After spending some time thinking about it, I came up with the following. Feel free to add your own or point out what I’ve missed.

Two things for DFIR:

1. The bad guys always leave evidence behind.
2. You aren’t looking for it in time.

Two things for SIEM:

1. Log analysis matters more than log management.
2. SIEM analysts eventually become DBAs. (Bejtlich’s Principle)

I don’t know whether anybody else has called it that before, but I sure wish I could find the canonical reference for Bejtlich’s Principle.

We can define an analyst as a function taking data and caffeine as inputs that outputs (hopefully useful) knowledge:

$\mathrm{analyst}(\mathrm{data}, \mathrm{caffeine}) \to \mathrm{knowledge}$

But analysts need more than just good data and properly brewed coffee (or tea, if that’s your thing). We need well-written “internal code”: our thought processes, if you will. As I’ve previously mentioned, too much material focuses on the data and not enough on the processing. If you look for information on log management, you can find endless advice on how to collect your logs, and how to store them. If you look for information on SIEM systems, you can find lots of vendor “marketecture”, compliance guidance, and so forth – but not enough guidance on what to do with the information you find there.

To find what we really need, two things have to happen. First, we need to look outside the IT security echo chamber. Simply repeating the same endless mantras won’t advance the state of the art at all, but looking at other fields with related problems and finding ways to cross-pollinate certainly can bear fruit. In my view, the intelligence community has spent decades working through similar issues. One really useful reference I’ve found lately is Psychology of Intelligence Analysis (which largely discusses “Tools for Thinking” and “Cognitive Biases”). But another document, Basic Counterintelligence Analysis in a Nutshell, has much better applicability to DFIR. Some things work directly, like the section on “Analytic Traps and Mindsets”, others have simply gone out of date, and other concepts have useful analogues. For example, map analysis usually doesn’t reveal very much if invoked in a geographic context (since network links and physical proximity don’t correlate very well), but when you overlay your data on a network map, it certainly can.

So in February, I intend to take the “Basic Counterintelligence Analysis in a Nutshell” document and adapt the ideas in it to network security investigations in particular. But to do this justice takes more than a simple post, so instead of posting that here as originally intended, I’ll spend some time on it and get feedback when it’s ready. This post mostly serves the purpose of getting it out there so that my colleagues, friends, and readers can hold me accountable next month.

Chroming up the facts: SIEM and IR presentation

Chroming it up doesn't actually make it go faster

I recently had the opportunity to watch the Trends in SIEM and Incident Response presentation from Narayan Makaram with HP (ArcSight), Anthony Di Bello with Guidance (EnCase), and Andrew Hay with The 451 Group. The topic addressed the specific nexus of my professional interests: log analysis and correlation for detecting and responding to incidents. While I’ve followed Hay on Twitter for a long time, I also have worked with both of the sponsoring products for years.

Trends

The presentation identifies several primary organizational trends:

• trying to close the gap between compromise, detection, and response
• taking a proactive approach
• emphasis on lessons learned through increased visibility
• response automation key to address relentless threats

(I suppose “relentless” is the new “persistent”.)

Hay did a great job addressing issues, largely based on the 2011 Verizon DBIR. Less than 1% of organizations detect data breaches through log analysis, a number which frankly frightens me. We spend millions of dollars on log management for compliance, and then we don’t use the logs properly. Given how often logs shed light on an incident in hindsight (69%, according to the same study), we know that they contain the proper data and indications. At best, we just don’t know how to make sense of them, and at worst, we don’t even look. (Guess which I believe happens more often.)

On a similar note, around 28% of surveyed organizations use threat intelligence right now. This looks like a massive opportunity to me: sharing data, understanding indicators and how to use them appropriately, and generally climbing the incident response learning curve faster. Threat intel providers and analysts have a huge field of untapped potential awaiting them – so, as Hay says, we need to be less Paul Blart – Mall Cop and more Tom Cruise – Minority Report.

Di Bello (with Guidance Software) made some important points related to speed of response. He uses a traditional IR timeline, where a call to a help desk leads over several days to a low-level analyst going onsite for data gathering before eventually a senior analyst looks at the data and performs manual forensic analysis. We can’t stick with this model: automated data gathering based on solid alerting and event analysis can speed this up. It’s a great model for the future, and many organizations have started trying to lead the way in this trend. He discusses several example use cases, like suspicious network traffic or DLP alerts.

Inconsistent data

Unfortunately, I found the quality of the rest of the presentation highly variable. Given their audience, they should take care to confirm the consistency of their data and ensure that their conclusions follow appropriately from the evidence presented. I understand the need for marketing in order for the sponsors to get value from the event, but puffery shouldn’t override the value for the listeners. That disappointed me, as I also use ArcSight heavily in my day-to-day operational analyses and like the product. I also use EnCase Enterprise, though less frequently and with much less satisfaction.

I just present two examples here, but they illustrate the issue that persisted through the entire presentation. This really detracted from the overall value, and I hope that future iterations will focus on the great value of this approach. The message matters and I would like to see it handled well.

For example, the HP speaker had a slide titled “Cybercrime Keeps Growing”. Among other well-publicized security breaches, he listed Google: “Accounts affected: Unknown” and “12.5 billion market cap lost”. This statistic makes me cry, and not for the intended reason. First, which data breach? The most public one that occurs to me would be the Aurora incident, and while that got a lot of press due to the details and geopolitical implications, I don’t believe they lost substantial investor confidence due to that. Second, given the economy of the last few years, attributing any market capitalization loss to this one incident ignores lots of other factors. And third, over what time period did this loss supposedly occur?

All the other listed incidents list specific costs, either financial or relating to a “processing license” revocation. With a bit of time spent on Google (ironically), I can’t find any support for that statement other than ArcSight presentations. And their mention of RBS WorldPay doesn’t seem to note that the PCI Council recertified them not long after. Also: I can’t imagine anybody who would take time out of their day for a presentation on this topic who doesn’t understand the overall risk. These sorts of slides have no value in presentations to this type of audience.

Time to respond also got some discussion, and here the Guidance representative exaggerated wildly. He claimed that EnCase Enterprise can get data from a system to confirm a compromise in seconds. In response to an audience question on this, he repeated the point. I don’t believe that this is the case except for large values of “seconds” (e.g. an hour is 3600 seconds, but that doesn’t seem to have been his intent). Even gathering metadata from memory, not to mention data on persistence mechanisms and core OS files, causes enough of a performance hit that it takes time. By itself, that’s not a knock on EnCase, but on the presentation here. That doesn’t even take into account the licensing limitations with EnCase Enterprise that greatly reduce the number of hosts from which the system can gather data simultaneously, typically in multiples of five.

These examples illustrate the feeling I had throughout, at least after Hay’s segment: not only did it consist almost entirely of sales pitches, they didn’t even really consider the type of audience who would attend. That said, I’d welcome any corrections to my statements above. Nothing convinces an analyst like data, after all.

(Disclosure: I work for Heartland Payment Systems, also mentioned in the presentation. As always, my opinions here are my own and don’t necessarily reflect those of my employer. And I will re-emphasize that I have received no compensation or other inducements for my opinions on the products mentioned in this post.)

MIR training class

Last week, I took the MIR class from Mandiant. Primarily consisting of product training (as expected and desired), this turned out to be one of the better vendor classes I’ve taken in my career. While I’ve used MIR for close to six months now (and its free predecessor for considerably longer), I still got plenty out of it.

The class runs four full days and starts off with the expected topics like installation, deployment, using the admittedly difficult UI, and related tasks. From there, we delved into responding to simulated intrusions. While I learned a few investigative tips, in general this mostly highlighted the platform’s strengths. The class also briefly covered counter-forensics and malware analysis, but at a very high level[1]. The art of writing IOCs and sweeping your enterprise took an entire day and included lots of detail and practice.

I appreciated the instructors’ background: professional IR types with good teaching skills rather than career trainers who pretend to know something about what we really do every day. Slide reading just didn’t occur, and the hands-on exercises take up at least half of the class.

More than anything else, I liked the collection of students in the class. We had about eight “outside students” and four to six Mandiant employees on any given day. But unlike some classes that never engage during the “lecture” portions and go their own way during breaks and lunch, we had lots of great back-and-forth during class, informative lunches, and I like to think that I made several solid professional connections that week.

A few things could improve, some of which have more to do with the product than the course. The room felt a little cramped, for example, and we probably could have used even more time dedicated to searching, filtering, and writing IOCs.

In general, I found the class really valuable and will send more of our staff to the class in 2012. Mandiant doesn’t like it when we talk about when they might offer the class again, so keep an eye on their Twitter feed and web site if this seems like something you could use.

1: I have taken the Black Hat edition of their malware analysis crash course and it’s worthwhile for responders who need to understand the basics and have some background.

Another breakdown of incident response skills

Following closely on the heels of yesterday’s post, Ron Gula (the Nessus dude) tweeted a link to Incident Response: 5 Critical Skills. The breakdown comes slightly differently, as it focuses primarily on tech skills.

1. Collaboration: Exactly what it says on the tin
2. Database Analysis: Much like what I discussed yesterday. This should include many types of structured data, not just traditional RDBMS.
3. Digital Forensics: Every incident responder needs the basic ability to acquire an image of disk and memory and know how to find specific artifacts in it, like files or registry hives.
4. Malware Analysis: Lots of tools on the Internet can help as a first pass, but you can also take a crash course from Mandiant or SANS. This can become an entire in-depth career on its own, but all responders should at least know core tools and how to interpret their output. Having a copy of REMnux close at hand is a plus.
5. User Behavior: I’ll freely admit that this is a weakness for me, but for a start, have a good logging setup and know your organization (e.g. find the domain experts for a given set of applications or systems).

I like this breakdown as well. One person may not excel in all of them, but he should have at least some core competency to understand the issues and do a first pass. And of course a CIRT needs all of these skills within it, including specialists.

Threat analysis skillset

I ran across an interesting little article on The Top 6 Skills For Entry-level Intelligence Analysts. While Wheaton focuses on the “national security, law enforcement and business” intel communities, I think the basic ideas could apply well to infosec threat analysts and incident responders. (While we can distinguish somewhat between those two functions, they generally fall into the same team’s scope in most organizations.)

Caffeine tolerance: another important skill

When we talk about an “entry-level” threat analyst, I find that a little misleading in the IT context. I do not believe you can take somebody fresh out of a tech school or necessarily even university and put them straight into threat intelligence or incident response, because those environments don’t give the necessary background. Individuals may have the needed qualifications when they graduate due to previous experience (*cough*reformed blackhats*cough*) but not due to what they learn in school.

So what skills does he recommend?

1. Analytic methodologies, something that CIRTs rarely use in a formal sense. The methodologies that do exist tend toward the lighter end of the spectrum, which probably works well given the relatively nascent nature of the discipline.
2. Written communication is important, but skills vary widely because IT staff often do not emphasize this as heavily as they should. Many people can’t write an intelligible email to save their life, much less a reasoned and readable briefing.
3. Research methods tend to get a lot of emphasis because they focus on tech skills. Here, I’m thinking of actual forensics and information gathering.
4. Teamwork always has been essential. While small teams tend to work well, we need to keep getting better as a community at cross-organizational collaboration.
5. Oral communication varies widely, though less so than written. Most people can explain things face-to-face, though perhaps the fact that terrible presentation skills seem endemic may give the lie to that.
6. Database usage varies as well, but DFIR folks tend to pick it up quickly as a tech skill. We can do better, as analysts need to learn more SQL. And if an analyst doesn’t understand the core of regular expressions, then he might need more seasoning before even qualifying as entry-level.

I’d welcome your thoughts, particularly pointers to areas that might contradict some of my conclusions about specific skill sets above.