Tag Archives: Threat Intelligence

Ethics versus economics for security research

Independent security researchers often have a reputation as narcissistic vulnerability pimps (true or not), but the environment which has evolved around information security largely drives this. This came to a head for me tonight in a Twitter discussion kicked off by Steve Werby:

Creating an exploit can often pay anywhere between 1k and 100k (or possibly more in specific circumstances), depending on the researcher’s choice of market and product (or technology). This even affects areas that many users believe unrelated, like mobile OS jailbreaks, which essentially consist of exploits to gain root control despite the operating system’s best efforts to the contrary.

No equivalent market exists for threat-related research. Freelance malware analysts don’t have similar economic drivers because organizations with an interest in this information generally do the research themselves. You can’t monetize malware or attribution the same way. Put another way, nobody believes that Krebs and Danchev get rich from what they do. I don’t think we can “fix” this with the market, although I’d welcome discussion of ideas or evidence to the contrary. But we need to recognize this when thinking about issues around software security and threat identification.

Believing that security, on its own, adds value often turns into a form of the broken windows fallacy. And creating artificial demand for threat intelligence could lead to all sorts of perverse incentives. Some of the same organizations interested in purchasing vulnerabilities and exploits might have an interest in highly-focused intelligence, such as espionage on particular threat groups, but at this point the line between “offense” and “defense” becomes very fuzzy.

I’d love to hear alternate viewpoints and suggestions on where this can go.

Pizza with a bad taste: BHEK intel

I got some spam today that made me hungry (even after eating real spam so many times as a kid).

You've just ordered pizza from our site

[snipped yummy but long listing of pizzas and drinks including crappy beer]

If you haven’t made the order and it’s a fraud case, please follow the link and cancel the order.
CANCEL ORDER NOW!

If you don’t do that shortly, the order will be confirmed and delivered to you.

With Respect
AZZO`s Pizzeria

However, I wasn’t really worried about the fraud possibility, so I decided to ignore the spam and instead to take the opportunity to run the URL through thug. It performed spectacularly well, grabbing the page, finding the exploits (at least some of them, anyway), and keeping everything neat, orderly, and secure.

hxxp://sweety-angel[.]de/local.htm redirects to hxxp://gimalayad[.]ru:8080/forum/links/column.php, which loaded a Java applet, a Flash file, and two PDF documents. At the time I ran them, VirusTotal hadn’t seen them before but a few engines identified the PDFs and the Flash file as part of the Black Hole Exploit Kit. I found the use of old Adobe Reader vulnerabilities (2010 vintage) a little humorous. Contact me via Twitter or email if you’d like the actual files. I published the IOCs as a Google Doc for reference.
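The hxxp and [.] notation above is the usual way of "defanging" a URL so nobody clicks it by accident. A minimal sketch of converting in both directions follows; the function names are mine, and the rules only cover the conventions used in this post.

```python
def defang(url):
    """Make a URL non-clickable: http -> hxxp, dots in the host -> [.]"""
    scheme, sep, rest = url.partition("://")
    if not sep:  # no scheme present, treat the whole string as host/path
        scheme, rest = "", url
    host, slash, path = rest.partition("/")
    safe_scheme = scheme.replace("tt", "xx")  # http -> hxxp, https -> hxxps
    return f"{safe_scheme}{sep}{host.replace('.', '[.]')}{slash}{path}"


def refang(text):
    """Reverse the two substitutions made by defang()."""
    return text.replace("hxxp", "http").replace("[.]", ".")


if __name__ == "__main__":
    print(defang("http://example.com:8080/forum/links/column.php"))
```

Only the host portion gets bracketed, so paths like column.php stay readable when pasted into a report.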

Brain dump of DFIR and network security research ideas

Maybe I could get more of these done with this.

I’ve seen several people talk about lacking ideas for research projects, often around DFIR or network security. Personally, I have the opposite problem: endless ideas for projects, often with the barest hint of a start, but not enough time to pursue them all. So I thought I’d publish a bit of a brain dump. I actually have made good progress on a few of these, and I have concrete plans around others (beyond just “wouldn’t it be cool if…”), but in any case I’d love to see other people pick them up and run with them.

If you do happen to get interested in any of the following, I wouldn’t mind a quick note to touch base to see about possibilities for collaboration or at least an acknowledgement in whatever you publish. Don’t interpret that as any sort of requirement, though; ideas have no value without execution, so all the hard work hasn’t even begun.

  • Malware
    • Classification across a large corpus
    • Automated IOC extraction and publication
  • Threat Actors
    • Profiling systems, particularly based on OSINT
    • Underanalyzed crime groups (e.g. drug cartels’ involvement in malware, spam, and fraud)
    • Hacktivism motivations and methods
  • Passwords
    • Cracking lab setups
    • Useful entropy calculations
  • Quantitative analysis of incidents
    • DDoS attacks (hard to get numbers on these)
    • Defacements and low-level leaks
  • Active Defense
    • Honeypots and honeyclients
    • Vocabulary or taxonomy on various methods
    • Callback Trojans in documents
    • C2 / RAT vulnerability research

Comments on Comment Crew

Everyone paying any attention to security this week noted Mandiant’s report on the Comment Crew. If you haven’t, go read it first. I’ll wait.

Although I work for a competitor[1], I believe Mandiant did the right thing here. Some may disagree to an extent for good reasons, while others simply went too far in their assumptions and criticisms. (And some folks just need to take off the tinfoil hats.) I don’t really care that much about what makes the sekrit skwirl cabal happy, and in fact it tickles me when they get frustrated by “outsiders” (inasmuch as Mandiant is one, anyway) not playing by their rules. In any case, healthy skepticism regarding someone else’s conclusions keeps them honest, but don’t miss the big picture out of myopia. The prevalence of espionage and APT relative to regular criminal activity remains an open research question and a valid area of debate, but I’ve seen some really smart people this week falling into the cliché of missing the forest for the trees.

Instead, this means the adversary can’t dictate the pace and terms of the conflict, whether or not they completely retool. By driving up the cost to the attacker over time, you start to make headway. That works both ways, of course, and at the moment that balance leans decidedly in their favor. Releasing the IOCs will also allow defenders to discover additional compromises. Remember that opponents make mistakes, and so we can capitalize on the opportunity for ongoing intel gathering as they transition to new infrastructure (assuming they even bother).

Sharing information has more than just tactical value. In my view (obviously not one shared by Congress), this points out that we don’t need the government to get in the way with CISPA or other information-sharing that stays behind walls of overclassification or possibly creates additional privacy and civil rights issues. We can do this the right way and improve things. Partisan politics lies way outside the scope of this blog, but I certainly see this as “we’re from the government and we’re here to help” territory.

[1]: As usual, these represent my opinions only. And that’s only good for today anyway because I may change my mind as new facts come to light or I think about topics more thoroughly.

Far pointers: threat intel concepts and CIF-Maltego edition

Not Grover, although Andy Grove ran Intel whose segmented architecture made them necessary… wow, was Jim Henson trying to tell us something?

I wrote a post on the Verizon Business Security Blog titled Concepts in Sharing Threat Intelligence. You should read it; I hope you like it. Comments over there, please! It makes my bosses happy when you read and comment on my stuff there. And when they’re happy, I’m happy. And when I’m happy, everybody[1] is happy.

Maltego and CIF

So as part of my recent work on all things CIF, I wrote a Maltego transform with a little help from the fantastic Andrew MacPherson. Assuming you already know how to use both, then you’ll have no trouble with this.

In Maltego, in the menu bar near the top, select Manage > Local Transforms. You can call the transform whatever you like, such as something imaginative like “CIF lookup”, but be sure to specify the “Input entity type” as an IPv4 address. I don’t believe the transform set really matters, but I put it under “IP owner detail” because that seemed to make the most sense to me. Then point Maltego at the script and it should work. You’ll need to have the CIF client in /usr/local/bin or otherwise change the Popen() call in the script.
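For anyone who hasn't written a local transform before, here is a rough sketch of the general shape (not the actual script from this post). Maltego passes the entity value as the first command-line argument and reads an XML response from stdout. The way CIF output is handled here, one Phrase entity per non-empty output line, is purely an assumption for illustration, as are the function names.

```python
import subprocess
import sys
from xml.sax.saxutils import escape

CIF_CLIENT = "/usr/local/bin/cif"  # path from the post; adjust to your install


def query_cif(ip):
    """Run the CIF client for one indicator; return its non-empty output lines."""
    try:
        result = subprocess.run(
            [CIF_CLIENT, "-q", ip], capture_output=True, text=True, check=False
        )
    except OSError:  # client missing or not executable
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def to_maltego_xml(values, entity_type="maltego.Phrase"):
    """Wrap plain string values in a Maltego local transform response."""
    entities = "".join(
        f'<Entity Type="{entity_type}"><Value>{escape(v)}</Value></Entity>'
        for v in values
    )
    return (
        "<MaltegoMessage><MaltegoTransformResponseMessage>"
        f"<Entities>{entities}</Entities>"
        "</MaltegoTransformResponseMessage></MaltegoMessage>"
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Maltego supplies the IPv4 address entity value as argv[1].
    print(to_maltego_xml(query_cif(sys.argv[1])))
```

A real transform would parse the client output into proper entity types rather than dumping raw lines, but the request/response plumbing stays the same.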

I have plans for more Maltego transforms (e.g. VirusTotal), but if you run into any issues with this one, or want something changed, please let me know. This will work just fine with Maltego Community Edition, by the way, but I highly recommend buying a Maltego commercial license if you’re doing anything serious with it. The folks there are incredibly responsive and helpful and they deserve something for all their hard work if you’re using it.

[1]: For small values of “everybody”.

Thoughts on STIX

Carrot Bomb

Threats should be properly understood.

At Black Hat, I saw a presentation by Sean Barnum from MITRE on Structured Threat Information eXpression (STIX). This builds on their work with Cyber Observable eXpression (CybOX), another standard that more or less exists at the same level in the “stack” as OpenIOC. I don’t think I can say much right now about the differences between CybOX and OpenIOC, or the possible advantages of one over the other, mostly because I don’t think I understand them quite to that level of detail.

But STIX tries to organize that IOC / observable information and give it some context. For example, using STIX/CybOX, you could collect together a set of observables, link them together into an indicator, and then associate that indicator with known TTPs. That TTP would possibly link to other indicators to help with confirmation and attribution to a threat actor or campaign.
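To make that chain concrete, here is a toy model of observables rolling up into indicators, TTPs, and an actor. This is emphatically not the STIX or CybOX schema; the class and field names are invented for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Observable:
    kind: str   # e.g. "ipv4", "md5", "url" (labels are illustrative)
    value: str


@dataclass
class Indicator:
    name: str
    observables: list = field(default_factory=list)


@dataclass
class TTP:
    name: str
    indicators: list = field(default_factory=list)


@dataclass
class ThreatActor:
    name: str
    ttps: list = field(default_factory=list)

    def matching_indicators(self, kind, value):
        """Names of this actor's indicators that contain the given observable."""
        target = Observable(kind, value)
        return [
            ind.name
            for ttp in self.ttps
            for ind in ttp.indicators
            if target in ind.observables
        ]


if __name__ == "__main__":
    c2 = Indicator("example C2 node", [Observable("ipv4", "192.0.2.10")])
    actor = ThreatActor("group alpha", [TTP("spear phishing", [c2])])
    print(actor.matching_indicators("ipv4", "192.0.2.10"))
```

The point is the direction of the links: a single sighting of an observable lets you walk up to the indicator, the TTP, and eventually a candidate actor.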

I’d like at some point soon to build up a “link database”, something like a triplestore (graph database) that represents all its data in a Subject-Predicate-Object form.

$hash BELONGSTO $malware
$twitterhandle BELONGSTO $person
$ipaddress ISA $sshscanner

(This probably doesn’t look anything like the eventual ontology, and my syntax may or may not fit the RDF standard, but it should illustrate the idea.)

As we build up these links, now we start to have a database of objects (possibly using RDF or similar) that link together in various ways, allowing you to perform appropriate analyses to identify clusters of knowledge. This alternate structure probably isn’t usable for STIX per se, but that doesn’t mean a STIX document couldn’t be an object. For example, if we have a STIX document explaining the details of a piece of malware, you could represent that as

$malware ISDESCRIBEDBY $stixdocument

And then once we get to that $malware node in the graph DB, we know where to go for the full low-down. This would work the same if we have a document (dossier) describing a particular person.
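The Subject-Predicate-Object idea fits in a few lines of code. What follows is a minimal in-memory sketch, not a real triplestore or RDF implementation; the class name, the node naming scheme ("hash:", "malware:", and so on), and the sample data are all made up.

```python
class TripleStore:
    """Tiny in-memory store of (subject, predicate, object) links."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )

    def descriptions(self, node):
        """Documents linked to a node via ISDESCRIBEDBY."""
        return [o for _, _, o in self.match(subject=node, predicate="ISDESCRIBEDBY")]


if __name__ == "__main__":
    store = TripleStore()
    store.add("hash:abc123", "BELONGSTO", "malware:examplebot")
    store.add("ip:192.0.2.10", "ISA", "sshscanner")
    store.add("malware:examplebot", "ISDESCRIBEDBY", "stix:examplebot.xml")
    print(store.descriptions("malware:examplebot"))
```

A linear scan obviously won't scale, which is exactly why a real graph database would do the heavy lifting; the sketch only shows how a STIX document becomes just another node you can link to.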

To visualize it, think of something like a Maltego or Casefile graph, only much larger than you can reasonably manipulate with that tool. Which is not a criticism; I love Maltego and use it almost every day, but it is a tool for a particular set of use cases. Ideally, you could export a subgraph to a Maltego file, or of course the inverse: use Maltego to do a bunch of research on something, then import all that data into your DB for archival and possible linking to other nuggets of information.

I’ve done some very exploratory work on this, mostly just prototype code related to working with Maltego transforms. Of course my involvement with CIF also has provided some great lessons of all sorts. But I think that, as a question of knowledge management, we can do better than just tracking very basic data (like we do with CIF) or spending a lot of time on enumeration and detailed XSDs. Both of those matter a great deal, and I work on them frequently in my day job, but they just scratch the surface of what we can do with modern data and knowledge management techniques.

Spring cleaning around here

A few notes about updates around here:

  • I’ve renamed the site to Threat Thoughts and now have a real domain. The site has evolved into (mostly) covering a specific niche and I wanted to show that. I also have decided to keep the design (theme) but put a new header graphic up there.
  • Since this blog tends to focus on threat analysis and related topics, I decided not to include detailed content on Python, functional programming, data analysis, and such. Instead, I have a new blog called Functional Data that deals with those things. Feel free to subscribe or follow me over there. The first few posts mostly deal with Project Euler but give a good taste of what it will cover.
  • As I run across threat-related links that don’t necessarily warrant an in-depth blog post, I post them to my new tumblelog Threat Intel Stream. I haven’t hooked it up to Twitter yet; actually, I haven’t even decided whether to do that.
  • Because of my renewed focus on programming, I’ve been spending more time on GitHub and Hacker News (where I mostly just lurk but comment occasionally). Of course, other sites like Reddit have fallen by the wayside because I only have so many hours in the day…

Hope all continues well for all of my friends and colleagues reading this.

Sapho: threat intelligence tool

Dunecat

We all need Sapho juice sometimes.

Poking around GitHub one night for interesting projects, I ran across Sapho. I dug into it more and found that my fellow tweep Scott Roberts had written it, which only heightened my interest.

Sapho was built as an off hours project to manage intelligence developed from computer network defense activities and third party sources. Building up on the considerable resources of DokuWiki Sapho automatically generates a framework of wiki resources for capturing and analyzing cyber threat intelligence and responding.

Sapho, as I understand it, consists solely of a template generator for DokuWiki to help you track intrusion campaigns, adversaries & groups, and targeted malware. Unlike Collective Intelligence Framework and other tools, Sapho primarily exists as a way for humans, rather than other systems, to review the intelligence. If I tell another analyst that a given intrusion appears tied to group alpha, for example, then he can easily review what we know about them specifically. Of course, intelligence groups with even basic competence do this to some degree already, but Sapho allows you to create a common structure for these data.

Given the tool’s simplicity, then, we could extend it in a lot of useful ways. Scott outlines a few other potentially-related tools on the project site, like log2timeline, Cuckoo Sandbox, and Maltego / Casefile. For example, Sapho could automatically ingest the reports from these and reformat them into DokuWiki syntax. I can imagine an output plugin for CIF that does something similar for its data.

Essentially, Sapho could become a tool to transform analytical output into a common human-readable format. Taking it a step further, it could recognize certain indicator types like a hash, IP address, or similar, and automatically create wiki pages for them as a sort of correlation method.
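As a sketch of what that indicator recognition might look like (this is not part of Sapho), the following finds MD5 hashes and IPv4 addresses in report text and rewrites them as DokuWiki internal links. The "indicators" namespace and the page naming scheme are assumptions, and the patterns are deliberately loose.

```python
import re

# Loose patterns: the IPv4 one does not check that octets are <= 255.
INDICATOR_PATTERNS = {
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_indicators(text):
    """Return (kind, value) pairs for every indicator found in the text."""
    found = []
    for kind, pattern in INDICATOR_PATTERNS.items():
        found.extend((kind, value) for value in pattern.findall(text))
    return found


def linkify(text, namespace="indicators"):
    """Replace each indicator with a DokuWiki link of the form [[ns:kind:value|value]]."""
    for kind, pattern in INDICATOR_PATTERNS.items():
        text = pattern.sub(
            lambda m, k=kind: f"[[{namespace}:{k}:{m.group(0).lower()}|{m.group(0)}]]",
            text,
        )
    return text


if __name__ == "__main__":
    print(linkify("Beacon to 192.0.2.10 observed"))
```

Since DokuWiki creates a page the first time someone follows and edits a link, every report mentioning the same hash or address ends up pointing at one shared page, which gives you the correlation more or less for free.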

The approach here does not really scale to large databases – but I don’t think it should. This sort of intelligence analysis works best when looking at the operational level rather than a very large scope like that of ThreatExpert. And since the tool uses the Simplified BSD license, you can take the idea and even the basic code and turn it into whatever works for you.

Coopetition and sharing threat intelligence

Imagine a street market with lots of vendors hawking their wares. Customers wander in and out of the market, some of whom you don’t see every day while you know others as regular visitors. Perhaps you are one of several selling coffee beans[1]. Now imagine that you’ve realized that there’s a thief in the market, and you know more or less what he looks like or perhaps a little about his modes of operation. It’s in your interest to let the other coffee bean sellers (and perhaps even other vendors) know, along with perhaps the local police, because you don’t want that thief robbing you, your suppliers, or your customers – nor your competition.

Some of my thinking about sharing and cooperation stems from recent discussions about CISPA and similar initiatives, while some of it stems from thinking about the fact that, in many areas of business, we frequently compete with organizations whose employees we may consider friends. And of course, competition in business should only go so far. I subscribe to the belief that “there’s no such thing as business ethics” in the most positive sense: we cannot simply limit our ethical behavior to certain areas of life, then turn around and act unethically in other areas.

All of that musing sets the stage for thinking more about sharing threat intelligence. Clearly, we never want to share threat intelligence with the adversaries that may pose a threat to us. This explains why most experienced incident responders recommend not sending malware samples immediately to an antivirus vendor, particularly during an open investigation: that intelligence can easily leak back to the attacker and compromise your operational security. At the same time, we can find benefits in sharing data with our ostensible competition. For example, payment processors have formed a group within the FS-ISAC to share “information about fraud, threats, vulnerabilities and risk mitigation in the payments industry”. Yes, this means that corporations that compete doggedly for merchant accounts and transaction fees will help each other with security intelligence, since that information has more value when aggregated: each processor gets more intel from the group than they put into it. As a result, the marketplace can function more cleanly, to the benefit of all (honest) participants.

That doesn’t mean that an organization should share all of its security secrets. Generally speaking, we can say that the operational security risk from sharing intelligence rises with the specificity of the intelligence. So discussing the (fairly well-known) idea that a lot of fraud originates in Russia and Eastern Europe doesn’t increase the risk to an organization. Sharing information about specific BINs with extremely high fraud levels might incur slightly more risk, but not much (and that primarily from an operational or possibly legal perspective, rather than technical). When we start sharing indicators of compromise and known attacker addresses, then we have to take greater care to ensure that the information doesn’t leak to the adversary. But again, the adversary here isn’t the company next door trying to expand their market share, possibly at the expense of yours. The adversary wants information from both of you, to the detriment of others in the marketplace like cardholders, merchants, and so on.

I don’t quite know what I think about how this might extend to groups (including vendors) whose business includes collecting and selling threat intelligence, including my own employer[2] and other companies with which I’ve maintained good working relationships. But I do think that there’s value in some level of cooperation even among these groups, and I’m interested to know what others think.

[1]: Despite my surname, I don’t have any affiliation with Maxwell House Coffee, and I don’t even drink their stuff. I just like thinking about coffee. Mmm, coffee.
[2]: To repeat what should be obvious, my opinions here are my own, if anyone’s. Sometimes I end up not even agreeing with myself, so don’t expect that anybody else will!

Introduction to the Collective Intelligence Framework

CIRTs and related organizations often handle incident detection as well as response. Both of these roles produce and consume threat intelligence in different ways. For example, we often want to correlate our network traffic with OSINT indicators (known bad IP addresses and URLs, MD5 hashes of suspicious files, etc.). I’ve started looking at the Collective Intelligence Framework as a way to fulfill these needs. CIF development is sponsored by the REN-ISAC and National Science Foundation, with most of the coding (and everything else!) handled by Wes Young. Everything is open source for those of us who like – or need – to hack directly on the code.

In this article, I’ll explain CIF, give some usage examples, and discuss test deployment scenarios.

Understanding CIF

From the perspective of a user, CIF allows you to run queries against many data sources at once. If you have other private data sources available, particularly via XML (RSS), JSON, or in a file (e.g. CSV), you can incorporate those, as well as additional OSINT sources. CIF comes preconfigured for:

Use cases include manually querying the database for specific indicators (e.g. “do we have any records for this IP address?”) as well as pulling feeds of various sorts for use by security systems (e.g. “what URLs should we block at the proxy?”). CIF includes concepts of severity and confidence as well as privilege. This allows you to provide feeds of high-confidence public data to some systems while still allowing investigators to query private, unconfirmed data.
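The interplay of confidence, severity, and privilege can be illustrated with a simple filter. This is a sketch of the concept rather than CIF's actual logic or data model; the record fields and the low/medium/high severity scale are assumptions.

```python
# Assumed severity scale; CIF's real levels may differ.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def build_feed(records, min_confidence=50, min_severity="medium", include_private=False):
    """Keep records that clear the confidence and severity floors.

    Records marked "private" are only included when the consumer is
    privileged (include_private=True), e.g. an investigator's query.
    """
    floor = SEVERITY_RANK[min_severity]
    feed = []
    for rec in records:
        if rec["confidence"] < min_confidence:
            continue
        if SEVERITY_RANK[rec["severity"]] < floor:
            continue
        if rec["restriction"] == "private" and not include_private:
            continue
        feed.append(rec)
    return feed


if __name__ == "__main__":
    records = [
        {"address": "192.0.2.10", "confidence": 85, "severity": "high", "restriction": "public"},
        {"address": "192.0.2.20", "confidence": 40, "severity": "medium", "restriction": "public"},
        {"address": "192.0.2.30", "confidence": 90, "severity": "medium", "restriction": "private"},
    ]
    print([r["address"] for r in build_feed(records)])
```

A proxy blocklist would use strict defaults like these, while an investigator would call the same function with lower floors and include_private=True to see everything.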

Essentially, CIF ingests data – typically on an hourly or daily basis, depending on the source – indexes it on the fly for performance reasons, performs correlation analytics (e.g. so that a URL also turns into domain and IP address information), and then makes it available in feeds via various output plugins. These plugins include tables and HTML for viewing by a user, but also IPtables rules, Snort rules, JSON, and CSV for processing by other security systems.
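The URL-to-domain-to-IP expansion mentioned above is easy to picture in code. This is only an illustration of the idea, not how CIF implements it; the function name and the tuple layout are invented, and the resolver is injectable so you can test without touching DNS.

```python
import socket
from urllib.parse import urlsplit


def expand_url(url, resolve=socket.gethostbyname):
    """Derive domain and IPv4 indicators from a single URL indicator."""
    derived = [("url", url)]
    host = urlsplit(url).hostname  # drops the port, lowercases the name
    if host:
        derived.append(("domain", host))
        try:
            derived.append(("ipv4", resolve(host)))
        except OSError:  # resolution failed; keep what we have
            pass
    return derived


if __name__ == "__main__":
    # A stub resolver keeps the example offline and repeatable.
    print(expand_url("http://example.com:8080/a.php", resolve=lambda h: "192.0.2.10"))
```

Keep in mind that resolving a hostile domain from your own network tells the adversary something, so a real deployment would want to think about where those lookups come from.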

Usage examples

Everything below comes from the Perl client. I haven’t yet dealt with the Python client, much less hacked on it, but that’s coming Soon™.

cif -q infrastructure/malware -c 50 -s medium

gives a fairly large list of IP addresses associated with malware. (I used medium severity and 50% confidence in these examples.)

Even if you don’t use a proxy server, you might find CIF useful for checking suspicious URLs:

cif -q url -c 50 -s medium -p snort

You now have a list of Snort rules to pull into your IDS.

Or if you have your own list of IP addresses to check, such as when an ongoing case has new indicators, you can put them in a file and query each of them:

for ip in $(cat hostlist.txt); do cif -q "$ip" >> specific-ip.txt; done

This yields another list. You might see a few lines in that example with a “private” restriction and impact as “search”. This happens because, by default, CIF will log every query for a specific indicator. A number of searches, such as from other investigators, may have significance apart from any data. However, if you don’t want CIF to log a query, just use the “-n” parameter.
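If you'd rather drive those lookups from a script, a thin wrapper around the command-line client might look like the sketch below. It uses only the flags shown in this post (-q, -c, -s, -p, -n) and assumes that "-n" is a bare switch and that a client named cif is on the PATH. The function names are hypothetical.

```python
import subprocess


def cif_command(indicator, confidence=None, severity=None, plugin=None,
                log_query=True, client="cif"):
    """Build the argument list for one CIF client query."""
    cmd = [client, "-q", indicator]
    if confidence is not None:
        cmd += ["-c", str(confidence)]
    if severity:
        cmd += ["-s", severity]
    if plugin:
        cmd += ["-p", plugin]
    if not log_query:
        cmd.append("-n")  # assumed to take no argument
    return cmd


def query_all(indicators, **options):
    """Run one query per indicator and return {indicator: raw client output}."""
    results = {}
    for indicator in indicators:
        proc = subprocess.run(cif_command(indicator, **options),
                              capture_output=True, text=True, check=False)
        results[indicator] = proc.stdout
    return results


if __name__ == "__main__":
    print(cif_command("192.0.2.10", log_query=False))
```

Passing the arguments as a list avoids the shell entirely, so odd characters in an indicator can't turn into a command of their own.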

If you’d like to play with it some more, contact me for an API key and the address of my semi-public CIF server. Twitter or email both work fine.

Appendix: CIF on the Amazon cloud

Amazon Web Services provide a decent platform for testing CIF or running a public instance like mine. The following assumes some familiarity with Linux administration and at least a basic understanding of the Elastic Compute Cloud (EC2).

You can start with a small instance for the installation, but you’ll quickly want to move to a medium instance at least. I run a large instance using the Ubuntu Cloud Guest server image. In general, follow the server install instructions for CIF. You’ll also want to note the specifics for Ubuntu as they contain a few workarounds you will need. Allocate an Elastic IP and register it in DNS someplace, such as with Amazon Route 53. For the Security Group, only add HTTPS and SSH. You won’t need anything else, and I recommend leaving it at this minimal state for security purposes. You’ll also need an Elastic Block Store. While you can start with 10GB, expect that to grow a few GB per week, so you’ll need to resize from time to time or create a larger volume at the beginning. While not required for CIF installation, I can’t recommend enough that you use git to manage config files. Srsly.

When installing Postgres, note that “peer” may appear in the original file instead of “ident sameuser”. Also, I did not use the values in the CIF doc, as Postgres didn’t like them. I left everything at the defaults except:

work_mem = 512MB
checkpoint_segments = 32

When setting up BIND9, first check /etc/resolv.conf for the IP addresses you should use as forwarders.
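As a small convenience, the nameserver entries can be turned into a forwarders statement automatically. This is a sketch under the assumption that you paste the result into the options block of your BIND configuration (on Ubuntu that usually lives in /etc/bind/named.conf.options); the function names are mine.

```python
def parse_nameservers(resolv_conf_text):
    """Extract nameserver addresses from the contents of a resolv.conf file."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if line.startswith(("#", ";")):  # comment lines
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def forwarders_block(servers):
    """Format addresses as a BIND forwarders statement."""
    inner = " ".join(f"{server};" for server in servers)
    return f"forwarders {{ {inner} }};"


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        print(forwarders_block(parse_nameservers(handle.read())))
```

Review the output before using it: if resolv.conf already points at 127.0.0.1, forwarding to that address would just loop back to BIND itself.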