Tag Archives: Verizon

Far pointers: threat intel concepts and CIF-Maltego edition

Not Grover, although Andy Grove ran Intel whose segmented architecture made them necessary… wow, was Jim Henson trying to tell us something?

I wrote a post on the Verizon Business Security Blog titled Concepts in Sharing Threat Intelligence. You should read it; I hope you like it. Comments over there, please! It makes my bosses happy when you read and comment on my stuff there. And when they’re happy, I’m happy. And when I’m happy, everybody[1] is happy.

Maltego and CIF

As part of my recent work on all things CIF, I wrote a Maltego transform with a little help from the fantastic Andrew MacPherson. If you already know how to use both tools, you’ll have no trouble with this.

In Maltego, in the menu bar near the top, select Manage > Local Transforms. You can name the transform whatever you like, even something imaginative like “CIF lookup”, but be sure to specify the “Input entity type” as an IPv4 address. I don’t believe the transform set really matters, but I put it under “IP owner detail” because that made the most sense to me. Then point Maltego at the script and it should work. The script expects the CIF client in /usr/local/bin; if yours lives elsewhere, change the Popen() call in the script.
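If you want a feel for what a local transform like this does, here is a minimal sketch in Python. It is not the script I wrote, just an illustration under some loud assumptions: that the CIF client accepts the address via a -q flag, that its output is plain text with one result per line, and that returning each line as a generic maltego.Phrase entity is good enough. The names query_cif and to_maltego_xml are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch of a Maltego local transform that looks up an IPv4 address in CIF."""
import subprocess
import sys
from xml.sax.saxutils import escape

# Assumption: location of the CIF command-line client; adjust to your install.
CIF_CLIENT = "/usr/local/bin/cif"


def query_cif(ip):
    """Run the CIF client for one address and return its raw text output."""
    try:
        # Assumption: the client takes the query value after a -q flag.
        proc = subprocess.Popen([CIF_CLIENT, "-q", ip],
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, _ = proc.communicate()
    except OSError:
        return ""  # client missing or not executable
    return out.decode("utf-8", errors="replace")


def to_maltego_xml(lines):
    """Wrap each non-empty result line in a generic Phrase entity."""
    entities = "".join(
        '<Entity Type="maltego.Phrase"><Value>%s</Value></Entity>'
        % escape(line.strip())
        for line in lines if line.strip()
    )
    return ("<MaltegoMessage><MaltegoTransformResponseMessage>"
            "<Entities>%s</Entities>"
            "</MaltegoTransformResponseMessage></MaltegoMessage>" % entities)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Maltego passes the entity value (here, the IPv4 address) as argument 1
    # and reads the XML response from standard output.
    print(to_maltego_xml(query_cif(sys.argv[1]).splitlines()))
```

Keeping the XML building separate from the client call means you can test the output format without a CIF instance handy.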

I have plans for more Maltego transforms (e.g. VirusTotal), but if you run into any issues with this one, or want something changed, please let me know. This will work just fine with Maltego Community Edition, by the way, but I highly recommend buying a Maltego commercial license if you’re doing anything serious with it. The folks there are incredibly responsive and helpful and they deserve something for all their hard work if you’re using it.

[1]: For small values of “everybody”.

Announcement: Career move

For some time, I’ve considered moving away from my current role as a CIRT team leader. As much as I enjoy exercising the analysis bits of my skillset, the operational demands of monitoring and response take a toll on my personal / family life. Despite how much I’ve enjoyed building up the incident response team at my previous employer, the time for a change has arrived. When I looked at my skills with as objective an eye as I could muster, data science seemed to make the most sense. I have an academic background in statistics, applied mathematics, and computer science. My professional background includes UNIX system administration and usage as well as a long history with programming (hacking rather than software engineering). Just as importantly, I’ve developed domain expertise in privacy, digital forensics, and network security. I also speak fluent, professional-quality Spanish and have a burning desire to work on stuff that matters in the long view.

As I started to explore the possibility space here, an opportunity arose that I couldn’t ignore. So I am incredibly pumped to announce that, later this month, I will be joining the Verizon Business RISK team, working on security research and intelligence with Chris Porter and Wade Baker. In a very real sense, this also means I will return to one of my very first employers, as I was a GTE employee who stuck around for several years after we merged with Bell Atlantic to form Verizon.

Because of that, a few specific things I’d planned here won’t happen, or at least not anytime soon. I will continue to write this blog actively, but obviously some sorts of writing will fit better into my work at VZ. This blog will temporarily slow down when I officially enter my new role: while I have assurances that it doesn’t conflict with any of my new employer’s policies, I’d like to review the social media policy in detail to avoid stepping into any open manholes.

Thanks for reading and for everyone’s support. This change just means we’re stepping on the gas!

Twitter review: 2012-03-23

While I’m dissecting the Verizon DBIR and the Mandiant M-Trends report, plus preparing for my talk to the NAISG Dallas chapter next week (“Evolution of an IRT”), I thought I’d take a look at some relevant Twitter data.

Storification

First, I assembled a Storify to document a conversation on Twitter today related to those two reports. Take a look at DBIR and M-Trends: Different Perspectives. Credit to @bond_alexander for kicking it off.

Twitter dataviz

I also generated the below visualization using xefer. It shows activity by hour of the day, by day of the week, and the ratios of tweets / replies / retweets, all in US Central time (GMT-6 or -5 during DST).

A few things jumped out at me:

  • I usually go offline around 11pm and don’t get going again until 7 or 8am the next day. Typical sleep cycle.
  • Twitter activity declines during the noon (lunch) hour.
  • During the 5pm hour, I have very little to say. This represents the time I wrap up my daily work, drive home, and see my family.
  • Activity drops off during the weekend, when I spend time with the family or generally relax (e.g. gaming).
  • Thursday and Friday evenings slow down considerably compared to Monday through Wednesday evenings. I know why that happens on Friday (going out), but not Thursday.
  • Wow, I chat a lot. But if you follow me, you probably knew that already.

Blocking Trending Topics

Lots of us can’t stand to read the “trending topics” on Twitter. They usually revolve around celebrity “news” and other useless bits. If you have Adblock Plus for Chrome or Firefox, though, just add the following two lines to your filter list:

twitter.com##.trends-inner
twitter.com##.wide-trends

Other tweets this week

A few relevant Twitter postings:

Next time, I’m doing that in Storify.

Chroming up the facts: SIEM and IR presentation

Chroming it up doesn't actually make it go faster

I recently had the opportunity to watch the Trends in SIEM and Incident Response presentation from Narayan Makaram with HP (ArcSight), Anthony Di Bello with Guidance (EnCase), and Andrew Hay with The 451 Group. The topic addressed the specific nexus of my professional interests: log analysis and correlation for detecting and responding to incidents. While I’ve followed Hay on Twitter for a long time, I also have worked with both of the sponsoring products for years.

Trends

The presentation identifies several primary organizational trends:

  • trying to close the gap between compromise, detection, and response
  • taking a proactive approach
  • emphasis on lessons learned through increased visibility
  • response automation key to address relentless threats

(I suppose “relentless” is the new “persistent”.)

Hay did a great job addressing issues, largely based on the 2011 Verizon DBIR. Less than 1% of organizations detect data breaches through log analysis, a number which frankly frightens me. We spend millions of dollars on log management for compliance, and then we don’t use the logs properly. Given how often logs shed light on an incident in hindsight (69%, according to the same study), we know that they contain the proper data and indications. At best, we just don’t know how to make sense of them, and at worst, we don’t even look. (Guess which I believe happens more often.)

On a similar note, around 28% of surveyed organizations use threat intelligence right now. This looks like a massive opportunity to me: sharing data, understanding indicators and how to use them appropriately, and generally climbing the incident response learning curve faster. Threat intel providers and analysts have a huge field of untapped potential awaiting – so, as Hay says, we need to be less Paul Blart – Mall Cop and more Tom Cruise – Minority Report.

Di Bello (with Guidance Software) made some important points related to speed of response. He uses a traditional IR timeline, where a call to a help desk leads over several days to a low-level analyst going onsite for data gathering before eventually a senior analyst looks at the data and performs manual forensic analysis. We can’t stick with this model: automated data gathering based on solid alerting and event analysis can speed this up. It’s a great model for the future, and many organizations have started trying to lead the way in this trend. He discusses several example use cases, like suspicious network traffic or DLP alerts.

Inconsistent data

Unfortunately, I found the quality of the rest of the presentation highly variable. Given their audience, they should take care to confirm the consistency of their data and ensure that their conclusions follow appropriately from the evidence presented. I understand the need for marketing in order for the sponsors to get value from the event, but puffery shouldn’t override the value for the listeners. That disappointed me, as I also use ArcSight heavily in my day-to-day operational analyses and like the product. I also use EnCase Enterprise, though less frequently and with much less satisfaction.

I present just two examples here, but they illustrate an issue that persisted through the entire presentation. This really detracted from the overall value, and I hope that future iterations will focus on the great value of this approach. The message matters and I would like to see it handled well.

For example, the HP speaker had a slide titled “Cybercrime Keeps Growing”. Among other well-publicized security breaches, he listed Google: “Accounts affected: Unknown” and “12.5 billion market cap lost”. This statistic makes me cry, and not for the intended reason. First, which data breach? The most public one that occurs to me would be the Aurora incident, and while that got a lot of press due to the details and geopolitical implications, I don’t believe they lost substantial investor confidence due to that. Second, given the economy of the last few years, attributing any market capitalization loss to this one incident ignores lots of other factors. And third, over what time period did this loss supposedly occur?

All the other listed incidents list specific costs, either financial or relating to a “processing license” revocation. With a bit of time spent on Google (ironically), I can’t find any support for that statement other than ArcSight presentations. And their mention of RBS WorldPay doesn’t seem to note that the PCI Council recertified them not long after. Also: I can’t imagine anybody who would take time out of their day for a presentation on this topic who doesn’t understand the overall risk. These sorts of slides have no value in presentations to this type of audience.

Time to respond also got some discussion, and here the Guidance representative exaggerated wildly. He claimed that EnCase Enterprise can get data from a system to confirm a compromise in seconds. In response to an audience question on this, he repeated the point. I don’t believe that this is the case except for large values of “seconds” (e.g. an hour is 3600 seconds, but that doesn’t seem to have been his intent). Even gathering metadata from memory, not to mention data on persistence mechanisms and core OS files, causes enough of a performance hit that it takes time. By itself, that’s not a knock on EnCase, but on the presentation here. That doesn’t even take into account the licensing limitations with EnCase Enterprise that greatly reduce the number of hosts from which the system can gather data simultaneously, typically in multiples of five.

These examples illustrate the feeling I had throughout, at least after Hay’s segment: not only did it consist almost entirely of sales pitches, they didn’t even really consider the type of audience who would attend. That said, I’d welcome any corrections to my statements above. Nothing convinces an analyst like data, after all.

(Disclosure: I work for Heartland Payment Systems, also mentioned in the presentation. As always, my opinions here are my own and don’t necessarily reflect those of my employer. And I will re-emphasize that I have received no compensation or other inducements for my opinions on the products mentioned in this post.)

Hunting trips: network traffic log analysis

Log analysis has always struck me as one of those things that gets too much superficial attention without enough attention to detail. That is, we know that we need to do it, but we don’t talk about how we need to do it. At best, we talk about making sure we collect and archive logs. Analysis plays second fiddle, even though in reality logs without analysis provide almost no value to an organization. And you’ll find greatest value in discovery of the earliest stages of an incident rather than in hindsight to understand what went wrong. Unfortunately, less than 1% of data breach investigations in the 2011 Verizon DBIR started with log analysis and review!

The analysis ideas I present below don’t even begin to represent a comprehensive view. And of course every network is different, so you will need to think about your specific needs. But this may get you thinking in directions you hadn’t previously considered. Side benefits include analysts becoming more proficient with their tools, pushing the limits and gaps in their toolset, creating baselines of their environment, and even mentoring via shared hunting trips. These could serve as foundations for SIEM use cases, but here we’re talking about active exploratory usage by an analyst.

Hunting trips in DFIR involve actively looking for possible anomalies or indications of compromise on your network. Even if you don’t find anomalies, you’ll get a better understanding of your baselines. In this post, I’ll talk about hunting through your network traffic logs. Richard Bejtlich talks about hunting through systems as well, but I’ll save that particular discussion for another day. Further, if you do this by having a junior analyst “tag along” with a more experienced analyst (e.g. via screen sharing and chatting), you get the regular benefits of good analysis plus team-building and training.

Egress traffic

First, and most importantly, always keep in mind that we’re only identifying anomalies, not automatically classifying “bad” traffic. Nothing here can positively and without question find evil with no false positives or false negatives. It should, however, increase your efficiency in finding things that violate your policies or possibly indicate a compromise.

Compromised systems may start sending out traffic that doesn’t look like the rest of your traffic. Perhaps an attacker is trying to exfiltrate data, or a bot may simply try to contact its C&C infrastructure. So look carefully at outbound traffic logs from your perimeter firewalls. Good protocol candidates include SSH, SMTP, and IRC (yes, even now in 2011). In fact, examine all non-HTTP traffic from user subnets with suspicion.
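As a rough illustration of that last point, here is a small Python sketch that pulls non-web egress from user subnets out of a list of flow records. The record layout (src_ip, dst_port), the subnet, and the list of web ports are all assumptions for the example; map them onto whatever your firewall logs actually provide.

```python
import ipaddress

# Hypothetical values: replace with your own user subnets and web ports.
USER_SUBNETS = [ipaddress.ip_network("10.20.0.0/16")]
WEB_PORTS = {80, 443}


def suspicious_egress(flows):
    """Yield flows that leave a user subnet on a non-web destination port."""
    for flow in flows:
        src = ipaddress.ip_address(flow["src_ip"])
        from_user_net = any(src in net for net in USER_SUBNETS)
        if from_user_net and flow["dst_port"] not in WEB_PORTS:
            yield flow


flows = [
    {"src_ip": "10.20.1.5", "dst_port": 6667},   # IRC from a desktop
    {"src_ip": "10.20.1.5", "dst_port": 443},    # ordinary HTTPS
    {"src_ip": "192.168.1.1", "dst_port": 22},   # not a user subnet
]
for flow in suspicious_egress(flows):
    print(flow)
```

Everything this returns is only a candidate for review, not a verdict.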

Also look for protocol-port mismatches. Do you have HTTP traffic on high ports, or maybe even something like SSH on TCP 80? Attackers often like to overload TCP 80 to slip through loosely secured perimeter networks.
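A mismatch check can be sketched in a few lines of Python, assuming your logs record both the destination port and the application protocol that your firewall or IDS detected. The field names (app_proto, dst_port) and the expected-port table are hypothetical; fill in the ports your own policy allows.

```python
# Assumed mapping of application protocols to the ports they normally use.
EXPECTED_PORTS = {
    "http": {80, 8080},
    "ssh": {22},
    "smtp": {25, 587},
    "irc": {6667},
}


def port_mismatches(records):
    """Return records whose detected protocol runs on an unexpected port."""
    return [
        rec for rec in records
        if rec["app_proto"] in EXPECTED_PORTS
        and rec["dst_port"] not in EXPECTED_PORTS[rec["app_proto"]]
    ]


records = [
    {"app_proto": "ssh", "dst_port": 80},      # SSH overloaded onto TCP 80
    {"app_proto": "http", "dst_port": 80},     # normal
    {"app_proto": "http", "dst_port": 31337},  # HTTP on a high port
    {"app_proto": "dns", "dst_port": 53},      # protocol not in the table
]
for rec in port_mismatches(records):
    print(rec)
```

Protocols missing from the table are skipped rather than flagged, so extend the table before trusting a quiet result.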

Web traffic has some unique problems. Not only does it involve a constantly changing set of endpoints, protocol evolution means that HTTP isn’t really the top-level protocol in the stack anymore. Development has rapidly left behind simple GETs and PUTs, and things like WebSockets overload ports beyond what you may realize. Still, try to analyze this traffic because so much malicious activity uses this channel.

For outbound surfing, look at your User-Agent strings: lots of spyware browser extensions will show up here. Some malware tries (poorly) to look like regular browsers and you can sometimes find it through misspellings or anomalies like default languages. A good proxy may do this, but mining the data yourself can find new threats. Look at the domains that users hit as well. Check URLs against external APIs but beware. If you get the chance and it fits your network or organization, look at destination geolocation. You may identify suspicious traffic by its destination country – if you sell widgets to farmers in Iowa, then outbound traffic to Eastern Europe or the Asia-Pac region is worth a second look. For both of these areas, applying the principle of Least Frequency of Occurrence can greatly reduce the dataset you actually need to review.
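Least Frequency of Occurrence is easy to try on any column you can extract from your proxy logs. The following Python sketch counts values and returns only the rare ones, rarest first; the function name and the max_count threshold are my own choices for illustration, and the sample User-Agent strings are invented.

```python
from collections import Counter


def least_frequent(values, max_count=2):
    """Return (value, count) pairs seen at most max_count times, rarest first."""
    counts = Counter(values)
    rare = [(value, n) for value, n in counts.items() if n <= max_count]
    return sorted(rare, key=lambda item: (item[1], item[0]))


user_agents = (
    ["Mozilla/5.0"] * 5      # the common case drops out of the review
    + ["Mozila/4.0"]         # misspelled agent, seen once
    + ["curl/7.21"] * 2
)
for agent, count in least_frequent(user_agents):
    print(count, agent)
```

The same function works unchanged on destination domains or countries; only the input column changes.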

Ingress traffic

Inbound traffic to your web servers should get a close look too, using similar analysis methods as we discussed for outbound web traffic. However, take a close look at your URI query strings to find people attempting SQL injection or other forms of attack (hint: look for really long payloads). You may wish to review user agents here as well, though your mileage may vary if you run a popular web site or one with lots of global exposure. This will have particular effectiveness when analyzing traffic to API servers.
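To make the query-string hint concrete, here is a Python sketch that flags request URIs with unusually long queries or a few crude SQL injection markers. The length threshold and the marker list are assumptions chosen for illustration, nowhere near a complete detection rule, and they will produce false positives.

```python
from urllib.parse import unquote_plus, urlsplit

# Crude, assumed markers; tune or replace them for your own applications.
SQLI_HINTS = ("union select", "' or ", "sleep(")


def flag_requests(uris, max_query_len=200):
    """Return (uri, reasons) for requests with long or suspicious queries."""
    flagged = []
    for uri in uris:
        # Decode percent-encoding so encoded payloads match the markers.
        query = unquote_plus(urlsplit(uri).query)
        lowered = query.lower()
        reasons = []
        if len(query) > max_query_len:
            reasons.append("long query")
        if any(hint in lowered for hint in SQLI_HINTS):
            reasons.append("sqli hint")
        if reasons:
            flagged.append((uri, reasons))
    return flagged


uris = [
    "/search?q=shoes",
    "/item?id=1%27+OR+%271%27%3D%271",
    "/x?a=" + "A" * 300,
]
for uri, reasons in flag_requests(uris):
    print(reasons, uri[:60])
```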

Consider looking at source geolocation as well, though as before, don’t fall into traps. In some organizations, working with your marketing or web analytics team can help you understand things and clarify your assumptions here.

The effectiveness of this part of the review may vary according to your threat model and overall security posture. For example, if you don’t have a good application security program, or if you have few users on your network, this area will matter more than egress traffic. Conversely, if you have very few exposed services, this may not deserve as much effort.

Baselines

Create some network flow baselines. You can’t know what’s anomalous until you know what’s normal. A word of caution here, though: don’t assume your baselines are already secure. You might have an existing but previously-unknown compromise. So spend time with your system administrators to identify traffic flows that don’t have an immediately obvious purpose.

What does traffic in and out of your desktop networks look like? These will necessarily differ significantly from your server networks, which need the same sort of attention. What systems usually talk to each other? Do they contact a particular set of authorized external hosts (e.g. for updates and such), especially with a defined frequency? What’s the traffic distribution across various ports? Does this vary with time of day, or day of the week?
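One way to start answering the last two questions is to count destination ports per hour of the day, then compare new traffic against those counts. This Python sketch assumes flow records with an ISO 8601 time field and a dst_port field; both names, and the sample records, are made up for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime


def port_baseline(flows):
    """Return {hour_of_day: Counter({dst_port: count})} for historical flows."""
    baseline = defaultdict(Counter)
    for flow in flows:
        hour = datetime.fromisoformat(flow["time"]).hour
        baseline[hour][flow["dst_port"]] += 1
    return baseline


def unseen_in_baseline(baseline, flows):
    """Return flows whose port never appeared at that hour in the baseline."""
    return [
        flow for flow in flows
        if flow["dst_port"]
        not in baseline.get(datetime.fromisoformat(flow["time"]).hour, {})
    ]


history = [
    {"time": "2011-10-03T09:15:00", "dst_port": 443},
    {"time": "2011-10-03T09:40:00", "dst_port": 443},
    {"time": "2011-10-03T14:05:00", "dst_port": 22},
]
today = [
    {"time": "2011-10-04T09:05:00", "dst_port": 22},   # SSH is new at 9am
    {"time": "2011-10-04T09:06:00", "dst_port": 443},
]
baseline = port_baseline(history)
for flow in unseen_in_baseline(baseline, today):
    print(flow)
```

Remember the caution above: anything already in the history counts as normal here, including an existing compromise.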

You’ll start to build a framework of known good traffic to exclude from future analyses. As the US military teaches, the more you sweat in preparation, the less you may bleed in battle.

Conclusion

Log management matters, but log analysis matters more. Even if you have a relatively limited dataset available, start with what you have. Like tugging on the proverbial sweater thread, you will find that a little effort at the beginning can quickly unravel more than you initially might have guessed.

In the future, I’ll talk about hunting trips through your systems and other types of security data. But at any time, I welcome your thoughts and suggestions!

Overview of incident and threat reporting standards

"..." by Pom²

I’ve spent a lot of time looking into standards for sharing information about incidents as well as detailed threat data lately. As it turns out (and as one would expect), lots of smart people have built some useful tools for sharing this information. So I thought I’d talk a little about what I’ve found and how various standards can work together in a stack.

Lately, the new OpenIOC standard has gotten some discussion. This is an XML schema that one can use to describe specific threat signatures: MD5 hashes, mutexes, registry keys, and the like. If an organization wants to share information categorizing a particular piece of malware, say, or other ways to identify a system that has been compromised by a particular threat, then IOC does that well. It’s the sort of thing that ThreatExpert could use to provide signatures for the malware it analyzes, or an investigator could use to describe artifacts left by a particular attack. I don’t know of other standards that hit this particular pain point, though I’d love for someone to point them out to me.

Now some of us have asked how this compares to IODEF, an IETF standard that describes an entire incident. CIRTs could exchange IODEF information about a particular attack: attacker identities, targeted assets, vulnerabilities and exploits, impact on the affected assets, contact information, etc. In fact, I believe that IOC could fit into IODEF to describe the indicators that can characterize a particular incident, but IODEF includes much more. To use a networking analogy, IOC is to IODEF as HTTP is to TCP. Or to take a law-enforcement approach, IODEF represents the police report for an incident and IOC represents the fingerprints found on the scene.

Some readers will also be familiar with VERIS, an information-sharing framework originally developed by Verizon. Unlike the other two standards, however, VERIS tries to organize the data into high-level metrics: demographics of the victim (e.g. organization type, industry, staff size), A4 incident classification (agent, action, asset, attribute), and that sort of thing. This doesn’t yield actionable intelligence, but it does help us analyze trends in the overall threat landscape. To carry on the previous analogies, VERIS corresponds more to traffic flow statistics or to the FBI Uniform Crime Reports.
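To show how the three layers might stack, here is a toy Python model. It does not use the real OpenIOC, IODEF, or VERIS schemas; every class and field name below is invented purely to illustrate the relationship: indicators nest inside an incident report, and many incident reports roll up into trend statistics.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Indicator:
    """IOC-like layer: one observable artifact (the fingerprint)."""
    kind: str    # e.g. "md5", "mutex", "registry_key"
    value: str


@dataclass
class Incident:
    """IODEF-like layer: the whole report (the police report)."""
    incident_id: str
    victim_industry: str
    action: str  # e.g. "malware", "hacking"
    indicators: List[Indicator] = field(default_factory=list)


def trend_summary(incidents):
    """VERIS-like layer: counts by industry and action (the crime statistics)."""
    return Counter((inc.victim_industry, inc.action) for inc in incidents)


incidents = [
    Incident("2011-001", "retail", "malware",
             [Indicator("mutex", "example-mutex-name")]),
    Incident("2011-002", "retail", "malware"),
    Incident("2011-003", "finance", "hacking"),
]
print(trend_summary(incidents))
```

Notice that the summary throws away the indicators entirely, which is exactly why that layer helps with trends but not with detection.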

All of these standards, and others like them, have a role to play in helping defenders share useful information and collaborate appropriately. In a future post, I’ll talk about some relevant tools that use these standards.