At my organization, much of the threat intelligence we receive consists of one-off reports that require some manual processing. But we also receive a weekly consolidated list, which primarily consists of IP addresses, domain names, URLs, file names, and MD5 hashes in a CSV file.
While we’ve considered automating the import of threat intel data into our ArcSight SIEM implementation for some time, until recently we did this manually as well. Manual processing causes several difficulties: staff falling behind due to the volume of other work we perform, individual variations in process, and lack of ongoing correlation. So I decided to implement a small script to make our lives easier.
Originally, I’d planned on modifying ArcOSI for this purpose. ArcOSI scrapes several open-source intel feeds, runs a regex on them for IP addresses and domain names, then streams the data in ArcSight’s Common Event Format (CEF) over syslog for processing and correlation. As it turned out, my data are consistent enough that I didn’t need all the logic and configuration that ArcOSI provides. I ended up gutting the whole thing and writing a Python script more or less from scratch, keeping only the four lines that actually communicate with the Collector via UDP. (The csv module in Python sped this up tremendously.)
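As a rough illustration of those two pieces (reading the CSV and pushing messages to a collector over UDP), here is a minimal sketch. It is not the original script: the column names (id, type, value, date) and the function names are assumptions made for the example, and the real feed's layout will differ.

```python
import csv
import socket


def read_indicators(csv_file):
    """Yield one dict per CSV row, keyed by the header row.

    The column names are whatever the feed's header row provides;
    this sketch assumes hypothetical columns id, type, value, date.
    """
    yield from csv.DictReader(csv_file)


def send_udp(messages, host, port=514):
    """Send each message as its own UDP datagram to host:port.

    UDP is fire-and-forget: there is no acknowledgement, so a
    message that is dropped on the way is lost silently.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for message in messages:
            sock.sendto(message.encode("utf-8"), (host, port))
```

In use, you would open the weekly file with `open(path, newline="")`, pass it to `read_indicators`, turn each row into a message string, and hand the resulting strings to `send_udp` along with the address of your own syslog collector.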
I also found that CEF actually makes this fairly easy. Basically, you just send a syslog message as follows:
CEF:Version|Device Vendor|Device Product|Device Version|Signature ID|Name|Severity|Extension
The Name field should be fairly generic (“Port scan”, not “Port Scan from 10.1.1.1 against 10.2.2.2”). The Extension field actually consists of key-value pairs. ArcSight includes some predefined keys like src and dst (source address and destination address), and you can define additional keys as well as include a detailed message. So, in my script, I emit two signatures: “Known Malicious IP Address” and “Known Malicious Domain Name”. I include some other data in the Extension fields, such as the indicator’s tracking ID and the date that it appeared in our intel feed.
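To make the format concrete, here is a minimal sketch of building such a message in Python. Only the two signature names come from the description above. The vendor, product, and version values, the signature IDs, the severity, the row layout (id, type, value, date), and the choice of the dst, dhost, cs1, and cs2 keys are all assumptions for illustration. The escaping follows the CEF convention of backslash-escaping pipes in header fields and equals signs in extension values.

```python
# Hypothetical device identification; pick values that suit your environment.
DEVICE_VENDOR = "ExampleOrg"
DEVICE_PRODUCT = "IntelFeed"
DEVICE_VERSION = "1.0"

# Indicator type -> (Signature ID, Name, extension key for the indicator).
# The names are from the post; the IDs, type labels, and keys are assumptions.
SIGNATURES = {
    "ip": ("100", "Known Malicious IP Address", "dst"),
    "domain": ("101", "Known Malicious Domain Name", "dhost"),
}


def escape_header(value):
    """Escape backslashes and pipes, which are special in header fields."""
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def escape_extension(value):
    """Escape backslashes, equals signs, and newlines in extension values."""
    text = str(value).replace("\\", "\\\\").replace("=", "\\=")
    return text.replace("\r\n", "\\n").replace("\n", "\\n")


def build_cef(signature_id, name, severity, extension):
    """Assemble one CEF message from header fields and an extension dict."""
    header = [DEVICE_VENDOR, DEVICE_PRODUCT, DEVICE_VERSION,
              signature_id, name, severity]
    header_text = "|".join(escape_header(field) for field in header)
    extension_text = " ".join(
        f"{key}={escape_extension(value)}" for key, value in extension.items()
    )
    return f"CEF:0|{header_text}|{extension_text}"


def indicator_to_cef(row, severity=5):
    """Map one feed row (hypothetical columns) to a CEF message."""
    signature_id, name, key = SIGNATURES[row["type"]]
    extension = {
        key: row["value"],
        "cs1Label": "Tracking ID",
        "cs1": row["id"],
        "cs2Label": "Feed Date",
        "cs2": row["date"],
    }
    return build_cef(signature_id, name, severity, extension)
```

Note that the Name stays generic, as recommended above, while everything specific to the indicator goes into the extension. Rows with a type other than the two listed raise a KeyError in this sketch; a real script would need to decide whether to skip or log them.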
Now we just need to create an Active List in ArcSight ESM for these indicators and correlate appropriately. When an event comes in that correlates to one of these indicators, we’ll generate a notification (at least as a first pass). We also have an engineer building out appropriate reporting, trending, and dashboards.
Think about your own intel feeds and whether it makes sense to use ArcOSI or write your own script to parse the feeds into CEF.




I work in an org that does something similar. I have a daily cron job that polls websites that maintain lists of malicious domains. These are imported directly into an ArcSight list; it’s usually around 70K domains, but it varies daily. We correlate the list against our web filter events to see which hosts in our organization are communicating with known malicious domains. We don’t catch everything this way, certainly, but we do catch lots of stuff.
Do you use a file connector or something else to import into ArcSight?
We do it through a connector with a Velocity template. I set up the original connector, which took in the malicious domains list as events, triggering a rule and populating a list that we could then use for reporting in ArcSight. A more knowledgeable colleague of mine reworked it recently, so the connector now has a Velocity template and the malicious domains are added directly to a list without the need for a rule. Exactly how that works is still beyond me.
Cool, I’ll look into that. I feel like I’m still coming up to speed with the package as well, although we have an outside vendor that helps us manage it.
You should definitely check out CIF (the Collective Intelligence Framework): http://code.google.com/p/collective-intelligence-framework/