Tag Archives: Amazon Web Services

Introduction to the Collective Intelligence Framework

CIRTs and related organizations often handle incident detection as well as response. Both of these roles produce and consume threat intelligence in different ways. For example, we often want to correlate our network traffic with OSINT indicators (known bad IP addresses and URLs, MD5 hashes of suspicious files, etc.). I’ve started looking at the Collective Intelligence Framework as a way to fulfill these needs. CIF development is sponsored by the REN-ISAC and the National Science Foundation, with most of the coding (and everything else!) handled by Wes Young. Everything is open source for those of us who like – or need – to hack directly on the code.

In this article, I’ll explain CIF, give some usage examples, and discuss test deployment scenarios.

Understanding CIF

From the perspective of a user, CIF allows you to run queries against many data sources at once. If you have other private data sources available, particularly via XML (RSS), JSON, or in a file (e.g. CSV), you can incorporate those, as well as additional OSINT sources. CIF comes preconfigured for:

Use cases include manually querying the database for specific indicators (e.g. “do we have any records for this IP address?”) as well as pulling feeds of various sorts for use by security systems (e.g. “what URLs should we block at the proxy?”). CIF includes concepts of severity and confidence as well as privilege. This allows you to provide feeds of high-confidence public data to some systems while still allowing investigators to query private, unconfirmed data.

Essentially, CIF ingests data – typically on an hourly or daily basis, depending on the source – indexes it on the fly for performance reasons, performs correlation analytics (e.g. so that a URL also turns into domain and IP address information), and then makes it available in feeds via various output plugins. These plugins include tables and HTML for viewing by a user, but also iptables rules, Snort rules, JSON, and CSV for processing by other security systems.

Usage examples

Everything below comes from the Perl client. I haven’t yet dealt with the Python client, much less hacked on it, but that’s coming Soon™.

cif -q infrastructure/malware -c 50 -s medium

gives a fairly large list of IP addresses associated with malware. (I used medium severity and 50% confidence in these examples.)

Even if you don’t use a proxy server, you might find CIF useful for checking suspicious URLs:

cif -q url -c 50 -s medium -p snort

You now have a list of Snort rules to pull into your IDS.
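If you pull a feed like this on a schedule, you probably don’t want a failed query to leave a half-written rules file behind. Here is a minimal sketch of one way to handle that. It is not part of CIF: the function name pull_feed and the output path in the usage line are placeholders of mine, and it uses only the cif options shown in the examples above.

```shell
# Hypothetical helper: fetch a CIF feed and replace the output file only on success.
# Usage: pull_feed QUERY PLUGIN OUTFILE
pull_feed() {
    query=$1
    plugin=$2
    outfile=$3
    tmpfile="${outfile}.tmp.$$"

    # Same confidence (50) and severity (medium) as the examples above.
    if cif -q "$query" -c 50 -s medium -p "$plugin" > "$tmpfile"; then
        mv "$tmpfile" "$outfile"
    else
        rm -f "$tmpfile"
        echo "cif query failed for $query" >&2
        return 1
    fi
}
```

Something like pull_feed url snort /tmp/cif-url.rules would then write the Snort rules to a file, keeping the previous copy if the query fails.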

Or if you have your own list of IP addresses to check, such as when an ongoing case has new indicators, you can put them in a file and query each of them.

while read -r f; do cif -q "$f" >> specific-ip.txt; done < hostlist.txt

This yields another list. You might see a few lines in that example with a “private” restriction and an impact of “search”. This happens because, by default, CIF will log every query for a specific indicator. The fact that other people, such as fellow investigators, have searched for an indicator may have significance apart from any data. However, if you don’t want CIF to log a query, just use the “-n” parameter.
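For a longer indicator list, the loop above can be wrapped up so that it skips empty lines and comments and doesn’t log the searches. This is only a sketch under my own assumptions: query_indicators is a made-up name, and the redirect from /dev/null is a precaution in case the client reads standard input, which I haven’t verified.

```shell
# Hypothetical helper: look up every indicator in a file without logging the searches.
# Usage: query_indicators HOSTLIST OUTFILE
query_indicators() {
    hostlist=$1
    outfile=$2

    # The "|| [ -n ... ]" part handles a last line with no trailing newline.
    while IFS= read -r indicator || [ -n "$indicator" ]; do
        # Skip empty lines and lines starting with '#'.
        case $indicator in
            ''|'#'*) continue ;;
        esac
        # -n asks CIF not to log this query.
        cif -q "$indicator" -n < /dev/null >> "$outfile"
    done < "$hostlist"
}
```

Calling query_indicators hostlist.txt specific-ip.txt appends the results for each indicator to the output file, as the one-liner does.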

If you’d like to play with it some more, contact me for an API key and the address of my semi-public CIF server. Twitter or email both work fine.

Appendix: CIF on the Amazon cloud

Amazon Web Services provide a decent platform for testing CIF or running a public instance like mine. The following assumes some familiarity with Linux administration and at least a basic understanding of the Elastic Compute Cloud (EC2).

You can start with a small instance for the installation, but you’ll quickly want to move to a medium instance at least. I run a large instance using the Ubuntu Cloud Guest server image. In general, follow the server install instructions for CIF. You’ll also want to note the specifics for Ubuntu as they contain a few workarounds you will need. Allocate an Elastic IP and register it in DNS someplace, such as with Amazon Route 53. For the Security Group, only add HTTPS and SSH. You won’t need anything else, and I recommend leaving it at this minimal state for security purposes. You’ll also need an Elastic Block Store. While you can start with 10GB, expect that to grow a few GB per week, so you’ll need to resize from time to time or create a larger volume at the beginning. While not required for CIF installation, I can’t recommend enough that you use git to manage config files. Srsly.
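If you prefer the command line to the web console, the Security Group rules can be sketched roughly as follows. This assumes you have the AWS command line tools installed and configured, which the steps above don’t require. The function name and the group ID in the usage line are placeholders of mine.

```shell
# Hypothetical helper: allow inbound SSH and HTTPS, and nothing else, on a security group.
# Usage: open_cif_ports GROUP_ID [CIDR]
open_cif_ports() {
    group_id=$1
    cidr=${2:-0.0.0.0/0}   # assumption: open to everyone unless a narrower range is given

    for port in 22 443; do
        aws ec2 authorize-security-group-ingress \
            --group-id "$group_id" \
            --protocol tcp \
            --port "$port" \
            --cidr "$cidr" || return 1
    done
}
```

For example, open_cif_ports sg-EXAMPLE 203.0.113.0/24 would open both ports to a single documentation-range network. Substitute your own group ID and address range.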

When installing Postgres, note that “peer” may appear in the original file instead of “ident sameuser”. Also, I did not use the values in the CIF doc, as Postgres didn’t like them. I left everything at the defaults except:

work_mem = 512MB
checkpoint_segments = 32

When setting up BIND9, first check /etc/resolv.conf for the IP addresses you should use as forwarders.
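To save some copying and pasting, something like the following can turn the nameserver lines into a forwarders clause for the BIND options block. Treat it as a sketch: the function name is mine, and it assumes resolv.conf still lists the upstream resolvers, so run it before you point the system at your own BIND instance.

```shell
# Hypothetical helper: print a BIND "forwarders" clause built from resolv.conf.
# Usage: forwarders_from_resolv [FILE]
forwarders_from_resolv() {
    resolv=${1:-/etc/resolv.conf}

    awk '
        # Collect the address from each "nameserver ADDRESS" line.
        $1 == "nameserver" { ips = ips " " $2 ";" }
        END {
            if (ips == "") exit 1          # no nameserver lines found
            print "forwarders {" ips " };"
        }
    ' "$resolv"
}
```

Review the output before pasting it into your configuration, particularly if resolv.conf contains a loopback address.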

Secure HTTP via SSH proxy

Sometimes, your existing outbound connection doesn’t meet your privacy or security needs. Perhaps you need to use a public wifi network and don’t want to log into something that doesn’t support SSL. Or perhaps you want to log into a site and not have it immediately trace back to your IP address. You can achieve these goals by using an SSH proxy from a server in the cloud.

Before you proceed, though, you should always think about your risk model, as you should anytime you consider whether and how to implement a security control.

  • This process does nothing to secure the connection from your shell server to the endpoint. In other words, this will encrypt your traffic on your local connection but not across the wider Internet. If you just want to log into Reddit without allowing somebody to steal your session cookie, this is okay, but do not depend on this to protect activity that could lead to legal problems in the jurisdiction hosting the server.
  • Law enforcement or other legal processes can still identify you, because you’ll usually be using an account tied to your real life identity (assuming you use Amazon Web Services). You will only be anonymous as long as you don’t do something that could get the legal system involved.

The scope of this post does not include addressing the issue of OPSEC for possibly illegal activities and the ethics of documenting that. However, I will note that activists in truly repressive regimes have a need for secure communications. Perhaps I’ll discuss that in more detail in the future.

A number of good tutorials already exist for this, so I don’t need to document the entire process again. Assuming you use a proper operating system (e.g. a Unix derivative like Linux or OS X), then the process literally takes one command:

ssh -D 1337 username@server.example.com

Then configure your browser to use localhost port 1337 as a SOCKS5 proxy. If you must use Windows, then you might check out Kimmo Suominen’s Proxy through SSH document.
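If you’d rather not keep a terminal open, or want to check the tunnel before touching your browser settings, here is a rough sketch using standard ssh and curl options. The two function names are my own, and the host and port are the same placeholders as in the command above.

```shell
# Hypothetical helpers wrapping the ssh command above.

# Open the tunnel in the background.
# -f: go to the background after authentication, -N: run no remote command,
# -D: listen on a local port and act as a SOCKS proxy.
# Usage: start_socks_tunnel USER@HOST [PORT]
start_socks_tunnel() {
    ssh -f -N -D "${2:-1337}" "$1"
}

# Fetch a URL through the tunnel. --socks5-hostname makes curl hand the
# hostname to the proxy, so DNS lookups also happen on the remote side.
# Usage: fetch_via_tunnel URL [PORT]
fetch_via_tunnel() {
    curl --silent --socks5-hostname "localhost:${2:-1337}" "$1"
}
```

After start_socks_tunnel username@server.example.com, fetching any page that echoes your address with fetch_via_tunnel should show the server’s IP rather than your own.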

If you don’t already have access to a shell account someplace, then Amazon Web Services should have you covered. Amazon has a very simple process to set up a new server using their Elastic Compute Cloud (EC2), and you may want one anyway depending on your confidence in your existing shell server’s security. I suggest using the default Amazon Linux image on a micro instance. You can use these at no cost for the first year, after which it comes to less than 10 USD per month. The server costs even less if you stop it when you don’t need it.