This post is the tenth in a series, 12 Days of HaXmas, where we take a look at some of the more notable advancements and events in the Metasploit Framework over the course of 2014.

The Metasploit Framework uses operating system and service fingerprints for automatic target selection and asset identification. This blog post describes a major overhaul of the fingerprinting backend within Metasploit and how you can extend it by submitting new fingerprints.

Historically, Metasploit wasn't great at fingerprinting. Shortly after the Rapid7 acquisition, we added an internal fingerprinting system to the framework, but we still depended on imports from Nexpose, Nmap, and other external tools to obtain comprehensive results. The only areas where fingerprint coverage was passable were the SMB, HTTP, and web browser rules, since many modules depended on these for automatic configuration. Metasploit has the ability to import data from dozens of external sources, including web application scanners, vulnerability scanners, and even raw PCAP files. Normalizing all of this data was a challenge and the fingerprinting backend had the job of squashing conflicting OS and service names into something that modules could easily understand.

By mid-2013, Metasploit's fingerprints were getting stale and the ruleset was becoming more tangled than ever. Changing one fingerprint required carefully reviewing all of the code paths where a conflicting rule might override the resulting value. New operating systems and services were released and the backend simply wasn't keeping up. For our Metasploit Pro customers, this was less of an issue due to the direct integration with Nexpose and Nmap, but we needed a fresh approach all the same.

Earlier in 2013, my team was looking at whether we could improve our products using existing internet-wide scan data. Our first project involved an overhaul of the Nexpose SNMP fingerprints by leveraging the Critical.IO dataset. Nexpose fingerprints are stored as a series of regular expressions within XML files. These fingerprints were easy to read, write, and test. Over the course of a week we were able to expand Nexpose's SNMP system description fingerprints to cover approximately 85% of the devices found on the internet by the Critical.IO SNMP scan. This was a quick win and made it clear that we should be looking at internet scan data as a primary source of new fingerprints.

In 2014, we took the same approach using the Project Sonar data to add fingerprints for popular HTTP services. Our approach was to sort the raw scan data by frequency, determine which fingerprints would cover the largest number of systems, and then sit down and write those fingerprints. This work improved fingerprint accuracy for our Nexpose customers and provided an opportunity to do targeted vulnerability research on the most widely exposed devices and services. The issues with the Metasploit fingerprints remained, but a plan was starting to come together.
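The frequency-first approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tooling we actually used: the `rank_banners` helper is hypothetical, and the sample banners are made up rather than taken from Project Sonar. It counts identical banners, sorts them by frequency, and reports the cumulative share of systems that would be covered if a fingerprint were written for each banner in turn.

```python
from collections import Counter


def rank_banners(banners):
    """Return (banner, count, cumulative_share) tuples, most common first."""
    # Ignore blank lines and surrounding whitespace before counting.
    counts = Counter(b.strip() for b in banners if b.strip())
    total = sum(counts.values())
    ranked = []
    running = 0
    for banner, count in counts.most_common():
        running += count
        # Share of all observed systems covered once this banner is handled.
        ranked.append((banner, count, running / total))
    return ranked


# Made-up sample data for illustration only, not real scan results.
sample = [
    "ExampleHTTPd/1.0",
    "ExampleHTTPd/1.0",
    "ExampleHTTPd/1.0",
    "DemoRouter httpd",
    "DemoRouter httpd",
    "TinyCam/0.9",
]

for banner, count, share in rank_banners(sample):
    print(f"{count:3d}  {share:6.1%}  {banner}")
```

Reading the list from the top down shows where the next fingerprint buys the most coverage, which is the ordering we worked through by hand.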

First, we had to get sign-off to open source the Nexpose fingerprint database. Next, we had to write some wrapper code that made interfacing with and testing these fingerprints quick and painless. Finally, we had to rip out the existing Metasploit fingerprinting engine, normalize the entire framework to use the new fingerprints, and add some glue code to map Nexpose conventions to what Metasploit expected. This required a major effort across the Nexpose, Metasploit, and Labs teams and took the better part of five months to finally deliver.

The result was Recog, an open source recognition framework. Recog is now the upstream for both Nexpose and Metasploit fingerprints. We will continue to leverage Project Sonar to add and improve fingerprints, but even better, our customers and open source users can now submit new fingerprints of their own. Recog is available under a BSD 2-Clause license and can be used within your own projects, open source or otherwise. Although the test framework is written in Ruby, the XML fingerprints are easy to process in just about every language.
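To give a feel for how little code it takes to consume regex-in-XML fingerprints outside of Ruby, here is a minimal Python sketch. Treat every detail as an assumption: the XML snippet is a simplified, hypothetical fingerprint modeled loosely on this style (check the Recog repository for the real schema), and `load_fingerprints` and `match_banner` are names invented for this example, not part of Recog.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical fingerprint file. Element and attribute names are assumptions
# made for this sketch; consult the Recog repository for the actual format.
SAMPLE_XML = r"""
<fingerprints matches="example.banner">
  <fingerprint pattern="^ExampleOS (\d+\.\d+) SNMP agent$">
    <description>ExampleOS SNMP agent (hypothetical)</description>
    <param pos="0" name="os.vendor" value="ExampleCorp"/>
    <param pos="0" name="os.product" value="ExampleOS"/>
    <param pos="1" name="os.version"/>
  </fingerprint>
</fingerprints>
"""


def load_fingerprints(xml_text):
    """Parse the XML into a list of (compiled_regex, params) rules."""
    root = ET.fromstring(xml_text.strip())
    rules = []
    for fp in root.findall("fingerprint"):
        params = [
            (int(p.get("pos")), p.get("name"), p.get("value"))
            for p in fp.findall("param")
        ]
        rules.append((re.compile(fp.get("pattern")), params))
    return rules


def match_banner(rules, banner):
    """Return the fields from the first matching rule, or None."""
    for regex, params in rules:
        m = regex.search(banner)
        if m is None:
            continue
        result = {}
        for pos, name, value in params:
            # In this sketch, pos 0 means a fixed value and pos N means
            # "take capture group N from the regex match".
            result[name] = value if pos == 0 else m.group(pos)
        return result
    return None


rules = load_fingerprints(SAMPLE_XML)
print(match_banner(rules, "ExampleOS 4.2 SNMP agent"))
```

The first rule that matches wins in this sketch; a real consumer may want to handle ordering, flags, and certainty differently.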

Metasploit users benefit through consistent formatting of third-party data imports, better fingerprinting when using scanner modules, and support for targeting newer operating systems and web browsers. Nexpose users will continue to see improvements to fingerprinting, with several major leaps in coverage as Project Sonar progresses. Metasploit contributors can take advantage of the new fingerprint.match note type to provide fingerprint suggestions to the new matching engine. If you are interested in the mechanics of how Metasploit interfaces with Recog, take a look at the OS normalization code in MDM.

Recog is a great example of Rapid7's commitment to open source and our desire to collaborate with the greater information security community. Although writing fingerprints isn't the most exciting task, accurate fingerprints are a requirement for reliable vulnerability assessments and successful penetration tests. If you are looking for a chance to contribute to Metasploit, or simply want better fingerprinting for systems within your own network, please consider submitting updates to Recog. Feel free to drop by the #metasploit channel on the Freenode IRC network if you would like to chat with the development team. If you have a new fingerprint but don't feel comfortable sending a pull request, feel free to file an Issue within the Recog repository on GitHub instead.

-HD