Having spent a lot of my career managing hundreds of Linux servers at a time, I can honestly say that the part I miss the least is running e-mail services. When you run your own e-mail servers, not only do you have to manage half a dozen services, but you also deal with all of the crazy mechanisms to filter spam, keep your IP addresses off blacklists, and ensure deliverability of outbound e-mail. Because spammers and criminals love to leverage well-known and respected domain names, it's pretty common for them to spoof a domain when sending e-mail to help trick users into clicking links and sending details back to the bad guys.
Not all hope is lost, though, as spam filtering technologies have gotten a lot better over the past few years and new protocols exist to help determine whether received e-mail actually came from where it claims, without having to dive into a deep nest of e-mail headers. One such popular protocol is called Sender Policy Framework (SPF). With SPF, domain owners publish a DNS TXT record in a special format that tells third parties which IP addresses and hostnames are expected to send e-mail on their behalf. Effectively, DNS is being used as a trust anchor to say, "I control myfavoritedomain.com and I am telling you that only x.x.y.y and x.x.y.z should be sending e-mail on behalf of this domain," in much the same way a DNS MX record tells an e-mail server where to deliver e-mail for that domain.
As an example, here is Google's SPF entry.
google.com. 3600 IN TXT "v=spf1 include:_spf.google.com ip4:126.96.36.199/31 ip4:188.8.131.52/31 ~all"
It's worth pointing out for intrepid readers that there is a dedicated DNS SPF record type; however, it is deprecated in favor of using a TXT record instead. Weird, I know.
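To make the record format concrete, here is a minimal Python sketch that splits an SPF string into its terms. The `Term` type and `parse_spf` function are names invented for this illustration, the sample record uses documentation-reserved placeholder values, and modifiers such as `redirect=` are simply skipped, so treat it as a reading aid rather than a full parser.

```python
from typing import NamedTuple


class Term(NamedTuple):
    qualifier: str  # "+" (pass, the default), "-" (fail), "~" (softfail), "?" (neutral)
    mechanism: str  # e.g. "include", "ip4", "a", "mx", "all"
    value: str      # text after ":" or a "/prefix" length; may be empty


def parse_spf(record: str) -> list[Term]:
    """Split an SPF TXT string into (qualifier, mechanism, value) terms."""
    tokens = record.split()
    if not tokens or tokens[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    terms = []
    for token in tokens[1:]:
        # Modifiers (redirect=, exp=) use "=" before any ":"; skipped in this sketch.
        if "=" in token.split(":", 1)[0]:
            continue
        qualifier = "+"
        if token[0] in "+-~?":
            qualifier, token = token[0], token[1:]
        name, sep, value = token.partition(":")
        if not sep and "/" in name:
            # Forms like "a/24" carry only a prefix length.
            name, _, prefix = name.partition("/")
            value = "/" + prefix
        terms.append(Term(qualifier, name.lower(), value))
    return terms


# Placeholder record, not a real domain's policy.
example = "v=spf1 include:_spf.example.com ip4:192.0.2.0/24 ~all"
terms = parse_spf(example)
```

Each term reads left to right: an optional qualifier, a mechanism name, and an optional argument. The trailing `~all` is the catch-all that says what to do with senders that matched nothing earlier.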
Understanding Why SPF is Useful for OSINT
As a protocol built on openly sharing who can send e-mail on your domain's behalf, SPF can be a useful source for open-source intelligence (OSINT) gathering. Through a simple DNS query (e.g. dig google.com TXT) an attacker can quickly view your domain's SPF record to better understand the third-party service providers you may rely on. If you're not very familiar with SPF, you may be wondering why this is any more useful than an MX record. While many domains may well send e-mail from the same server that receives it, the age of cloud computing has changed that for many organizations. Think about the dozens of services your company may leverage, from Constant Contact to Amazon Web Services to Zendesk. Each of these organizations is likely to send e-mail for your domain at some point and therefore should be listed in your SPF record if you want to help keep that e-mail out of someone's spam folder.
It's important, however, to realize that in many cases e-mail can be relayed through an existing mail server, which removes the need for an organization to add anything to its SPF record. Because of this, SPF is definitely not a perfect predictor of where all e-mail may be sent from, but it's certainly another valid source of OSINT data and can provide knowledge you'd otherwise never have.
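The dig query mentioned above can be wrapped in a few lines of Python. This is a sketch that assumes the `dig` binary is installed and on the PATH; `spf_from_dig_output` and `lookup_spf` are hypothetical helper names. Long TXT records come back from `dig +short` as several quoted chunks on one line, so the chunks are joined before checking for the SPF version tag.

```python
import shlex
import subprocess


def spf_from_dig_output(output: str) -> list[str]:
    """Pick SPF records out of `dig +short TXT` output."""
    records = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # A single TXT record may be printed as "chunk one" "chunk two".
        text = "".join(shlex.split(line))
        lowered = text.lower()
        if lowered == "v=spf1" or lowered.startswith("v=spf1 "):
            records.append(text)
    return records


def lookup_spf(domain: str) -> list[str]:
    """Run dig for a domain's TXT records and return any SPF records found."""
    result = subprocess.run(
        ["dig", "+short", "TXT", domain],
        capture_output=True, text=True, check=True, timeout=10,
    )
    return spf_from_dig_output(result.stdout)


# Usage (needs network access): lookup_spf("google.com")
```

Keeping the parsing separate from the subprocess call makes it easy to run the same extraction over saved dig output when working through a long list of domains.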
Looking at the Moz Top 500 Domains for SPF Data
Having run across LinkedIn's SPF record one day, I noticed that they used DocuSign, an online document signing service that many businesses use to skip the hassle of fax machines for signatures. I thought this was rather interesting because it gave away a really powerful piece of information about their organization. If I were doing a penetration test, I'd likely create a cloned version of DocuSign and try to phish employees into logging in to sign some important document. This is just the common-sense approach any penetration tester (or criminal) would take with OSINT like this.
After pondering what SPF could mean for OSINT, I thought, "sure, let's do this at scale and see what we can find that may be interesting." This led me down a path of endless shell scripting, Unix tool use, manual clean-up, and DNS queries galore, trying to suss out data I thought was worth publishing in a blog post. By starting with the Moz Top 500 Domain list I figured I'd at least have a chance to get some bigger-name sites that would be more likely to utilize SPF to ensure their e-mail delivery went well. As it turns out, around 225 out of 500 domains either didn't use an SPF record or had a record that only pointed directly back at their own hosting. It's worth noting that an SPF record can very well just point at the domain itself (A, PTR, MX, etc.) and not reference third-party senders at all. By removing these results, I quickly cut down the number of domains to review.
I assure you, there's no glamor in dealing with the wildly formatted hell that is a TXT record. While SPF syntax is well defined, the order of elements and sometimes recursive nature of an SPF record can lead to all sorts of parsing sadness. Oh, and once you get data, remember, a lot of it may be actual IP addresses or ranges, with no associated PTR records. This leads to having some hostnames (easy), some resolution of PTR records (less easy), and then WHOIS data for IP addresses or ranges not obviously attributable to a domain (sorrow). It takes a lot of manual effort to work out which domains you want to look for and then to combine the possible grep values for each provider (e.g. Google net blocks, Postini hostnames, and Google hostnames are all GMail... probably). What we're left with is a mash-up of results that then needs to be culled to find services worth examining.
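The filtering step of dropping records that only point back at the domain itself could look something like the following Python sketch. It is not the shell pipeline actually used for this post; `third_party_references` is a made-up name, the sample domains are placeholders, and ip4/ip6 ranges are deliberately ignored because attributing them needs PTR or WHOIS lookups.

```python
NAMED_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}


def third_party_references(domain: str, record: str) -> list[str]:
    """Return hostnames in an SPF record that fall outside `domain`."""
    domain = domain.rstrip(".").lower()
    refs = []
    for token in record.split()[1:]:
        token = token.lstrip("+-~?")
        if token.lower().startswith("redirect="):
            target = token[len("redirect="):]
        else:
            name, sep, value = token.partition(":")
            if not sep or name.lower() not in NAMED_MECHANISMS:
                # Bare a/mx/ptr refer to the domain itself; ip4/ip6/all carry no name.
                continue
            target = value.split("/")[0]
        target = target.rstrip(".").lower()
        if target != domain and not target.endswith("." + domain):
            refs.append(target)
    return refs


# Placeholder records: the first only references its own domain.
own_only = third_party_references("example.com", "v=spf1 a mx include:_spf.example.com -all")
outside = third_party_references("example.com", "v=spf1 mx include:mail.example.net ip4:192.0.2.1 ~all")
```

A domain whose result is an empty list and whose record has no unexplained IP ranges is a candidate for removal from the review pile.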
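The "combine possible grep values" step amounts to mapping many hostname patterns onto one provider name. Below is a rough Python sketch of that idea. The suffix lists in `PROVIDER_PATTERNS` are illustrative assumptions, not the patterns actually used to build the results in this post, and the input domains are placeholders.

```python
from collections import defaultdict

# Illustrative suffixes only (assumed for this sketch); a real list would be
# longer and built up by hand from PTR and WHOIS results.
PROVIDER_PATTERNS = {
    "Google": ("google.com", "googlemail.com", "postini.com"),
    "SendGrid": ("sendgrid.net",),
    "Zendesk": ("zendesk.com",),
}


def provider_for(hostname: str):
    """Return the provider whose suffix matches the hostname, or None."""
    host = hostname.rstrip(".").lower()
    for provider, suffixes in PROVIDER_PATTERNS.items():
        if any(host == s or host.endswith("." + s) for s in suffixes):
            return provider
    return None


def group_by_provider(references: dict) -> dict:
    """Map provider name to the sorted domains whose SPF hosts matched it."""
    grouped = defaultdict(set)
    for domain, hosts in references.items():
        for host in hosts:
            provider = provider_for(host)
            if provider:
                grouped[provider].add(domain)
    return {provider: sorted(domains) for provider, domains in grouped.items()}


# Hypothetical input: domain -> hostnames pulled from its SPF record.
grouped = group_by_provider({
    "example.com": ["_spf.google.com", "sendgrid.net"],
    "example.org": ["mail.zendesk.com", "aspmx.googlemail.com"],
    "example.net": ["spf.example.net"],
})
```

Hostnames that match no pattern are silently dropped here; in practice those leftovers are exactly the ones that need manual PTR and WHOIS digging.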
Certain IP ranges may belong to a telecom provider who is only acting as the Internet link and may have little to do with the actual organization using it. To this end, I pulled out 30 companies that I think are fairly representative of the overall data set and represent over 300 examples of high-traffic web sites using third-party service providers to handle (at least in part) their e-mail delivery.
|Provider|Category|Count|Domains|
|---|---|---|---|
| | |98|acquirethisname.com, addthis.com, answers.com, bbb.org, berkeley.edu, bigcartel.com, blogger.com, bloglovin.com, boston.com, businessinsider.com, buzzfeed.com, canalblog.com, cbc.ca, cloudflare.com, cnet.com, creativecommons.org, devhub.com, deviantart.com, digg.com, disqus.com, dropbox.com, economist.com, edublogs.org, engadget.com, etsy.com, eventbrite.com, examiner.com, fastcompany.com, friendfeed.com, ft.com, github.com, globo.com, goodreads.com, hatena.ne.jp, huffingtonpost.com, ihg.com, indiegogo.com, intel.com, jalbum.net, jiathis.com, jimdo.com, kickstarter.com, last.fm, liveinternet.ru, mashable.com, mediafire.com, meetup.com, mozilla.com, mozilla.org, multiply.com, nationalgeographic.com, netlog.com, netvibes.com, networkadvertising.org, nps.gov, nytimes.com, opera.com, over-blog.com, pcworld.com, photobucket.com, pinterest.com, printfriendly.com, reddit.com, salon.com, scribd.com, sfgate.com, smh.com.au, smugmug.com, soundcloud.com, soup.io, spotify.com, squarespace.com, squidoo.com, studiopress.com, stumbleupon.com, techcrunch.com, technorati.com, ted.com, tinypic.com, tinyurl.com, topsy.com, tumblr.com, twitter.com, usgs.gov, ustream.tv, utexas.edu, vinaora.com, vk.com, vkontakte.ru, weather.com, weebly.com, wikia.com, wikimedia.org, wix.com, yale.edu, ycombinator.com, zdnet.com, zimbio.com|
|SendGrid| |30|about.me, bandcamp.com, bloglovin.com, booking.com, boston.com, buzzfeed.com, cpanel.net, huffingtonpost.com, hugedomains.com, indiegogo.com, instagram.com, kickstarter.com, mapquest.com, mediafire.com, moonfruit.com, nytimes.com, nyu.edu, photobucket.com, pinterest.com, printfriendly.com, prlog.org, quantcast.com, sfgate.com, slideshare.net, springer.com, squarespace.com, storify.com, technorati.com, tumblr.com, wikia.com|
|Zendesk|Helpdesk|26|aol.com, biblegateway.com, cdbaby.com, cloudflare.com, constantcontact.com, dailymotion.com, devhub.com, deviantart.com, digg.com, dropbox.com, dyndns.org, github.com, indiegogo.com, jalbum.net, jimdo.com, kickstarter.com, moonfruit.com, nbcnews.com, ning.com, over-blog.com, photobucket.com, scribd.com, shop-pro.jp, wikia.com, woothemes.com|
|Microsoft| |22|arstechnica.com, census.gov, dailymotion.com, discovery.com, fema.gov, foxnews.com, howstuffworks.com, ihg.com, istockphoto.com, mtv.com, newyorker.com, npr.org, pcworld.com, rakuten.co.jp, redcross.org, shutterfly.com, slate.com, unicef.org, utexas.edu, washingtonpost.com, wiley.com, wired.com|
|AWS SES| |19|adobe.com, amazon.com, delicious.com, foo, foxnews.com, huffingtonpost.com, i2i.jp, ihg.com, instagram.com, mashable.com, moonfruit.com, npr.org, salon.com, smh.com.au, smugmug.com, topsy.com, washingtonpost.com, weather.com, wikispaces.com|
|Mandrill| |16|addthis.com, cloudflare.com, delicious.com, jimdo.com, nationalgeographic.com, printfriendly.com, reverbnation.com, scribd.com, slate.com, smugmug.com, soundcloud.com, ted.com, themeforest.net, tinyurl.com, topsy.com, woothemes.com|
|Sailthru|Marketing|15|aol.com, behance.net, businessinsider.com, delicious.com, digg.com, economist.com, engadget.com, examiner.com, forbes.com, huffingtonpost.com, indiegogo.com, mashable.com, sfgate.com, slate.com, techcrunch.com|
|SoftLayer|Hosting|15|addthis.com, altervista.org, archive.org, dell.com, disqus.com, histats.com, hostgator.com, purevolume.com, scribd.com, slideshare.net, statcounter.com, studiopress.com, topsy.com, wikispaces.com, yelp.com|
|Rackspace|Hosting|11|cargocollective.com, eventbrite.com, huffingtonpost.com, mashable.com, nationalgeographic.com, npr.org, oakley.com, theatlantic.com, ustream.tv, wikispaces.com, zimbio.com|
|Mailgun| |10|bigcartel.com, booking.com, deviantart.com, disqus.com, ft.com, imgur.com, ustream.tv, weebly.com, wix.com, ycombinator.com|
|Marketo|Marketing|10|blogtalkradio.com, dropbox.com, ft.com, latimes.com, moonfruit.com, quantcast.com, seattletimes.com, surveymonkey.com, ustream.tv, washingtonpost.com|
|AWS EC2|Hosting|8|adobe.com, constantcontact.com, foo, i2i.jp, multiply.com, tiny.cc, twitter.com, typepad.com|
|SalesForce|CRM|6|accuweather.com, bizjournals.com, constantcontact.com, lulu.com, spotify.com, surveymonkey.com|
|AuthSMTP| |6|eventbrite.com, ft.com, nytimes.com, surveymonkey.com, weather.com, wikispaces.com|
|Proofpoint| |5|instagram.com, nbcnews.com, newsvine.com, yellowpages.com|
|Peer 1| |4|edublogs.org, squarespace.com, wordpress.com, wordpress.org|
|Blackbaud|CRM|3|berkeley.edu, redcross.org, washington.edu|
|Constant Contact| |2|aol.com, foo|
|1&1 Internet|Hosting|2|1und1.de, artisteer.com|
Hopefully this data is interesting to a few of you. For those reading this who do penetration testing for a living, DNS is surely a big part of your reconnaissance phase, but perhaps SPF wasn't one of the areas you had looked deeply into before. If your organization uses SPF, think about what this data could mean to your information security if leveraged by a skilled attacker. If your third-party provider supports relaying through your own mail servers via authenticated SMTP, it may be worthwhile going that route rather than having them send e-mail on your behalf. Not only will this likely help e-mail deliverability further, but it also reduces your exposure online.
If you're looking into implementing SPF, I've got to recommend you use a validator, as it's very easy to screw up your SPF record. Also, don't forget to create a Domain-based Message Authentication, Reporting & Conformance (DMARC) policy, which helps recipients determine what to do when e-mail sent on your domain's behalf comes from a sender that is at odds with your SPF record. SPF doesn't have any inherent mechanism to do this, so DMARC will provide that information. Lastly, check out DomainKeys Identified Mail (DKIM), which also ties into DMARC and uses digital signatures to validate that e-mail came from your domain, also leveraging DNS as a trust anchor.
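As a rough idea of what a validator looks for, here is a small Python sketch of a few sanity checks. `lint_spf` is a hypothetical helper and no substitute for a real validator: it only counts DNS-querying terms at the top level of the record, whereas the limit of 10 in the SPF specification (RFC 7208) also covers lookups triggered inside included records.

```python
LOOKUP_TERMS = {"include", "a", "mx", "ptr", "exists"}
MAX_LOOKUPS = 10  # limit on DNS-querying terms per SPF check (RFC 7208)


def lint_spf(record: str) -> list[str]:
    """Return a list of human-readable problems found in an SPF record."""
    tokens = record.split()
    if not tokens or tokens[0].lower() != "v=spf1":
        return ["record does not start with v=spf1"]
    problems = []
    lookups = 0
    seen_all = False
    has_redirect = False
    for token in tokens[1:]:
        bare = token.lstrip("+-~?").lower()
        if bare.startswith("redirect="):
            has_redirect = True
            lookups += 1
            continue
        if "=" in bare.split(":", 1)[0]:
            continue  # other modifiers such as exp=
        if seen_all:
            problems.append(f"'{token}' comes after 'all' and is never evaluated")
            continue
        name = bare.split(":", 1)[0].split("/", 1)[0]
        if name in LOOKUP_TERMS:
            lookups += 1
        elif name == "all":
            seen_all = True
            if token[0] not in "-~?":
                problems.append("'all' without -, ~ or ? lets any host pass")
    if lookups > MAX_LOOKUPS:
        problems.append(f"{lookups} DNS-querying terms exceed the limit of {MAX_LOOKUPS}")
    if not seen_all and not has_redirect:
        problems.append("no 'all' mechanism or redirect modifier at the end")
    return problems


# Placeholder record; an empty list means none of these checks fired.
issues = lint_spf("v=spf1 include:_spf.example.com ip4:192.0.2.0/24 ~all")
```

An empty result only means these particular checks passed; syntax errors in individual mechanisms and nested include chains still need a proper validator.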