Last updated at Thu, 20 Jul 2017 17:43:27 GMT

Try this experiment. Go to your favorite search engine and type this:

“no evidence” security compromise

(Other variations are also interesting, such as adding words like “breach.”)

There is something about the phrase “no evidence” that troubles me. You may have noticed the same thing. Organizations regularly say that there is no evidence of compromise, and no evidence that attackers gained access to user, customer, or employee data. They write these phrases to soften the blow during what is surely a very hard time internally for the responders, and to reassure those of us who worry that we've been affected.

For me, it does the reverse. It makes me worry more because it tells me about the state of preparedness of that organization. The phrase “no evidence” could mean everything from “we have tons of evidence and we're sifting through it, but the probability of the attackers accessing your data is very low,” all the way to “we don't collect data for use by incident responders, so who knows?”

Simply put, absence of evidence is not evidence of absence. In fact, the debate term for treating a lack of evidence as proof is “argument from ignorance.” That's not very reassuring.

What phrases should we be reading instead? What would make us feel better about the investigation? We should be seeing statements like “We have evidence that there was no compromise.” Variations might include “We keep detailed logs of network, system, and user activity. Although the results are preliminary, we conclude that the attackers were not able to access your data.” Or maybe “We were able to trace the attackers' activities over the past year, and understand the attack. They never had access to your data.”

Now if you're reading this blog, and you know how attacks work, you know that those phrases are unlikely. More likely phrases by teams armed with evidence would be “We know which users were affected and are taking appropriate steps to notify them,” or “Attackers were able to access only the following data…” Even though those phrases would indicate that the attackers accessed confidential data, they would be reassuring because they would allow appropriate action by all parties. Incident response teams would know how to re-assess their risk, and adjust technologies and processes. Customers and users would have the information they need to better protect themselves.

How does an organization get to the point where it can confidently and honestly say it has evidence? The bottom line is you have to assume you'll be breached. When you assume you will be breached, you'll behave differently than if you assume your defenses will be sufficient.

Collecting the evidence required to make strong statements after a breach (or a suspected one!) and storing it for months or even years can be a challenge. How long should you keep pcaps, netflow data, and OS/application logs? How much would it cost? What about the security of all that data? I've talked to numerous teams who strongly assert it would be too expensive. But few can show me the spreadsheets mapping out the costs, the assumptions, and a creative look at the task of gathering and storing data. Worse, the missing analysis implies that they haven't given the executive stakeholders the opportunity to make a business decision on the subject.

My suggestion is to have the debate. Don't set out to show that it's not cost effective. Ask instead how you could define the problem statement to make it cost effective. What trade-offs would start to make the problem look solvable? What assumptions can you change to make it more feasible? For example, what if you didn't store SSL pcaps, but just the netflow data? What if you store netflow data much longer than full pcaps?
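To make that debate concrete, here is a rough sketch of the kind of back-of-the-envelope model a team could start from. Every number in it is a made-up placeholder, not a measurement or a vendor price: the daily volumes, the retention periods, and the cost per terabyte are all assumptions you would replace with your own. The names (DataSource, stored_tb, monthly_cost) are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    gb_per_day: float     # assumed average daily volume (placeholder)
    retention_days: int   # how long each day's data is kept


def stored_tb(source: DataSource) -> float:
    """Steady-state storage footprint in terabytes (1 TB = 1000 GB)."""
    return source.gb_per_day * source.retention_days / 1000


def monthly_cost(sources: list[DataSource], usd_per_tb_month: float) -> float:
    """Monthly storage bill once every source has filled its retention window."""
    return sum(stored_tb(s) for s in sources) * usd_per_tb_month


# Placeholder inputs: swap in your own traffic volumes and storage pricing.
USD_PER_TB_MONTH = 20.0

# Option A: keep everything, including full pcaps, for a year.
full_capture = [
    DataSource("full pcaps", gb_per_day=500.0, retention_days=365),
    DataSource("netflow", gb_per_day=5.0, retention_days=365),
]

# Option B: keep full pcaps briefly, but keep netflow for two years.
tiered = [
    DataSource("full pcaps", gb_per_day=500.0, retention_days=14),
    DataSource("netflow", gb_per_day=5.0, retention_days=730),
]

for label, plan in [("full capture", full_capture), ("tiered", tiered)]:
    total_tb = sum(stored_tb(s) for s in plan)
    cost = monthly_cost(plan, USD_PER_TB_MONTH)
    print(f"{label}: {total_tb:.2f} TB stored, ${cost:,.2f} per month")
```

The point is not the specific totals, which depend entirely on the placeholder inputs. It is that once the assumptions are written down, changing one of them (a shorter pcap window, a longer netflow window, a cheaper storage tier) becomes a quick experiment that stakeholders can weigh as a business decision.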

As you think about how the NIST Cybersecurity Framework treats the continuous functions of Identify, Protect, Detect, Respond, and Recover, are you giving enough consideration to the latter functions? Are you building a cross-functional team to write breach runbooks, and to dry-run test them when you read about breaches in the press? Are you testing your detection capabilities?

In short, I'm hoping to see fewer blog posts that assume a lack of data is acceptable. It's not.

Have thoughts? Drop me a line on Twitter at @boblord.