September 13, 2017

The end of risk


This was originally posted on Blogger here.

Introduction

I think risk is hurting infosec and may need to go.

First, a quick definition of risk:

Risk is the probable frequency and probable magnitude of future loss.
(Note: if this is not your definition of risk, the rest of this blog is probably going to make much less sense. Unfortunately, why this is the definition of risk is outside the scope of this blog, so that discussion will have to wait.)

In practice, risk is a way to measure security: it measures the likelihood and impact of something bad happening that could have been prevented by information security.
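Since risk is defined as probable frequency and probable magnitude, it can be estimated directly. Here is a minimal sketch of that arithmetic as a Monte Carlo simulation; the ranges are entirely made-up, illustrative assumptions, not a real assessment:

```python
import random

# Risk = probable frequency and probable magnitude of future loss.
# A simple Monte Carlo over both gives an annualized loss estimate.
# All ranges below are made-up, illustrative assumptions.
def simulate_annual_loss(max_events=4, loss_min=10_000, loss_max=250_000):
    events = random.randint(0, max_events)  # loss events in one simulated year
    return sum(random.uniform(loss_min, loss_max) for _ in range(events))

trials = [simulate_annual_loss() for _ in range(100_000)]
print(f"estimated mean annual loss: ${sum(trials) / len(trials):,.0f}")
```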

Outcomes

That last line is a bit nebulous though, right?  Risk measures the opposite of what we're doing.  So let's better define what we're doing.  Let's call what we're doing an outcome: the end result of our structure and processes (for example, in healthcare, heart disease is a general term for negative cardiovascular outcomes).  Next, let's define the outcome we want:

The ideal outcome of infosec is minimizing infosec, attack and defense.

(Per a statistically insignificant and heavily biased survey. Obviously.  Why is this a good outcome? See the addendum.)

Measures, Markers, and Key Risk Indicators

Where we can directly measure this outcome, we call it a 'measure' (for example, ejection fraction for heart disease) and life is a lot easier.  For risk, we have to use surrogate markers (for example, cholesterol for heart disease), sometimes called Key Risk Indicators in risk terms.  So when we say 'risk', we normally mean the indicators we use to predict risk.  The most well-respected methodology is FAIR, though if you are currently using an Excel spreadsheet with a list of questions for your risks, you can easily improve simply by switching to Binary Risk Assessment.

The problems with risk

The first problem with risk is not with risk per se, but with the surrogate markers we measure to predict risk.  In other fields, such as medicine, a surrogate marker would not be used before non-controversial studies had linked it to the outcome.  In security, I'm not aware of any study showing that the surrogate markers we measure to determine risk actually predict the outcome in a holistic way.  In my opinion, there's a specific reason:
Because there are more legitimate targets than attackers, what determines an organization's outcome (at least on the attack side) is attacker choice.
You can think of it like shooting fish in a barrel.  Your infosec actions may take you out of the barrel, but you'll never truly know whether you're out of the barrel or you're in the barrel and just weren't targeted.  This, I think, is the major weakness in risk as a useful marker of outcomes.

And this ignores multiple additional issues with risk, such as interrelationships between risks, the impact of rational actors, and the difficulty of capturing context, let alone problems in less mature risk processes that are solved in mature ones such as FAIR.

The second problem with risk is related to defense:
Risk does not explicitly help measure minimization of defense.  It tells us nothing about how to decrease (or minimally increase) the use of resources in infosec defense.   
I suspect we then implicitly apply an equation that Tim Clancy brought to my attention as foundational in the legal field: Burden < Cost of Injury × Probability of Occurrence (i.e., if Burden < Risk, pay the burden).  It sounds good in theory but is fraught with pitfalls.  The most obvious pitfall is that it doesn't scale.  Attacks are paths, and the paths do not exist in isolation: at any given step, the attacker can choose to go left or right, in effect creating more paths than can be counted.  As such, while any one burden might be affordable, the sum of all burdens would bankrupt the organization.
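A toy calculation makes the scaling problem concrete. The numbers below are hypothetical, chosen only to show how a per-path test that always says "pay the burden" adds up once paths branch:

```python
# Hypothetical numbers, purely illustrative.
# The legal-style test: pay the burden whenever Burden < P * L.
probability = 0.01      # chance this one attack path is ever used
loss = 1_000_000        # cost of injury if the attack succeeds
burden = 5_000          # cost to defend this one path

print(burden < probability * loss)  # True: each path looks worth defending

# But attack paths branch. With just 2 choices per step over 20 steps:
paths = 2 ** 20
print(f"total burden across all paths: ${burden * paths:,}")  # $5,242,880,000
```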

What happens when markers aren't linked to outcomes?

As discussed in this thread, I think a major amount of infosec spending is socially driven.  The purchaser is buying to signal success ("We only use the newest next-gen products!"), to signal inclusion in a group ("I'm a good CISO"), or out of herd mentality ("Everyone else buys this, so it must be the best option").  Certainly, as we've shown above, the spending is not related to the outcome.  This raises the question of who sets the trends for the group or steers the herd.  I like Marcus Carey's suggestion: the analysts and Value Added Resellers.  Maybe this is why we see so much marketing money in infosec.

The other major driver is likely other surrogate markers.  Infosec decision makers are starved for concrete truth, and so almost any number is clung to.  The unfortunate fact is that, like risk, most of these numbers have no demonstrable connection to the outcome.  Take a hypothetical threat intelligence solution, for example.  This solution includes all IPv4 addresses as indicators.  It has a 100% success rate in identifying threats.  (It also has a near-100% false positive rate.)  I suspect that, with a few minor tweaks, it would be readily purchased even though it adds no value.
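The arithmetic behind that all-IPv4 feed is worth spelling out. With made-up but plausible traffic volumes, a 100% detection rate coexists with near-zero precision:

```python
# Made-up traffic volumes, for illustration only.
# A "threat feed" containing every IPv4 address flags everything it sees.
connections = 10_000_000   # connections observed in a day
malicious = 50             # of which are actual attacks

flagged = connections              # the feed matches every address
true_positives = malicious
false_positives = flagged - true_positives

print(f"detection rate: {true_positives / malicious:.0%}")  # 100%
print(f"precision: {true_positives / flagged:.4%}")         # 0.0005%
print(f"false positives to triage: {false_positives:,}")
```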

What can we do?

There are three questions we need to ask ourselves when evaluating metrics/measures/surrogate markers/KRIs/however you refer to them.  From here on out, I'll refer to them as 'metrics'.
  1. What is the outcome?
  2. Why is this the right outcome? (Is it actionable?)
  3. How do you know the metric is predicting this outcome?
For any metric you are considering basing your security strategy on (the metric you use to make decisions about projects, purchases, etc.), you should be able to answer these three questions definitively.  (In fact, this blog answered the first question, and part of the second, in the first section above.)  I think there are at least three potential areas for future research that may yield acceptable metrics.

Operational metrics

I believe operational metrics have a lot of potential.  They are easy to collect with a SIEM.  They are actionable.  They can directly predict the outcome above ("The ideal outcome of infosec is minimizing infosec, attack and defense.")  Our response process:

  1. Prevent
  2. Detect
  3. Respond
  4. Recover

should minimize infosec.  With that in mind, we can measure it (see the sketch after the lists below):

  • Absolute count of detections. (Should go down with mitigations.)
  • Time to detect (Should go down with improved detection.) (Technically, absolute count of detections should go up with improved detection as well, but should also be correlated with an improved time to detect)
  • Percent detected in under time T.  (Where time T is set such that, above time T, the attack likely succeeded.)
  • Percent responded to. (Depending on the classification of the incidents, this can tell you both how much time you are wasting responding to false positives and what portion of true attacks you are resolving.)
  • Time to respond. (Goal of responding in under T where T represents time necessary for the attack to succeed.)
  • Successful response. (How many attacks are you preventing from having an impact)
The above metrics are loosely based on those Sandia National Labs uses in physical security assessments.  You can capture additional resource-oriented metrics:
  • Time spent on attacks by type.  (This can help identify where your resources are being spent so you can prioritize projects to improve resource utilization)
  • Recovery resources used. (This can help assess the impact of failure in the Detect and Respond metrics.)
  • Metrics on escalated incidents. (Time spent at tier 1, type, etc.  This may suggest projects to minimize tier 2 use and, therefore, overall resource utilization.)
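As promised above, here is a minimal sketch of computing a few of these metrics over hypothetical incident records. The field names and the threshold T are illustrative assumptions, not a real SIEM schema:

```python
from datetime import timedelta

# Hypothetical incident records; field names are illustrative assumptions.
incidents = [
    {"time_to_detect": timedelta(minutes=12), "time_to_respond": timedelta(minutes=40), "contained": True},
    {"time_to_detect": timedelta(hours=30),   "time_to_respond": timedelta(hours=31),   "contained": False},
    {"time_to_detect": timedelta(minutes=5),  "time_to_respond": None,                  "contained": False},
]

T = timedelta(hours=1)  # assumed threshold beyond which the attack likely succeeded

count = len(incidents)
print(f"absolute count of detections: {count}")
print(f"percent detected in under T: {sum(i['time_to_detect'] <= T for i in incidents) / count:.0%}")
print(f"percent responded to: {sum(i['time_to_respond'] is not None for i in incidents) / count:.0%}")
print(f"successful responses: {sum(i['contained'] for i in incidents) / count:.0%}")
```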

Combine this with data from infosec projects and other measurable infosec resource costs, and the impact of infosec on the organization (both attack and defense) can be measured.

Relative risk

Risk has a lot of good qualities.  One way to get around its pitfalls may be to track risk not in absolute terms (probability of a given impact size, in FAIR's case) but in relative terms.  Unfortunately, this removes the ability to give the business a single "this is your risk" score, except in terms relative to other organizations.  But relative may be enough.  For implementing a security strategy where the goal is to pick the most beneficial course of action, relative risk may suffice to choose one.  The actions can even have defensive costs associated with them as well.  The problem is that the defensive costs and the relative risk are not in the same units, making it hard to know whether purchasing a course of action is a net benefit.
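To illustrate, here is a minimal sketch with made-up scores. Ranking courses of action by relative risk reduction is trivial; the unit mismatch with cost is the part that stays unsolved:

```python
# Made-up, illustrative scores: relative_reduction is a unitless rank,
# cost is in dollars -- the two are deliberately not comparable.
actions = [
    {"name": "MFA rollout",      "relative_reduction": 9, "cost": 50_000},
    {"name": "New NGFW",         "relative_reduction": 3, "cost": 200_000},
    {"name": "Patch automation", "relative_reduction": 7, "cost": 80_000},
]

# Picking the most beneficial action by relative risk is easy...
for a in sorted(actions, key=lambda a: a["relative_reduction"], reverse=True):
    print(f"{a['name']}: reduction rank {a['relative_reduction']}, cost ${a['cost']:,}")
# ...but nothing here can say whether any action is worth its dollar cost.
```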

Attacker cost

Finally, I think attacker cost may be a worthwhile area of research.  However, I don't think it is an area that has been well explored.  As such, a connection between maximizing attacker cost and "minimizing infosec" (from our outcome) has not been demonstrated.  I suspect a qualified economist could easily show that, as attacker costs go up, some attackers are priced out of the market, and those who can still afford to attack choose less expensive means of fulfilling their needs.  However, a qualified economist I am not.  Second, I don't know that we have a validated way to measure attack 'cost'.  It makes intuitive sense that we could estimate these costs.  (We know how attacks are done and, as such, can estimate what it would cost us to accomplish the attack, adjusting for any differences between us and the attackers.)  But before this is accepted, academic research in pricing attacks will be necessary.
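As a sketch of what pricing an attack might look like, here is the intuition in code, with entirely hypothetical steps and dollar figures chosen only for illustration:

```python
# Hypothetical step costs, purely illustrative: price an attack path
# as the sum of what each step would cost us to reproduce.
attack_path = [
    ("phishing infrastructure", 500),
    ("custom malware build",    2_000),
    ("lateral movement labor",  1_500),  # analyst-hours priced in dollars
    ("exfiltration staging",    300),
]

total = sum(cost for _, cost in attack_path)
print(f"estimated attacker cost: ${total:,}")
# A mitigation that raises any step's cost raises the path total,
# which is the quantity defenders would try to maximize.
```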

Conclusion

So, from this blog, I want you to take away two things:
  1. The fact that attackers pick targets means no one really knows whether their mitigations make them secure.
  2. Three easy questions to ask when looking at metrics to guide your organization's security strategy.
With a well-defined outcome, and good metric(s) to support it, you can truly build a data-driven security strategy.  But that's a talk for another day.

Addendum

Why is this the right outcome?  Good question.  It captures multiple things at once.  Breaches can be considered a cost associated with infosec and so are captured.  However, it'd be naive to think that all costs attackers cause are associated with breaches (or even incidents).  The generality of the definition allows it to be inclusive.  It also captures the flip side: the goal of minimizing defenses.  This is easy to miss but critical to organizations.  There is no benefit to infosec if the cost of stopping attacks is worse than the cost of the attacks themselves.  Ideally, stopping attacks would consume zero resources (though that is practically impossible).  The outcome is also vague about the unit to be minimized, allowing flexibility.  (It doesn't say 'minimize cost' or 'minimize time'.)  Ultimately, it's up to the organization to measure this outcome, and how it chooses to do so will determine its success.