How to Interpret Sensitivity Analysis Results

Sensitivity analysis tells you how much your conclusion depends on the assumptions you fed into your model. Interpreting it comes down to one core question: if your inputs were wrong, would your decision change? When the answer is no, your results are robust. When the answer is yes, you’ve found the assumptions that matter most and need the closest scrutiny.

This applies whether you’re building a financial forecast in Excel, evaluating a health intervention, or running any model where uncertain inputs drive a final recommendation. The techniques vary, but the interpretation logic is consistent across fields.

What Sensitivity Analysis Actually Shows You

Every model takes inputs (costs, probabilities, rates of return, effect sizes) and produces an output (a recommendation, a ranking, a bottom-line number). Sensitivity analysis systematically changes those inputs to see how the output responds. The goal isn’t to find the “right” answer. It’s to understand which inputs have the power to change your answer and how far they’d need to shift before that happens.

Think of it as stress-testing. You’re asking: what if my cost estimate is 20% too low? What if the success rate is half of what I assumed? If the recommendation holds up across a wide range of plausible values, you can trust it. If a small tweak to one assumption flips the entire conclusion, that assumption deserves more attention, better data, or at minimum a prominent disclaimer.
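
To make that concrete, here is a minimal sketch of such a stress test in Python. The project model, the cash-flow numbers, and the shock sizes are all hypothetical; the point is the pattern of perturbing one input at a time and checking whether the decision survives.

```python
# Minimal stress test: does the "invest" recommendation survive
# plausible errors in two inputs? All numbers are hypothetical.

def npv(annual_revenue, annual_cost, years=5, discount_rate=0.08):
    """Net present value of a simple fixed-cash-flow project."""
    return sum(
        (annual_revenue - annual_cost) / (1 + discount_rate) ** t
        for t in range(1, years + 1)
    )

base = npv(annual_revenue=120_000, annual_cost=90_000)

# What if the cost estimate is 20% too low?
cost_shock = npv(annual_revenue=120_000, annual_cost=90_000 * 1.2)

# What if revenue is half of what we assumed?
revenue_shock = npv(annual_revenue=60_000, annual_cost=90_000)

for label, value in [("base", base), ("cost +20%", cost_shock),
                     ("revenue -50%", revenue_shock)]:
    decision = "invest" if value > 0 else "don't invest"
    print(f"{label:>12}: NPV = {value:>10,.0f} -> {decision}")
```

In this toy case the recommendation survives the cost shock but flips under the revenue shock, which tells you exactly which assumption needs better data.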

Reading a One-Way Sensitivity Analysis

The simplest form varies one input at a time while holding everything else constant. You pick a parameter, move it across a plausible range, and watch what happens to the output. The interpretation is straightforward: if the output barely moves, that parameter isn’t driving your results. If the output swings dramatically, it is.

For example, in a cost-effectiveness model, you might vary the price of a treatment from its lowest plausible value to its highest. If the model says the treatment is worth funding across that entire range, the price uncertainty doesn’t threaten your conclusion. But if the recommendation flips from “fund it” to “don’t fund it” partway through the range, you’ve identified a critical vulnerability. The specific value where the recommendation changes is called the threshold, and it’s the single most important number in a one-way analysis.
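
A one-way sweep is easy to script. The sketch below assumes a simple net-monetary-benefit model with invented parameter values; the threshold is wherever the net benefit crosses zero.

```python
import numpy as np

# One-way sensitivity analysis: sweep the treatment price across its
# plausible range and find where the funding recommendation flips.
# All parameter values are illustrative.

WTP = 50_000          # willingness to pay per unit of health gained
EFFECT_GAINED = 0.6   # incremental effectiveness vs. comparator
OTHER_COSTS = 4_000   # incremental non-drug costs

def net_monetary_benefit(price):
    """NMB > 0 means 'fund it' at the chosen willingness to pay."""
    return WTP * EFFECT_GAINED - (price + OTHER_COSTS)

prices = np.linspace(5_000, 40_000, 500)   # plausible price range
nmb = np.array([net_monetary_benefit(p) for p in prices])

# The threshold price is where NMB changes sign.
flip = np.where(np.diff(np.sign(nmb)))[0]
if flip.size:
    print(f"Recommendation flips near price = {prices[flip[0]]:,.0f}")
else:
    print("Recommendation holds across the entire price range")
```

With these numbers the flip lands just under a price of 26,000, so the practical question becomes whether prices above that level are plausible.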

One-way analysis has a significant limitation: it assumes inputs are independent. In reality, if one cost goes up, related costs often follow. So while one-way results are easy to communicate, they can understate the true uncertainty when parameters are correlated.

How to Read a Tornado Diagram

Tornado diagrams are the standard visual output of one-way sensitivity analysis, and they’re designed to answer one question at a glance: which inputs matter most?

Each horizontal bar represents one input parameter. The width of the bar shows how much the output changes when that parameter moves across its plausible range. Bars are stacked vertically in order of influence, with the widest bar (the most influential parameter) at the top and the narrowest at the bottom. The resulting shape looks like a tornado, widest at the top and tapering down.

When interpreting a tornado diagram, focus on three things. First, look at the top two or three bars. These are the parameters that drive your results, and they’re where you should invest effort in getting better estimates. Second, check whether any bar crosses a decision threshold (often shown as a vertical reference line). A bar that crosses the threshold means that parameter alone, at a plausible value, could change your conclusion. Third, look at the bottom of the diagram. Parameters with tiny bars are essentially irrelevant to your decision, even if they felt important when you were building the model.
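
Constructing the diagram follows directly from that description: run the model at each parameter's low and high bound with everything else at base case, sort by output swing, and plot horizontal bars. The sketch below uses matplotlib and a toy net-benefit model; all ranges are invented for illustration.

```python
import matplotlib.pyplot as plt

def model(params):
    """Toy net-benefit model: output in arbitrary money units."""
    return (params["wtp"] * params["effect"]
            - params["price"] - params["admin_cost"])

base = {"wtp": 50_000, "effect": 0.6, "price": 20_000, "admin_cost": 4_000}
ranges = {
    "effect": (0.3, 0.9),
    "price": (10_000, 30_000),
    "wtp": (30_000, 70_000),
    "admin_cost": (2_000, 6_000),
}

# Run the model at each parameter's bounds, others held at base case.
bars = []
for name, (lo, hi) in ranges.items():
    out_lo = model({**base, name: lo})
    out_hi = model({**base, name: hi})
    bars.append((name, out_lo, out_hi))

# Narrowest swing at the bottom, widest on top: the tornado shape.
bars.sort(key=lambda b: abs(b[2] - b[1]))

fig, ax = plt.subplots()
for i, (name, out_lo, out_hi) in enumerate(bars):
    ax.barh(i, out_hi - out_lo, left=out_lo)
ax.set_yticks(range(len(bars)))
ax.set_yticklabels([b[0] for b in bars])
ax.axvline(0, color="black", linewidth=1)  # decision threshold: NMB = 0
ax.set_xlabel("Net monetary benefit")
plt.tight_layout()
plt.show()
```

Note the vertical line at zero: any bar that crosses it is a parameter that could, on its own, flip the funding decision.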

One thing tornado diagrams don’t show is probability. A bar might be wide, indicating a large potential impact, but the extreme values at its edges might be very unlikely. Some analysts supplement tornado diagrams with plots of each parameter’s cumulative probability against the output, giving a sense not just of what could happen but of how likely it is.

Interpreting Threshold Analysis

Threshold analysis answers the most practical question in sensitivity analysis: how much would the evidence have to change before your recommendation changes? The result is a specific number, or set of numbers, representing tipping points.

You interpret thresholds by asking whether crossing them is plausible. If your model recommends Treatment A over Treatment B, and threshold analysis shows the recommendation would only flip if the effectiveness of Treatment A dropped by 60%, that’s a robust finding. A 60% error in your effectiveness estimate is hard to imagine. But if the threshold is a 5% change, your conclusion is fragile, because measurement error alone could push you past that tipping point.
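
Tipping points like this can be located numerically rather than read off a sweep. The sketch below uses SciPy's brentq root finder on a hypothetical two-treatment comparison; the numbers are invented and deliberately produce a fragile result.

```python
from scipy.optimize import brentq

# Threshold analysis: how far would Treatment A's effectiveness have
# to fall before Treatment B wins? All numbers are hypothetical.

WTP = 50_000
COST_A, EFFECT_A = 30_000, 1.10   # base-case estimates for A
COST_B, EFFECT_B = 12_000, 0.70   # base-case estimates for B

def nmb_gap(effect_a):
    """NMB(A) - NMB(B): positive means A is preferred."""
    nmb_a = WTP * effect_a - COST_A
    nmb_b = WTP * EFFECT_B - COST_B
    return nmb_a - nmb_b

# Find the effectiveness of A at which the recommendation flips.
tipping_point = brentq(nmb_gap, 0.0, EFFECT_A)
drop = (EFFECT_A - tipping_point) / EFFECT_A
print(f"A's effectiveness must fall to {tipping_point:.3f} "
      f"(a {drop:.0%} drop) before B is preferred")
```

With these inputs the tipping point sits less than 4% below the base-case estimate of 1.10, exactly the fragile situation described above.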

Thresholds are often visualized as “invariant intervals,” shaded ranges around a point estimate. As long as the true value falls within the shaded range, the recommendation holds; values outside the interval would trigger a different decision. When a parameter’s confidence interval crosses a threshold, that’s a signal the data alone can’t reliably support the recommendation.

Probabilistic Sensitivity Analysis

Instead of moving one parameter at a time, probabilistic sensitivity analysis (PSA) varies all parameters simultaneously by drawing random values from their probability distributions. The model runs thousands of times, each with a different combination of inputs, producing a cloud of possible outcomes.

Interpreting PSA results typically involves two visual tools. The first is a scatter plot where each dot represents one simulation run. In health economics, the axes are usually incremental cost (vertical) and incremental effectiveness (horizontal). If most dots cluster in one quadrant, the conclusion is relatively stable. If they scatter across multiple quadrants, there’s genuine uncertainty about whether the intervention is worth the cost.
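
A PSA run and a quadrant summary of its scatter take only a few lines. Gamma distributions are a common choice for costs (positive, right-skewed) and beta distributions for probabilities and bounded effects, but the specific parameters below are illustrative, not estimates from any study.

```python
import numpy as np

# Probabilistic sensitivity analysis: draw every input from its
# distribution and rerun the model many times.

rng = np.random.default_rng(42)
N = 10_000

inc_cost = rng.gamma(shape=16.0, scale=250.0, size=N)   # mean ~4,000
inc_effect = rng.beta(a=3.0, b=7.0, size=N) - 0.20      # mean ~0.10

# Quadrant shares on the cost-effectiveness plane (the scatter-plot view).
q_ne = np.mean((inc_cost > 0) & (inc_effect > 0))
q_nw = np.mean((inc_cost > 0) & (inc_effect < 0))
print(f"more costly, more effective: {q_ne:.0%}")
print(f"more costly, less effective: {q_nw:.0%}")
```

With these draws a meaningful share lands in the less-effective quadrant, the kind of spread that signals genuine uncertainty about value.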

The second tool is a cost-effectiveness acceptability curve, which plots the probability that an intervention is cost-effective at different willingness-to-pay thresholds. Reading it is simple: pick the threshold relevant to your context (for instance, a commonly used benchmark per unit of health gained), then read across to see the probability. A probability of 90% at your threshold means 90% of simulations found the intervention to be good value. A probability of 55% means it’s essentially a coin flip, and the decision could go either way.
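
Computing the curve from PSA output is just counting: at each willingness-to-pay value, the share of draws with positive net benefit. The sketch below reuses the same illustrative distributions as above.

```python
import numpy as np

# Cost-effectiveness acceptability curve: for each willingness-to-pay
# value, the share of PSA draws in which the intervention has positive
# net benefit. Distributions are illustrative.

rng = np.random.default_rng(42)
N = 10_000
inc_cost = rng.gamma(shape=16.0, scale=250.0, size=N)
inc_effect = rng.beta(a=3.0, b=7.0, size=N) - 0.20

for wtp in (20_000, 50_000, 100_000):
    prob_ce = np.mean(wtp * inc_effect - inc_cost > 0)
    print(f"WTP {wtp:>7,}: P(cost-effective) = {prob_ce:.0%}")
```

With these invented inputs, the probability at the middle threshold lands close to a coin flip, the ambiguous situation described above.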

Parameter Uncertainty vs. Structural Uncertainty

Most sensitivity analyses address parameter uncertainty: the model’s structure is taken to be correct, but the exact values plugged into it are uncertain. This is the uncertainty you capture with tornado diagrams and probabilistic methods.

Structural uncertainty is different and harder to handle. It asks whether the model itself is built correctly. Maybe you assumed a linear relationship when the real one is curved. Maybe you left out a variable that matters. Structural uncertainty is typically explored by running the analysis under several alternative model designs and comparing the results. If conclusions hold across different model structures, that’s strong evidence of robustness.

In a Bayesian framework, different model structures can be formally weighted by how well they predict the data, producing averaged results that account for the possibility that any single model is wrong. But in most practical settings, structural uncertainty is handled less formally: analysts present results under two or three alternative assumptions and let the decision-maker judge which is most believable.
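
As a sketch of that informal approach, the toy example below pushes the same inputs through two hypothetical structures, one linear and one with diminishing returns, and compares the resulting decision. Both structures and all numbers are invented.

```python
# Structural uncertainty, handled informally: the same inputs pushed
# through two alternative model structures. Everything is hypothetical.

DOSE = 100.0              # treatment intensity
BENEFIT_PER_UNIT = 12.0
COST = 1_000.0

def linear_response(dose):
    """Structure 1: benefit scales linearly with dose."""
    return BENEFIT_PER_UNIT * dose

def saturating_response(dose, half_max=80.0, max_benefit=1_920.0):
    """Structure 2: diminishing returns, levelling off at max_benefit."""
    return max_benefit * dose / (dose + half_max)

for name, structure in [("linear", linear_response),
                        ("saturating", saturating_response)]:
    net = structure(DOSE) - COST
    print(f"{name:>10} model: net benefit = {net:>6,.0f} "
          f"-> {'adopt' if net > 0 else 'reject'}")
```

Here both structures point to adoption, so the structural choice doesn’t threaten the conclusion; if they disagreed, that disagreement itself would be the finding to report.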

Judging Whether Results Are Robust

Robustness isn’t binary. It’s a spectrum, and interpreting it requires judgment. Here are the practical signals to look for:

  • No single parameter flips the decision. If you can remove or change any one input and the ranking of options stays the same, the result is robust from a one-way perspective (a sketch of this check follows the list). Research on multi-criteria decision models treats results as very robust when deleting any single criterion leaves the overall ranking unchanged.
  • Thresholds are far from plausible values. The wider the gap between your best estimate and the tipping point, the more confidence you can place in the conclusion.
  • Probabilistic results cluster tightly. In PSA, a tight cluster of outcomes means the conclusion is stable even when everything varies at once. Wide scatter means genuine ambiguity.
  • Results hold across model structures. If three different reasonable ways of building the model all point to the same decision, structural uncertainty isn’t a major concern.
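
The sketch below implements the first of those signals for a hypothetical two-option model: push each parameter to both ends of its plausible range, one at a time, and check whether the preferred option ever changes. All values are invented.

```python
# One-at-a-time robustness check: does any single parameter, pushed to
# either bound of its plausible range, change the preferred option?

def preferred_option(params):
    """Return 'A' or 'B' under a toy net-benefit comparison."""
    nmb_a = params["wtp"] * params["effect_a"] - params["cost_a"]
    nmb_b = params["wtp"] * params["effect_b"] - params["cost_b"]
    return "A" if nmb_a > nmb_b else "B"

base = {"wtp": 50_000, "effect_a": 1.1, "cost_a": 30_000,
        "effect_b": 0.7, "cost_b": 12_000}
ranges = {"wtp": (30_000, 70_000), "effect_a": (0.9, 1.3),
          "cost_a": (24_000, 36_000), "effect_b": (0.6, 0.8),
          "cost_b": (9_000, 15_000)}

baseline = preferred_option(base)
flips = [(name, bound)
         for name, (lo, hi) in ranges.items()
         for bound in (lo, hi)
         if preferred_option({**base, name: bound}) != baseline]

print("robust (one-way)" if not flips else f"decision flips at: {flips}")
```

With these ranges the check fails in several places, so the result would not pass the first signal; a genuinely robust model would print the passing message instead.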

Common Mistakes in Interpretation

The most frequent error is treating sensitivity analysis as a formality rather than a decision tool. Running the analysis and reporting it in an appendix misses the point. The value lies in changing how confident you are in your conclusion and identifying where better data would most reduce your uncertainty.

A second mistake is using unrealistically narrow ranges for your inputs. If you only vary a cost estimate by plus or minus 5% when the real uncertainty is plus or minus 50%, your tornado diagram will look reassuringly calm, but it’s meaningless. The ranges you choose should reflect genuine uncertainty, not optimistic guesses.

Third, people often over-interpret one-way results when parameters are correlated. If two costs tend to rise together, varying them independently understates the risk. Probabilistic analysis handles this better, especially when correlations between parameters are explicitly modeled.
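
One common way to capture such correlation is to sample inputs jointly instead of independently. The sketch below draws two related costs as correlated normals on the log scale, which keeps both positive; the means, spreads, and the 0.8 correlation are illustrative.

```python
import numpy as np

# Correlated cost inputs for PSA: sample on the log scale so both
# costs stay positive and tend to rise together. All values are
# illustrative.

rng = np.random.default_rng(7)
mean_log = np.log([10_000, 6_000])          # two related cost inputs
sd_log = np.array([0.25, 0.25])
corr = 0.8
cov = np.outer(sd_log, sd_log) * np.array([[1.0, corr], [corr, 1.0]])

draws = np.exp(rng.multivariate_normal(mean_log, cov, size=10_000))
print(f"sample correlation: {np.corrcoef(draws.T)[0, 1]:.2f}")
```

Draws like these then feed the PSA loop in place of independent samples.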

Finally, in health economics and policy contexts, interpreting cost-effectiveness ratios requires a reference point. An incremental cost-effectiveness ratio on its own is just a number. It only becomes meaningful when compared to a willingness-to-pay threshold, which represents how much the decision-maker is willing to spend per unit of health gained. A ratio below the threshold signals good value. A ratio above it does not. Sensitivity analysis should show whether the ratio stays on the same side of that threshold across plausible input ranges.
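
A worked example of that comparison, with invented numbers:

```python
# Comparing an ICER to a willingness-to-pay threshold, with the
# sensitivity question attached. All numbers are hypothetical.

WTP = 50_000                 # decision-maker's threshold per unit gained

def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio."""
    return delta_cost / delta_effect

base = icer(18_000, 0.60)                       # base-case estimate
pessimistic = icer(18_000 * 1.3, 0.60 * 0.7)    # costs up 30%, effect down 30%

for label, ratio in [("base case", base), ("pessimistic", pessimistic)]:
    verdict = "good value" if ratio < WTP else "not good value"
    print(f"{label:>11}: ICER = {ratio:>7,.0f} -> {verdict}")
```

Here the base-case ratio sits comfortably below the threshold, but a plausible pessimistic shift pushes it across, which is precisely the kind of fragility the sensitivity analysis should surface.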