What Is a Tolerance Interval in Statistics?

A tolerance interval is a range of values calculated from sample data that captures a specified proportion of an entire population with a stated level of confidence. Unlike more familiar statistical intervals, a tolerance interval requires you to set two parameters: the percentage of the population you want to cover (say, 95%) and how confident you want to be that your interval actually achieves that coverage (say, 90%). A common shorthand is to call this a “95/90 tolerance interval,” meaning it contains at least 95% of the population with 90% confidence.

The Two Parameters That Define It

Every tolerance interval is built around two inputs that you choose before calculating anything. The first is the coverage proportion, often labeled P. This is the minimum percentage of the population you want the interval to contain, such as 95% or 99%. The second is the confidence level, often labeled γ or 1 − α (where α is the acceptable probability of being wrong), which reflects how sure you want to be that your interval truly captures that proportion. A higher confidence level makes the interval wider because you’re demanding more assurance.

This dual-parameter structure is what makes tolerance intervals unique. A 95/95 tolerance interval, for example, is designed so that at least 95% of the population falls within its bounds, and you can be 95% confident that’s actually the case. Raise either number and the interval gets wider. This is the tradeoff: more coverage or more confidence costs you precision, especially with smaller samples.
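One way to see what the two numbers mean is a small simulation. The sketch below is illustrative only: it assumes a standard normal population, samples of size 20, and a tolerance factor of k ≈ 2.752 (an assumed input, roughly the two-sided 95/95 factor for n = 20). It counts how often the interval mean ± k × s really covers at least 95% of the population.

```python
import random
from statistics import NormalDist, mean, stdev

def fraction_meeting_coverage(n=20, k=2.752, coverage=0.95, trials=2000, seed=1):
    """Share of simulated intervals that truly contain >= `coverage` of the population."""
    rng = random.Random(seed)
    population = NormalDist(0.0, 1.0)  # assumed "true" population for the demo
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m, s = mean(sample), stdev(sample)
        lower, upper = m - k * s, m + k * s
        # Actual proportion of the population inside this particular interval
        covered = population.cdf(upper) - population.cdf(lower)
        if covered >= coverage:
            hits += 1
    return hits / trials

achieved_confidence = fraction_meeting_coverage()
print(f"Intervals covering at least 95%: {achieved_confidence:.3f}")
```

If the factor is right, the printed fraction should land near 0.95, which is the confidence half of "95/95." Individual intervals still vary: some cover more than 95% of the population, and a minority cover less.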

How It Differs From Confidence and Prediction Intervals

Tolerance intervals are often confused with confidence intervals and prediction intervals because all three are calculated from sample data. But they answer fundamentally different questions.

  • Confidence interval: Estimates where a population parameter (like the true average) lies. It describes the uncertainty around a single number, not the spread of individual values.
  • Prediction interval: Estimates the range where one or a small number of future observations will fall. It’s useful when you care about the next data point, not the whole population.
  • Tolerance interval: Estimates a range that contains a specified proportion of the entire population. It’s the right tool when you need to make a statement about where most values in a population sit.

The practical distinction matters. If you want to know the average tablet weight in a pharmaceutical batch, use a confidence interval. If you want to predict what the next tablet off the line will weigh, use a prediction interval. If you want to state, with a chosen level of confidence, that 99% of all tablets fall within a certain weight range, use a tolerance interval. Prediction intervals work well for forecasting small numbers of future observations (fewer than about 100), while tolerance intervals are designed for characterizing the bulk of a population or a very large number of future values.
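To make the contrast concrete, the following sketch compares the half-widths of the three intervals for the same normal sample. Everything numeric here is an assumption for illustration: a hypothetical batch sample of 20 tablets with a standard deviation of 2.0 mg, the standard table value t ≈ 2.093 (97.5th percentile, 19 degrees of freedom), and k ≈ 2.752 as an approximate two-sided 95/95 tolerance factor for n = 20.

```python
import math

def half_widths(n, s, t_crit, k_tol):
    """Half-widths of three normal-theory intervals built from the same sample."""
    ci = t_crit * s / math.sqrt(n)          # confidence interval for the mean
    pi = t_crit * s * math.sqrt(1 + 1 / n)  # prediction interval for one future value
    ti = k_tol * s                          # tolerance interval (coverage + confidence)
    return ci, pi, ti

# Hypothetical inputs: n = 20, s = 2.0 mg, t and k taken as assumed table values
ci, pi, ti = half_widths(n=20, s=2.0, t_crit=2.093, k_tol=2.752)
print(f"confidence: ±{ci:.2f} mg, prediction: ±{pi:.2f} mg, tolerance: ±{ti:.2f} mg")
```

Under these assumptions the confidence interval is by far the narrowest, because it describes only the mean, and the tolerance interval is the widest, because it must hold most of the population at the stated confidence.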

How a Tolerance Interval Is Calculated

For data that follows a normal (bell-curve) distribution, a two-sided tolerance interval takes the form: sample mean ± k × standard deviation. The entire challenge lies in computing the right value of k, called the tolerance factor. This factor depends on three things: the desired coverage proportion, the confidence level, and your sample size.

With a large sample, k shrinks because your estimates of the mean and standard deviation are more precise. With a small sample, k grows, sometimes dramatically. For instance, for a two-sided 95/95 interval, the tolerance factor is roughly 2.4 with a sample of 43 measurements. Drop the sample to just 6 measurements and the factor balloons to about 4.4, with the exact figure depending on which calculation method you use. The exact formulas involve chi-square and normal distribution values, but statistical software handles this automatically. The key insight is that small samples produce very wide tolerance intervals because there’s so much uncertainty about the true mean and spread of the population.
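As a rough illustration of how the factor depends on sample size, here is a sketch of one common approximation for the two-sided case, usually attributed to Howe. It is not the only method and not necessarily what any given software package uses. The helper names are made up for this example, and the chi-square quantile is computed by hand only to keep the sketch dependency-free; with SciPy installed, scipy.stats.chi2.ppf would normally do that job.

```python
import math
from statistics import NormalDist

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = total = 1.0 / a
    j = 0
    while term > total * 1e-16:
        j += 1
        term *= h / (a + j)
        total += term
    return total * math.exp(a * math.log(h) - h - math.lgamma(a))

def chi2_ppf(q, df):
    """Chi-square quantile by bisection on chi2_cdf (illustrative helper)."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < q:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def howe_k(n, coverage=0.95, confidence=0.95):
    """Approximate two-sided normal tolerance factor (Howe-style formula)."""
    df = n - 1
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    chi2_low = chi2_ppf(1.0 - confidence, df)  # lower-tail chi-square quantile
    return z * math.sqrt(df * (1.0 + 1.0 / n) / chi2_low)

for n in (6, 43, 1000):
    print(n, round(howe_k(n), 2))
```

With the 95/95 defaults this gives roughly 4.42 for n = 6 and 2.42 for n = 43. As n grows, the factor drifts down toward 1.96, the value you would use if the mean and standard deviation were known exactly.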

For sample sizes above about 10, different calculation methods produce nearly identical results. Below 10, the method matters, and approaches based on what’s called the non-central t-distribution tend to be more accurate.
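For one-sided bounds, a widely cited closed-form approximation (given, for example, in the NIST/SEMATECH e-Handbook) needs only normal quantiles. The sketch below implements that approximation; the function name is hypothetical. It is an approximation, not the exact non-central t calculation, so expect small discrepancies for very small samples.

```python
import math
from statistics import NormalDist

def one_sided_k_approx(n, coverage=0.95, confidence=0.95):
    """Approximate one-sided normal tolerance factor from normal quantiles only."""
    z_p = NormalDist().inv_cdf(coverage)
    z_g = NormalDist().inv_cdf(confidence)
    a = 1.0 - z_g ** 2 / (2.0 * (n - 1))
    b = z_p ** 2 - z_g ** 2 / n
    disc = z_p ** 2 - a * b
    if a <= 0 or disc < 0:
        # The approximation breaks down for tiny n combined with high confidence
        raise ValueError("approximation not usable for this n and confidence")
    return (z_p + math.sqrt(disc)) / a

# Exact alternative (assuming SciPy is available), based on the non-central t:
#   scipy.stats.nct.ppf(confidence, df=n - 1, nc=z_p * math.sqrt(n)) / math.sqrt(n)

print(round(one_sided_k_approx(43, coverage=0.90, confidence=0.99), 4))
print(round(one_sided_k_approx(6), 3))
```

For n = 43, P = 0.90, and γ = 0.99 this gives about 1.875. The upper tolerance bound is then the sample mean plus k times the sample standard deviation.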

When Your Data Isn’t Normal

Not all data follows a bell curve. When you can’t assume a specific distribution, nonparametric (distribution-free) tolerance intervals are available. These rely on the ranked values in your sample rather than on means and standard deviations.

For a one-sided nonparametric tolerance interval, the approach is surprisingly simple: the largest value in your sample serves as the upper tolerance limit (or the smallest for a lower limit). The catch is that you need enough data. The required sample size is determined by the formula n = ln(α) / ln(P), rounded up to the next whole number, where α is the acceptable probability of being wrong (one minus the confidence level) and P is the desired coverage. For a 95/95 one-sided tolerance bound, this works out to 59 measurements. For 99/95, you’d need 299. Nonparametric methods are flexible but data-hungry.
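The sample-size formula above is easy to check directly. This minimal sketch (the function name is made up for illustration) rounds the result up to the next whole measurement, since a fractional sample is not possible.

```python
import math

def min_n_one_sided(coverage, confidence):
    """Smallest n for which the sample maximum (or minimum) is a valid
    distribution-free one-sided tolerance bound."""
    alpha = 1.0 - confidence  # acceptable probability of being wrong
    return math.ceil(math.log(alpha) / math.log(coverage))

print(min_n_one_sided(0.95, 0.95))  # 95/95
print(min_n_one_sided(0.99, 0.95))  # 99/95
```

The logic behind the formula: the sample maximum fails as a bound only if all n observations fall below the P-th population quantile, which happens with probability P^n. Requiring P^n ≤ α and solving for n gives the expression used here.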

Two-sided nonparametric tolerance intervals work similarly, using specific ranked values from your sample as the lower and upper bounds. The exact positions of those ranked values are chosen so the interval meets your coverage and confidence requirements.
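For the simplest two-sided case, where the interval runs from the sample minimum to the sample maximum, the confidence has a closed form: 1 − n·P^(n−1) + (n−1)·P^n. The sketch below assumes that particular choice of ranked values (other choices need a more general calculation) and searches for the smallest sample size that reaches the requested confidence. The function name is hypothetical.

```python
def min_n_two_sided(coverage=0.95, confidence=0.95):
    """Smallest n for which (sample min, sample max) is a distribution-free
    two-sided tolerance interval with the requested coverage and confidence."""
    n = 2
    while True:
        # Probability that the range between min and max covers >= `coverage`
        achieved = 1.0 - n * coverage ** (n - 1) + (n - 1) * coverage ** n
        if achieved >= confidence:
            return n
        n += 1

print(min_n_two_sided(0.95, 0.95))
print(min_n_two_sided(0.90, 0.95))
```

Under this assumption a 95/95 two-sided interval needs noticeably more data than the one-sided bound, because both extremes of the sample have to do their job at once.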

One-Sided vs. Two-Sided Intervals

A two-sided tolerance interval has both a lower and upper bound, capturing the middle bulk of the population. This is what you’d use to establish, for example, the normal range for a blood test result or the acceptable weight range for a manufactured product.

A one-sided tolerance interval (or tolerance bound) sets only a single limit. An upper tolerance bound might answer: “What value will 95% of the population fall below?” This is common in environmental monitoring, where you care about an upper contamination threshold but not a lower one, or in engineering, where a component’s strength must exceed a minimum but there’s no upper concern.

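For the same coverage and confidence, a one-sided bound sits closer to the center of the data than the corresponding limit of a two-sided interval (its tolerance factor is smaller), because all of the uncovered proportion is allowed to fall on one side instead of being split between two tails.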

Practical Applications

Manufacturing and Quality Control

The U.S. Food and Drug Administration uses tolerance intervals for pharmaceutical quality assessment. When testing dose content uniformity, delivered dose uniformity, or dissolution rates, a tolerance interval can verify that a specified percentage of all units in a batch meet specifications with a given confidence. Older methods based on simple reference intervals (mean ± some multiple of the standard deviation) couldn’t guarantee a particular confidence level for passing specifications. Tolerance intervals solve that by explicitly building confidence into the calculation.

Clinical Reference Ranges

Medical reference ranges, like “normal” cholesterol or blood glucose levels, are a natural application for tolerance intervals. The goal is to define a range that contains a large proportion (typically 95%) of the healthy population. A tolerance interval is the statistically appropriate tool because it accounts for the fact that the reference range is estimated from a limited sample of healthy individuals. The confidence level tells you how reliable that estimate is.

Method Comparison Studies

When evaluating whether two clinical measurement methods can be used interchangeably, tolerance intervals help assess the spread of differences between the methods. Two methods are considered interchangeable if 95% of their individual differences fall within clinically acceptable limits. A tolerance interval with, say, 80% or 95% confidence provides a principled way to make that judgment. The higher the confidence level you demand, the wider the interval becomes, reflecting a more conservative assessment.

Why Tolerance Intervals Are Underused

Multiple research groups have noted that tolerance intervals are applied far less often than they should be. In many published studies, confidence intervals or prediction intervals are used in situations where a tolerance interval would be more appropriate. Any time the goal is to characterize the range of typical values in a population, rather than estimate a single parameter or predict a handful of future observations, a tolerance interval is the correct choice. The likely reason for underuse is simply familiarity: most introductory statistics courses cover confidence and prediction intervals extensively but barely mention tolerance intervals.