How to Interpret Standard Deviation: What It Tells You

Standard deviation tells you how spread out a set of numbers is from the average. A small standard deviation means most values cluster tightly around the mean, while a large one means they’re scattered more widely. Once you understand what those numbers actually represent, you can use them to spot outliers, read medical test results, and make sense of error bars on graphs.

What Standard Deviation Actually Measures

Think of the mean as the center of your data. Standard deviation measures the typical distance each data point sits from that center. If you surveyed 100 people about their commute time and got a mean of 30 minutes with a standard deviation of 5 minutes, most people’s commutes fall relatively close to 30 minutes. If the standard deviation were 20 minutes, the same average would hide a much wider range of experiences.
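
To make this concrete, here is a minimal sketch using Python’s built-in statistics module, with ten made-up commute times standing in for the survey data:

    from statistics import mean, stdev

    # Ten hypothetical commute times in minutes (made-up survey data)
    commutes = [25, 28, 30, 31, 33, 27, 35, 29, 30, 32]

    m = mean(commutes)   # the center of the data
    s = stdev(commutes)  # typical distance from that center, in minutes

    print(f"mean = {m:.1f} min, SD = {s:.1f} min")  # mean = 30.0 min, SD = 2.9 min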

One practical advantage: standard deviation is always expressed in the same units as the original data. A standard deviation of 5 minutes means minutes, not some abstract statistical unit. This makes it immediately useful for understanding real quantities like test scores, weights, blood pressure readings, or financial returns.

A standard deviation close to zero means nearly every value in the dataset is almost identical. As the number grows, values sit further and further from the mean on average. There is no universal threshold for “high” or “low” because it depends entirely on context. A standard deviation of 2 pounds is tiny when measuring adult body weight but enormous when measuring doses of a medication.

The 68-95-99.7 Rule

When data follow a bell-shaped (normal) distribution, standard deviation becomes especially powerful because of a pattern called the empirical rule:

  • 68% of all values fall within 1 standard deviation of the mean.
  • 95% fall within 2 standard deviations.
  • 99.7% fall within 3 standard deviations.

Here’s what that looks like in practice. Say a class of students has a mean exam score of 75 with a standard deviation of 10. About 68% of students scored between 65 and 85. Roughly 95% scored between 55 and 95. And nearly everyone, 99.7%, scored between 45 and 105 (in practice, capped by the test’s maximum score). If someone scored a 40, you’d immediately know that result is unusually far from the pack.
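
You can verify these percentages with a quick simulation. This sketch, assuming Python’s random.gauss to generate hypothetical exam scores with the mean and SD from the example above, counts how many values land within 1, 2, and 3 standard deviations:

    import random
    from statistics import mean, stdev

    random.seed(0)
    # Simulate 100,000 exam scores from a normal distribution
    # with mean 75 and SD 10, matching the example above
    scores = [random.gauss(75, 10) for _ in range(100_000)]
    m, s = mean(scores), stdev(scores)

    for k in (1, 2, 3):
        share = sum(abs(x - m) <= k * s for x in scores) / len(scores)
        print(f"within {k} SD: {share:.1%}")  # ~68%, ~95%, ~99.7%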

This rule only applies cleanly to data that are roughly symmetrical and bell-shaped. Skewed data, like household income or insurance claims, don’t follow these neat percentages. But for many biological measurements, test scores, and manufacturing tolerances, the rule is a reliable shortcut.

Turning a Data Point Into a Z-Score

A z-score translates any individual value into “how many standard deviations away from the mean is this?” The formula is simple: z = (value − mean) / SD. Subtract the mean from the data point, then divide by the standard deviation.

If the average height in a group is 68 inches with a standard deviation of 4 inches, and you’re 72 inches tall, your z-score is (72 − 68) / 4 = 1.0. You’re exactly one standard deviation above the mean. A z-score of −2 would mean someone is two standard deviations below the mean, placing them in roughly the bottom 2.5% of the distribution.

Z-scores are especially useful when comparing values from completely different scales. Scoring 1.5 standard deviations above the mean on a math test and 0.8 standard deviations above the mean on a reading test gives you a direct comparison, even though the tests had different point totals and different averages.
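
The calculation is one line of code. This sketch reproduces the height example and the cross-scale comparison; the math and reading test statistics (means of 70 and 74, SDs of 12 and 10) are made-up values chosen to yield the 1.5 and 0.8 figures mentioned above:

    def z_score(value, mean, sd):
        """How many standard deviations `value` sits from the mean."""
        return (value - mean) / sd

    # The height example: 72 inches in a group averaging 68 with SD 4
    print(z_score(72, 68, 4))  # 1.0

    # Comparing scores across two different scales (hypothetical test statistics)
    math_z = z_score(88, 70, 12)     # math test: mean 70, SD 12
    reading_z = z_score(82, 74, 10)  # reading test: mean 74, SD 10
    print(f"math z = {math_z:.1f}, reading z = {reading_z:.1f}")  # 1.5 and 0.8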

How Medical Reference Ranges Work

When you get blood work back and see a “normal range,” you’re looking at standard deviation in action. Historically, labs established reference intervals by testing a large group of healthy people, calculating the mean and standard deviation, and then defining “normal” as the central 95% of results. That central 95% corresponds to roughly 2 standard deviations above and below the mean.

This means about 5% of perfectly healthy people will have a result that falls outside the reference range, simply because that’s how the boundaries were drawn. A value just outside the range isn’t necessarily a sign of disease. It’s one reason doctors look at patterns across multiple tests rather than reacting to a single borderline number.
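
Here is a rough sketch of how such a range could be derived, using a handful of made-up measurements of one analyte from healthy volunteers. A real lab would use a much larger reference group and more careful statistical methods, but the core calculation is just mean ± 1.96 SD:

    from statistics import mean, stdev

    # Made-up results for one analyte, measured in a group of healthy adults
    healthy = [4.6, 5.1, 4.8, 5.3, 4.9, 5.0, 4.7, 5.2, 5.0, 4.9]
    m, s = mean(healthy), stdev(healthy)

    # Central 95% of a normal distribution: mean ± 1.96 SD
    low, high = m - 1.96 * s, m + 1.96 * s
    print(f"reference range: {low:.2f} to {high:.2f}")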

Spotting Outliers With the 3-Sigma Rule

The most common rule of thumb for identifying outliers is the “3-sigma rule”: any data point more than 3 standard deviations from the mean is considered unusual enough to flag. In a normal distribution, only about 0.3% of data points should fall that far out. When one does, it often signals a measurement error, a data entry mistake, or a genuinely rare observation worth investigating.

Suppose a factory produces bolts with a mean length of 50 mm and a standard deviation of 0.5 mm. A bolt measuring 52 mm sits 4 standard deviations above the mean. That’s well beyond the 3-sigma threshold and would trigger a quality check. The same logic applies in finance, with one caveat: daily stock returns have fatter tails than a normal distribution, so moves of 3 or more standard deviations, sometimes loosely labeled “black swan” events, occur more often than the 0.3% figure would predict.
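
One subtlety: if you estimate the mean and SD from a small batch that already contains the outlier, the outlier inflates the SD and can mask itself. Quality-control practice sidesteps this by comparing new measurements against long-run process parameters, which is what this sketch (with hypothetical bolt data) does:

    # Long-run process parameters from historical data, per the example above
    PROCESS_MEAN = 50.0  # mm
    PROCESS_SD = 0.5     # mm

    def flag_outliers(values, center, sd, threshold=3.0):
        """Return values more than `threshold` SDs from the known process mean."""
        return [x for x in values if abs(x - center) > threshold * sd]

    batch = [49.8, 50.1, 50.0, 49.9, 52.0, 50.1]  # hypothetical batch
    print(flag_outliers(batch, PROCESS_MEAN, PROCESS_SD))  # [52.0], 4 SDs out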

Standard Deviation vs. Standard Error

These two terms sound similar but answer different questions. Standard deviation describes how spread out individual data points are. Standard error of the mean describes how confident you can be in the average itself.

Imagine you measure the blood pressure of 25 patients and calculate a mean of 120 with a standard deviation of 15. The SD of 15 tells you individual patients vary quite a bit. The standard error (which shrinks as your sample gets larger) tells you how close your sample’s average of 120 is likely to be to the true average for the entire population. If you measured 250 patients instead of 25, the standard deviation would stay roughly the same because individual variation doesn’t change, but the standard error would shrink because you’d have a much more precise estimate of the true mean.
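
The relationship is simple: SE = SD / √n. This simulation sketch, assuming a true mean of 120 and true SD of 15 to match the example, shows the SD holding steady while the SE shrinks with sample size:

    import math
    import random
    from statistics import stdev

    random.seed(1)

    def summarize(n):
        # Simulated readings: true mean 120, true SD 15, as in the example
        sample = [random.gauss(120, 15) for _ in range(n)]
        sd = stdev(sample)
        se = sd / math.sqrt(n)  # standard error of the mean
        print(f"n = {n:3d}: SD = {sd:4.1f}, SE = {se:3.1f}")

    summarize(25)   # SD near 15, SE near 3
    summarize(250)  # SD still near 15, SE shrinks toward 1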

This distinction matters when you’re reading research. If a study reports error bars on a graph, check whether they represent SD or SE. SD bars show you the spread of actual data and will always be wider. SE bars and 95% confidence intervals are inferential tools, meaning they help you judge whether differences between groups are statistically meaningful. For samples of 10 or more, doubling the standard error gives you an approximate 95% confidence interval.

Reading Error Bars on Graphs

Error bars are the thin lines extending above and below data points on charts, and they can represent several different things: standard deviation, standard error, confidence intervals, or even raw range. Without checking the figure legend, you can’t know which one you’re looking at.

SD error bars are descriptive. For roughly bell-shaped data, they show you where about two-thirds of the values fall (within ±1 SD of the mean) and give a sense of how variable the measurements are. If you see wide SD bars, the individual data points are spread out. If they’re narrow, the data are consistent.

Confidence interval and SE bars are inferential. They help you judge whether two groups are truly different or whether their overlap could be due to chance. In experimental research, CI and SE bars are generally more informative because the goal is usually to compare groups rather than just describe variation. When reading a paper or news article that includes a graph, look for the label. If it says “±SD,” you’re seeing data spread. If it says “±SE” or “95% CI,” you’re seeing a measure of precision around the estimated mean.
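
To see the difference side by side, here is a sketch that assumes matplotlib is installed and uses two hypothetical groups of 30 measurements each. It draws the same two means twice, once with ±SD bars and once with ±SE bars; the SD bars come out much wider:

    import math
    import random
    import matplotlib.pyplot as plt
    from statistics import mean, stdev

    random.seed(2)
    # Two hypothetical groups of 30 measurements with the same underlying spread
    groups = {"Control": [random.gauss(10, 2) for _ in range(30)],
              "Treated": [random.gauss(12, 2) for _ in range(30)]}

    labels = list(groups)
    means = [mean(v) for v in groups.values()]
    sds = [stdev(v) for v in groups.values()]
    ses = [sd / math.sqrt(len(v)) for sd, v in zip(sds, groups.values())]

    fig, (ax_sd, ax_se) = plt.subplots(1, 2, sharey=True)
    ax_sd.bar(labels, means, yerr=sds, capsize=5)
    ax_sd.set_title("±SD: spread of the data")
    ax_se.bar(labels, means, yerr=ses, capsize=5)
    ax_se.set_title("±SE: precision of the mean")
    plt.show()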

Practical Tips for Interpreting Any Standard Deviation

Context always matters more than the raw number. A standard deviation of 10 could be enormous or trivial depending on what’s being measured and what the mean is. One useful shortcut is to compare the SD to the mean as a ratio. This is called the coefficient of variation. If the mean is 200 and the SD is 10 (a 5% ratio), the data are fairly consistent. If the mean is 20 and the SD is 10 (a 50% ratio), there’s enormous relative spread.
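
The coefficient of variation is just the SD divided by the mean. A minimal sketch, with two made-up datasets chosen to show a small and a large relative spread:

    from statistics import mean, stdev

    def coefficient_of_variation(data):
        """SD as a fraction of the mean (meaningful for positive data)."""
        return stdev(data) / mean(data)

    consistent = [195, 198, 200, 202, 205]  # hypothetical: mean 200, tight spread
    variable = [10, 15, 20, 25, 30]         # hypothetical: mean 20, wide spread
    print(f"{coefficient_of_variation(consistent):.0%}")  # about 2%
    print(f"{coefficient_of_variation(variable):.0%}")    # about 40%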

When comparing two datasets, the one with the larger standard deviation has more variability, even if the means are identical. A mutual fund with an average annual return of 8% and an SD of 3% is far more predictable than one with the same 8% average but an SD of 12%. The second fund had years of great gains and painful losses that averaged out to the same number.

Finally, remember that standard deviation assumes your data are reasonably symmetrical. For heavily skewed distributions, the median and interquartile range often give a better picture than the mean and SD. If you see a dataset where the standard deviation is larger than the mean itself, that’s a strong hint the data may be skewed or contain extreme values worth examining more closely.
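
As a final illustration, this sketch generates heavily right-skewed data, hypothetical insurance claims drawn from an exponential distribution, and compares mean/SD against median/IQR. Notice that the SD comes out comparable to the mean, exactly the warning sign described above:

    import random
    from statistics import mean, median, quantiles, stdev

    random.seed(3)
    # Hypothetical insurance claims: heavily right-skewed (exponential, mean ~5000)
    claims = [random.expovariate(1 / 5000) for _ in range(10_000)]

    q1, _, q3 = quantiles(claims, n=4)  # quartiles
    print(f"mean = {mean(claims):,.0f}, SD = {stdev(claims):,.0f}")  # SD ~ mean
    print(f"median = {median(claims):,.0f}, IQR = {q3 - q1:,.0f}")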