To calculate standard deviation, find the average of your data, work out how far each value falls from that average, square those differences, average the squared differences, and take the square root. The result tells you how spread out your data is, expressed in the same units as your original measurements. A small standard deviation means values cluster tightly around the average; a large one means they’re scattered widely.
The Five Steps, Start to Finish
Let’s walk through the calculation with a small dataset: 4, 8, 6, 5, 7. These five numbers will make each step concrete.
Step 1: Find the mean. Add all values and divide by the count. For our data: (4 + 8 + 6 + 5 + 7) ÷ 5 = 30 ÷ 5 = 6. The mean is 6.
Step 2: Subtract the mean from each value. This gives you the “deviation” of each data point. Our deviations are: 4 − 6 = −2, 8 − 6 = 2, 6 − 6 = 0, 5 − 6 = −1, 7 − 6 = 1. Notice these deviations always add up to zero, which is why you can’t simply average them to measure spread.
Step 3: Square each deviation. Squaring eliminates the negative signs and gives extra weight to values far from the mean. Our squared deviations: 4, 4, 0, 1, 1.
Step 4: Average the squared deviations. This average is called the variance. Add the squared deviations (4 + 4 + 0 + 1 + 1 = 10) and divide. If these five numbers are your entire dataset, divide by 5 to get a variance of 2. If they’re a sample drawn from a larger group, divide by 4 (one fewer than the count) to get a variance of 2.5. More on that distinction below.
Step 5: Take the square root. The square root of the variance is the standard deviation. For the population version: √2 ≈ 1.41. For the sample version: √2.5 ≈ 1.58.
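The five steps translate almost line for line into code. The following is a minimal Python sketch using the same dataset; the variable names are illustrative choices, not a standard API.

```python
import math

data = [4, 8, 6, 5, 7]

# Step 1: find the mean.
mean = sum(data) / len(data)

# Step 2: subtract the mean from each value.
deviations = [x - mean for x in data]

# Step 3: square each deviation.
squared = [d ** 2 for d in deviations]

# Step 4: average the squared deviations (the variance).
sum_sq = sum(squared)
population_variance = sum_sq / len(data)    # divide by N
sample_variance = sum_sq / (len(data) - 1)  # divide by N - 1

# Step 5: take the square root.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(mean)                     # 6.0
print(sum(deviations))          # 0.0
print(round(population_sd, 2))  # 1.41
print(round(sample_sd, 2))      # 1.58
```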
Population vs. Sample: When to Use Each Formula
The only difference between the two formulas is what you divide by in Step 4. Population standard deviation divides by N (the total number of data points). Sample standard deviation divides by N − 1.
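Written out as formulas, the two versions look like this. The symbols are a common convention and an assumption here: μ for the population mean, x̄ for the sample mean, σ for the population standard deviation, and s for the sample standard deviation.

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2}
\qquad
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
```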
Use the population formula when your dataset includes every member of the group you care about. If you measured the heights of all 30 students in a class and you only care about that class, you have the full population. Divide by 30.
Use the sample formula when your data is a subset drawn from a bigger group. If you measured 30 students to estimate the spread of heights across the entire school, that’s a sample. Divide by 29. The reason: when you use the sample’s own average to calculate deviations, those deviations are slightly constrained. There are only N − 1 truly independent deviations because they always sum to zero. Dividing by N would systematically underestimate the real spread in the larger population. Dividing by N − 1 corrects for that bias. This adjustment is called Bessel’s correction, and it matters most with small samples. As your sample grows into the hundreds or thousands, the difference between dividing by N and N − 1 becomes negligible.
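One way to see the bias directly is to treat a small dataset as the whole population, list every possible sample, and average the variance estimates. The Python sketch below does that for the numbers 4, 8, 6, 5, 7, assuming samples of size 2 drawn with replacement; the sample size and the helper name `variance` are illustrative choices.

```python
from itertools import product

population = [4, 8, 6, 5, 7]
n = 2  # sample size, chosen only for illustration


def variance(values, divisor):
    """Sum of squared deviations from the values' own mean, over a chosen divisor."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / divisor


true_variance = variance(population, len(population))

# Every ordered sample of size n, drawn with replacement (5 ** 2 = 25 samples).
samples = list(product(population, repeat=n))

avg_div_n = sum(variance(s, n) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

print(true_variance)      # 2.0
print(avg_div_n)          # 1.0
print(avg_div_n_minus_1)  # 2.0
```

Averaged over all possible samples, dividing by N lands on half the true variance in this setup, while dividing by N − 1 recovers it.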
Why Square and Then Square Root?
This is the part that trips people up. Why not just average the raw deviations? Because positive and negative deviations cancel each other out, always giving you zero. Squaring solves that problem because a squared deviation can never be negative, so nothing cancels. The intermediate result, variance, is useful in its own right for many statistical calculations, but it’s expressed in squared units. If your data is in centimeters, the variance is in square centimeters, which is hard to interpret. Taking the square root at the end brings you back to the original units, giving you a number you can directly compare to your mean and your data.
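The cancellation is not a coincidence of any particular dataset. A short derivation shows why, using x̄ for the mean of N values (the notation is an assumption here):

```latex
\sum_{i=1}^{N}\left(x_i - \bar{x}\right)
  = \sum_{i=1}^{N} x_i - N\bar{x}
  = N\bar{x} - N\bar{x}
  = 0,
\qquad \text{since } \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i .
```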
What Your Result Actually Tells You
Standard deviation is most intuitive when your data follows a roughly bell-shaped (normal) distribution. In that case, a pattern called the empirical rule applies:
- About 68% of values fall within one standard deviation of the mean
- About 95% fall within two standard deviations
- About 99.7% fall within three standard deviations
So if the average test score in a class is 75 with a standard deviation of 10, roughly two-thirds of students scored between 65 and 85, about 95% scored between 55 and 95, and nearly all scored between 45 and 105. A standard deviation close to zero means almost everyone scored near 75. A standard deviation of 20 would mean scores were all over the map.
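The percentages in the empirical rule can be checked against an idealized normal curve. This sketch uses `NormalDist` from Python’s standard `statistics` module and assumes the scores are exactly normal with mean 75 and standard deviation 10, which real test scores only approximate.

```python
from statistics import NormalDist

# Idealized assumption: scores are exactly normal with mean 75 and SD 10.
mean, sd = 75, 10
scores = NormalDist(mu=mean, sigma=sd)

shares = {}
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    # Share of the curve lying between low and high.
    shares[k] = scores.cdf(high) - scores.cdf(low)
    print(f"within {k} SD ({low} to {high}): {shares[k]:.1%}")

# within 1 SD (65 to 85): 68.3%
# within 2 SD (55 to 95): 95.4%
# within 3 SD (45 to 105): 99.7%
```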
Whether a given standard deviation counts as “large” or “small” depends entirely on context. A standard deviation of 5 is tiny if your mean is 500 but enormous if your mean is 6. Comparing the standard deviation to the mean gives you a sense of relative spread. That ratio (standard deviation divided by the mean) is sometimes called the coefficient of variation, and it lets you compare variability across datasets measured on different scales.
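As a quick illustration of that ratio, here is a small Python sketch applied to the two cases above. The function name `coefficient_of_variation` is a hypothetical helper, not a library call.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


cv_large_mean = coefficient_of_variation(5, 500)
cv_small_mean = coefficient_of_variation(5, 6)

print(cv_large_mean)            # 0.01
print(round(cv_small_mean, 2))  # 0.83
```

The same standard deviation of 5 is 1% of the mean in the first case and about 83% in the second.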
Calculating on a Scientific Calculator
You don’t have to do this by hand every time. Most scientific calculators have a statistics mode that handles the computation for you. On Casio’s ClassWiz models, for example, you enter Statistics mode, select 1-Variable, type each data value into its own row, and press the execute key. The results screen displays both the population and sample standard deviations so you can choose the one you need.
Graphing calculators follow a similar flow: enter your data into a list, run a summary statistics calculation, and read the output. The population standard deviation is typically labeled σ (sigma), while the sample standard deviation is labeled s or sx. If you’re working in a spreadsheet, most programs offer two functions: one for population (like STDEV.P in Excel or Google Sheets) and one for sample (STDEV.S). The plain STDEV function in both programs computes the sample version, which suits most real-world data because it is usually a sample rather than a complete population.
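Programming languages usually offer the same pair. In Python, for instance, the standard `statistics` module provides `pstdev` (divides by N) and `stdev` (divides by N − 1). A short sketch with the five-number dataset from the walkthrough:

```python
import statistics

data = [4, 8, 6, 5, 7]

population_sd = statistics.pstdev(data)  # divides by N
sample_sd = statistics.stdev(data)       # divides by N - 1

print(round(population_sd, 2))  # 1.41
print(round(sample_sd, 2))      # 1.58
```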
Standard Deviation vs. Standard Error
These two are easy to confuse, but they answer different questions. Standard deviation describes how spread out individual data points are in your sample. If you measured the blood pressure of 100 people, the standard deviation tells you how much those 100 readings varied from person to person.
Standard error, on the other hand, tells you how precisely your sample average estimates the true average of the whole population. It equals the standard deviation divided by the square root of your sample size, so it shrinks as you collect more data. You’d report standard deviation when describing the variability within your group, and standard error when making a claim about how close your sample mean is to the population mean. In published research, you’ll see both: standard deviation to characterize the sample (how much patient ages or tumor sizes vary around their averages) and standard error or confidence intervals to indicate how reliable those averages are as estimates for the broader population.
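The relationship between the two is a one-line calculation. In this Python sketch, the helper `standard_error` is a hypothetical name, and the standard deviation of 10 is a made-up figure used only to show how the result scales with sample size.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: standard deviation over the square root of n."""
    return sd / math.sqrt(n)


# Hypothetical figure: a sample standard deviation of 10 units.
se_100 = standard_error(10, 100)
se_400 = standard_error(10, 400)

print(se_100)  # 1.0
print(se_400)  # 0.5
```

Quadrupling the sample size halves the standard error, while the standard deviation itself describes the spread of individual readings and does not shrink just because you collect more of them.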
A Quick Reference Example
Suppose you’re tracking how many minutes you exercise each day for a week: 30, 45, 25, 50, 35, 40, 35. Here’s the full calculation using the population formula, since this is your complete week.
- Mean: (30 + 45 + 25 + 50 + 35 + 40 + 35) ÷ 7 = 260 ÷ 7 ≈ 37.14 minutes
- Deviations from the mean: −7.14, 7.86, −12.14, 12.86, −2.14, 2.86, −2.14
- Squared deviations: 51.0, 61.7, 147.4, 165.3, 4.6, 8.2, 4.6
- Sum of squared deviations: ≈ 442.86 (adding the unrounded values)
- Variance: 442.86 ÷ 7 ≈ 63.27
- Standard deviation: √63.27 ≈ 7.95 minutes
Your average workout lasted about 37 minutes, and the typical day deviated from that average by roughly 8 minutes in either direction. That’s a moderate amount of variability. If you wanted every day to be more consistent, you’d aim to bring that standard deviation down.
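To check the arithmetic, the same week of data can be run through a few lines of Python. This sketch computes the population standard deviation by hand and compares it with `statistics.pstdev`; because it works with unrounded values, the final digits may differ slightly from a hand calculation that rounds at each step.

```python
import statistics

minutes = [30, 45, 25, 50, 35, 40, 35]

mean = sum(minutes) / len(minutes)
sum_sq = sum((x - mean) ** 2 for x in minutes)  # sum of squared deviations
variance = sum_sq / len(minutes)                # population formula: divide by N
sd = variance ** 0.5

print(round(mean, 2))      # 37.14
print(round(sum_sq, 2))    # 442.86
print(round(variance, 2))  # 63.27
print(round(sd, 2))        # 7.95

# Cross-check against the standard library's population formula.
library_sd = statistics.pstdev(minutes)
print(round(library_sd, 2))  # 7.95
```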

