Population variance measures how spread out the values in an entire dataset are from the average. You calculate it by finding the mean of your data, measuring how far each value sits from that mean, squaring those distances, and then averaging the squared distances. The result is represented by the symbol σ² (sigma squared), and the formula looks like this: σ² = Σ(xᵢ − μ)² / N, where μ is the population mean and N is the total number of values.
The Formula and What Each Part Means
The population variance formula has four moving parts. Understanding each one makes the calculation straightforward rather than intimidating.
- xᵢ: Any individual value in your dataset.
- μ (mu): The population mean, or the average of every value.
- xᵢ − μ: The deviation, which is how far a single value sits from the mean. Some deviations will be negative (below average) and some positive (above average).
- N: The total number of values in the population.
You square each deviation before averaging because positive and negative deviations would cancel each other out if you simply added them. Squaring ensures every distance counts, regardless of direction. Then dividing by N gives you the average squared deviation across the entire population.
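As a rough illustration, the formula can be written as a small Python function. The name population_variance and the sample list below are made up for this sketch and are not part of any library.

```python
def population_variance(values):
    """Mean of the squared deviations from the population mean."""
    n = len(values)
    if n == 0:
        raise ValueError("population variance needs at least one value")
    mu = sum(values) / n                                   # population mean
    squared_deviations = [(x - mu) ** 2 for x in values]   # (x_i - mu)^2
    return sum(squared_deviations) / n                     # divide by N, not N - 1


# Hypothetical data: the mean is 5 and the squared deviations sum to 32.
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(example))  # 4.0
```

The function divides by the full count N, so it should only be used when the list holds the whole population.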
Step-by-Step Calculation
Suppose you have five exam scores that represent an entire class: 70, 80, 85, 90, 95. Here’s how to find the population variance.
Step 1: Find the Mean
Add all values and divide by the number of values. (70 + 80 + 85 + 90 + 95) / 5 = 420 / 5 = 84. So μ = 84.
Step 2: Subtract the Mean From Each Value
This gives you the deviation for each data point: 70 − 84 = −14, 80 − 84 = −4, 85 − 84 = 1, 90 − 84 = 6, 95 − 84 = 11.
Step 3: Square Each Deviation
(−14)² = 196, (−4)² = 16, (1)² = 1, (6)² = 36, (11)² = 121.
Step 4: Add the Squared Deviations
196 + 16 + 1 + 36 + 121 = 370.
Step 5: Divide by N
370 / 5 = 74. The population variance is 74.
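If you want to check the arithmetic, the five steps above can be mirrored in a few lines of Python. This is only a sketch using the same exam scores, and the variable names are arbitrary.

```python
scores = [70, 80, 85, 90, 95]
n = len(scores)

mean = sum(scores) / n                        # Step 1: 420 / 5
deviations = [x - mean for x in scores]       # Step 2: subtract the mean
squared = [d ** 2 for d in deviations]        # Step 3: square each deviation
total = sum(squared)                          # Step 4: add them up
variance = total / n                          # Step 5: divide by N

print(mean)        # 84.0
print(deviations)  # [-14.0, -4.0, 1.0, 6.0, 11.0]
print(squared)     # [196.0, 16.0, 1.0, 36.0, 121.0]
print(total)       # 370.0
print(variance)    # 74.0
```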
A Shortcut Formula for Larger Datasets
When you’re working by hand with many values, tracking every single deviation gets tedious. There’s an equivalent formula that skips the per-value subtraction: σ² = (Σx²) / N − μ². In plain terms, you square each original value, average those squares, and then subtract the square of the mean. This produces the exact same result but with less bookkeeping, because the only subtraction happens once, at the end.
Using the same five scores: square each value (4900, 6400, 7225, 8100, 9025), sum them (35650), divide by 5 (7130), and subtract the mean squared (84² = 7056). That gives 7130 − 7056 = 74, matching the answer from the longer method.
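A short Python sketch can confirm that the shortcut and the definition agree on these scores. The variable names are illustrative only.

```python
scores = [70, 80, 85, 90, 95]
n = len(scores)
mean = sum(scores) / n

# Shortcut: average of the squares minus the square of the average.
mean_of_squares = sum(x * x for x in scores) / n       # 35650 / 5
shortcut_variance = mean_of_squares - mean ** 2        # 7130 - 7056

# Definition: average of the squared deviations.
definition_variance = sum((x - mean) ** 2 for x in scores) / n

print(mean_of_squares)      # 7130.0
print(shortcut_variance)    # 74.0
print(definition_variance)  # 74.0
```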
Population Variance vs. Sample Variance
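The critical distinction is in the denominator. Population variance divides by N, the total count. Sample variance divides by N − 1, where N is the number of values in the sample. This difference exists because a sample’s deviations are measured from the sample mean, which always sits at least as close to the sampled values as the true population mean does. Dividing those squared deviations by N therefore tends to underestimate the true spread, and dividing by N − 1 corrects for that bias.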
If your dataset includes every single member of the group you care about (every student in a class, every product in a warehouse, every game in a season), use N. If your dataset is a subset pulled from a bigger group to estimate that group’s characteristics, use N − 1. Getting this wrong is one of the most common mistakes in introductory statistics. The two results always differ by a factor of (N − 1) / N, so the error is largest when the dataset is small.
The notation reflects this split. Population variance uses σ², and the population mean uses μ (both Greek letters). Sample variance uses s², and the sample mean uses x̄ (x-bar). Greek letters signal population parameters; Latin letters signal sample statistics. If you see σ² in a problem, the question is asking for population variance and you should divide by N.
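To see the effect of the two denominators, here is a small sketch using Python’s standard statistics module, which provides pvariance() for the N denominator and variance() for the N − 1 denominator. Treating the five exam scores as a sample in the second call is purely for comparison.

```python
import statistics

scores = [70, 80, 85, 90, 95]
n = len(scores)

population_var = statistics.pvariance(scores)   # divides by N
sample_var = statistics.variance(scores)        # divides by N - 1

print(population_var)  # 370 / 5
print(sample_var)      # 370 / 4

# The two results differ by exactly the factor (N - 1) / N.
print(sample_var * (n - 1) / n)
```

With only five values the gap is large (74 versus 92.5). It shrinks as N grows, because (N − 1) / N approaches 1.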
Calculating Population Variance in Excel, Python, and R
Excel has a dedicated function for population variance: VAR.P. If your data sits in cells A1 through A20, you’d type =VAR.P(A1:A20) and get the population variance instantly. The companion function VAR.S (or simply VAR) calculates sample variance with the N − 1 denominator, so make sure you pick the right one. Google Sheets uses the same function names.
In Python, the NumPy library’s var() function defaults to population variance, because its ddof parameter defaults to 0; passing ddof=1 gives sample variance instead. In R, the built-in var() function always computes sample variance, so to get population variance you multiply its result by (N − 1) / N.
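As a quick sketch of the NumPy behavior, assuming NumPy is installed, the ddof argument is the value subtracted from N in the denominator. The last lines apply the (N − 1) / N conversion to a sample variance, which is the same adjustment you would make to the output of R’s var().

```python
import numpy as np

scores = np.array([70, 80, 85, 90, 95])
n = scores.size

pop_var = np.var(scores)             # ddof=0 by default: divide by N
samp_var = np.var(scores, ddof=1)    # ddof=1: divide by N - 1

# Convert a sample variance back to a population variance.
converted = samp_var * (n - 1) / n

print(pop_var, samp_var, converted)
```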
Why Variance Uses Squared Units
Because you square each deviation, the result is expressed in squared units. If your original data is in inches, the variance is in square inches. If it’s in dollars, the variance is in dollars squared. This makes variance hard to interpret on its own, since “square dollars” doesn’t mean much in everyday terms.
That’s why standard deviation exists. It’s simply the square root of the variance, which brings the units back to the original scale. For the exam scores above, the population standard deviation would be √74 ≈ 8.6 points, a number you can directly compare to the scores themselves. Variance is more useful in mathematical proofs and advanced calculations, while standard deviation is more useful for describing data in practical terms. Both carry the same information, just in different forms.
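For a quick check of that square root, the sketch below computes the population standard deviation two ways: directly from the variance of 74, and with pstdev() from Python’s standard statistics module.

```python
import math
import statistics

scores = [70, 80, 85, 90, 95]

sigma_from_variance = math.sqrt(74)            # square root of the variance
sigma_from_data = statistics.pstdev(scores)    # population standard deviation

print(round(sigma_from_variance, 1))  # 8.6
print(round(sigma_from_data, 1))      # 8.6
```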
Common Mistakes to Avoid
Forgetting to square the deviations is the most frequent error. Without squaring, the positive and negative deviations sum to zero every time, giving you a meaningless result. The second most common mistake is dividing by N − 1 when the problem calls for population variance (or vice versa). Read the problem carefully: if you’re told the data represents an entire population, divide by N.
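The cancellation is easy to see in code. This minimal sketch sums the raw deviations for the exam scores, then sums the squared ones for contrast.

```python
scores = [70, 80, 85, 90, 95]
mean = sum(scores) / len(scores)

deviations = [x - mean for x in scores]

# Without squaring, negatives and positives cancel: -14 - 4 + 1 + 6 + 11.
raw_sum = sum(deviations)

# With squaring, every distance contributes a non-negative amount.
squared_sum = sum(d ** 2 for d in deviations)

print(raw_sum)      # 0.0
print(squared_sum)  # 370.0
```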
Another subtle trap shows up when using the shortcut formula. The shortcut subtracts two large numbers that are close together, so even a small rounding error in the mean is magnified once the mean is squared, and your final answer drifts from the true value. Keep as many decimal places as possible through each step, and only round at the very end.
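The sketch below shows how large the drift can be. The three values are hypothetical, chosen so the mean (102.333...) is not a round number, and rounding to one decimal place is an assumption made for the demonstration.

```python
values = [101, 102, 104]   # hypothetical data with a non-terminating mean
n = len(values)
mean_of_squares = sum(x * x for x in values) / n

exact_mean = sum(values) / n              # 102.333...
rounded_mean = round(exact_mean, 1)       # 102.3

variance_exact = mean_of_squares - exact_mean ** 2      # full precision
variance_rounded = mean_of_squares - rounded_mean ** 2  # mean rounded early

# Reference value from the definition (average squared deviation).
reference = sum((x - exact_mean) ** 2 for x in values) / n

print(variance_exact, variance_rounded, reference)
```

Here the true variance is 14/9, roughly 1.56, while the version with the rounded mean comes out above 8. Dropping about 0.03 from the mean changes the answer by a factor of more than five.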

