Residual variance is the portion of variation in your data that a statistical model fails to explain. If you build a regression model predicting house prices based on square footage, for example, the model will nail some predictions and miss others. The gaps between what the model predicts and what actually happened, squared and averaged, give you the residual variance. It’s essentially a scorecard for how much “noise” or unexplained variation is left over after your model has done its best work.
How Residual Variance Is Calculated
The calculation starts with residuals. A residual is the difference between an observed value and the value your model predicted for that data point. If a home sold for $320,000 but your model predicted $305,000, the residual is $15,000.
To get from individual residuals to residual variance, you follow three steps. First, calculate the residual for every data point by subtracting each predicted value from the actual value. Second, square each residual so that positive and negative errors don’t cancel each other out. Third, sum all the squared residuals. This gives you the residual sum of squares (RSS), the raw total of unexplained variation.
To turn that sum into a variance estimate, you divide by the degrees of freedom, which is the number of observations minus the number of parameters your model estimated. In simple linear regression with one predictor, that’s n minus 2. This quantity is called the mean square error, and it’s the standard estimate of residual variance. Taking the square root of it gives you the residual standard error, a number reported in most regression software output that tells you, roughly, how far off a typical prediction is in the same units as your data.
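The three steps plus the degrees-of-freedom division can be sketched in a few lines of Python (a minimal sketch using NumPy; the house prices and predictions below are made-up numbers for illustration):

```python
import numpy as np

# Hypothetical observed sale prices and model predictions
actual = np.array([320_000, 285_000, 410_000, 350_000, 298_000], dtype=float)
predicted = np.array([305_000, 290_000, 395_000, 360_000, 300_000], dtype=float)

residuals = actual - predicted      # step 1: observed minus predicted
rss = np.sum(residuals ** 2)        # steps 2 and 3: square, then sum
n, params = len(actual), 2          # simple linear regression estimates 2 parameters
mse = rss / (n - params)            # residual variance (mean square error)
rse = np.sqrt(mse)                  # residual standard error, in dollars
```

With these made-up numbers the residual standard error comes out near $13,900, meaning a typical prediction misses by roughly that much.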
What It Tells You About Your Model
Residual variance is one of the most direct indicators of model quality. A low residual variance means your predictor variables are capturing most of the variation in the outcome. A high residual variance means something important is missing: your model may be too simple, a key variable may be absent, or the data itself may be inherently noisy.
The connection to R-squared makes this concrete. R-squared equals 1 minus the ratio of residual variation to total variation. If your data’s total variation is 1,000 and the residual variation is 200, your R-squared is 0.80, meaning the model explains 80% of what’s going on. The remaining 20% is the residual share, the stuff your model can’t account for. So every time you see an R-squared value, you’re also seeing residual variation: it’s the complement, expressed as a proportion of the total.
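Plugging in the numbers from that example makes the relationship explicit (a trivial sketch; the 1,000 and 200 are the illustrative figures from the text):

```python
tss = 1_000.0   # total variation (total sum of squares)
rss = 200.0     # residual variation the model leaves unexplained

r_squared = 1 - rss / tss   # 0.8: the model explains 80% of the variation
```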
This ratio is especially useful when comparing models. Adding a new predictor variable to your regression should, in theory, reduce residual variance. If it doesn’t meaningfully shrink, that variable probably isn’t contributing much.
Residual Variance in ANOVA
In analysis of variance (ANOVA), residual variance goes by several names: within-group variation, random variation, unexplained variation, or error variation. They all refer to the same thing: the spread of individual data points around their group mean.
An ANOVA table breaks total variation into two buckets: variation between groups (caused by whatever factor you’re testing) and variation within groups (the residual). The residual sum of squares is divided by its degrees of freedom to produce a mean square value. The F-statistic, the number that determines whether group differences are statistically significant, is simply the ratio of between-group mean square to within-group (residual) mean square. A large F-value means the signal from your groups is big relative to the background noise. A small one means the residual variance is drowning out any group differences.
To illustrate: in a published ANOVA comparing body mass across species and sex, the residual sum of squares was 2,200,000 with 174 degrees of freedom, giving a residual mean square of 12,644. The species factor had a mean square of 114,319, producing an F-value of 9.04 and strong statistical significance. The residual variance provided the baseline against which those group differences were measured.
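The arithmetic of that table is easy to reproduce (a sketch using only the figures quoted above; the species factor’s degrees of freedom are not stated here, so only its mean square is used):

```python
rss = 2_200_000                   # residual sum of squares
df_residual = 174                 # residual degrees of freedom
ms_residual = rss / df_residual   # residual mean square, about 12,644

ms_species = 114_319                  # mean square for the species factor
f_value = ms_species / ms_residual    # F-statistic, about 9.04
```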
Errors vs. Residuals
These two terms get used interchangeably in casual conversation, but they refer to different things. An error is the difference between an observed value and the true population value, which you can never actually observe. A residual is the difference between an observed value and your model’s predicted value, which you can observe and calculate directly.
Residuals are what you work with in practice. Their sum always equals zero in a fitted regression model, and they’re not independent of each other. True errors, by contrast, don’t have to sum to zero and are theoretically independent. This distinction matters because when people talk about checking model assumptions, they’re always working with residuals as stand-ins for the unobservable true errors.
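The zero-sum property is easy to verify with an ordinary least-squares fit (a sketch on simulated data; any linear model that includes an intercept behaves this way):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 * x + 5.0 + rng.normal(scale=2.0, size=x.size)  # line plus noise

slope, intercept = np.polyfit(x, y, 1)   # least-squares fit with an intercept
residuals = y - (slope * x + intercept)

# The fitted residuals sum to zero (up to floating-point rounding),
# even though the true errors drawn above do not.
assert abs(residuals.sum()) < 1e-8
```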
Why Constant Residual Variance Matters
One of the core assumptions in linear regression is homoscedasticity, which simply means the residual variance stays roughly the same across all levels of your predictor. If you’re predicting home sale prices from square footage, the scatter of residuals around the prediction line should be about the same width whether you’re looking at small homes or large ones.
When this assumption breaks down, you get heteroscedasticity, where the spread of residuals changes. A common pattern is residuals fanning out as predicted values increase: your model might predict prices for small homes quite accurately but miss wildly on expensive ones. This creates two practical problems. First, your confidence intervals become unreliable, often too narrow in the high-variance region and too wide in the low-variance region. Second, the model can give too much weight to the noisiest subset of your data when estimating its coefficients, skewing the entire model.
You can spot this by plotting residuals against fitted values. If the vertical spread of points stays consistent from left to right, homoscedasticity holds. If the spread widens or narrows, your residual variance isn’t constant and your model’s standard errors need correction.
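A crude numeric version of that visual check is to split the residuals at the median fitted value and compare their spreads; the same idea underlies formal tests such as Goldfeld-Quandt. Below is a sketch on simulated data whose noise grows with the predictor, so the check should flag it:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1, 100, 200)
y = 2.0 * x + rng.normal(scale=0.5 * x)   # noise scale grows with x

slope, intercept = np.polyfit(x, y, 1)
fitted = slope * x + intercept
residuals = y - fitted

order = np.argsort(fitted)
low_half = residuals[order[:100]]         # residuals at small fitted values
high_half = residuals[order[100:]]        # residuals at large fitted values
ratio = high_half.std() / low_half.std()  # well above 1 means widening spread
```

A ratio near 1 is consistent with homoscedasticity; on this simulated data it comes out well above 1, matching the fanning-out pattern described above.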
Using Residual Plots to Improve Models
Plotting residuals is one of the most practical things you can do with residual variance. A well-behaved residual plot looks like a random cloud of points centered on zero with no visible pattern. When something goes wrong, the plot tells you what.
A curved pattern in the residuals means you’ve fit a straight line to data that has a nonlinear relationship. The fix might be adding a squared term or transforming one of your variables. A funnel shape, where residuals spread wider as fitted values increase, signals nonconstant variance. In a stopping-distance study, for instance, prediction intervals for how far a car travels before stopping grow increasingly wide at higher speeds, reflecting the fact that variability in real braking distances genuinely increases with speed.
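The first of those fixes is easy to demonstrate: fit a straight line to data with genuine curvature, then add the squared term and watch the residual sum of squares collapse (a sketch on simulated data):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 100)
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(scale=0.5, size=x.size)

# Straight-line fit: the curvature ends up in the residuals
linear_fit = np.polyfit(x, y, 1)
rss_linear = np.sum((y - np.polyval(linear_fit, x)) ** 2)

# Adding a squared term absorbs the curvature
quad_fit = np.polyfit(x, y, 2)
rss_quadratic = np.sum((y - np.polyval(quad_fit, x)) ** 2)
```

On this data the quadratic fit cuts the residual sum of squares dramatically; if adding the term had made little difference, the curvature diagnosis would be wrong.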
Clusters or gaps in residual plots can point to missing categorical variables or subgroups in your data that the model hasn’t accounted for. In each case, the residual variance is telling you where your model’s understanding of the data breaks down and where there’s room to improve it.