A standardized residual is a raw residual (the difference between an observed value and a predicted value) divided by an estimate of its standard deviation. This division puts residuals on a common scale, turning them into something like z-scores so you can compare them directly and spot outliers. They show up in two main areas of statistics: regression analysis and chi-square tests of categorical data.
Why Raw Residuals Need Standardizing
In any statistical model, a residual is the gap between what you actually observed and what the model predicted. If your model predicts a test score of 82 and the student scored 88, the residual is 6. Simple enough. The problem is that raw residuals depend entirely on whatever units you’re measuring. A residual of 6 points on a test means something very different from a residual of 6 kilograms in a weight prediction or 6 million dollars in a revenue forecast.
Even within the same dataset, not all residuals are created equal. Some data points have more influence on the model than others depending on where they sit relative to the rest of the data, which means their residuals naturally have different amounts of variability. A raw residual of 6 for one observation might be completely expected, while a raw residual of 6 for another might be highly unusual. Standardizing solves both problems by expressing each residual in terms of standard deviations from zero. Once you do that, you can directly compare residuals across observations, across variables, and even across entirely different models.
How the Calculation Works
The core idea is straightforward: take each raw residual and divide it by an estimate of its standard deviation. In regression, this estimate comes from the mean square error of the model, which is essentially the average squared residual across all your data points; more precise versions also adjust for each observation's leverage, since points far from the center of the data tend to have residuals with smaller variance. The result is a number that tells you how many standard deviations a given observation falls from where the model expected it to be.
Because standardized residuals are scaled this way, they behave roughly like values from a standard normal distribution (mean of 0, standard deviation of 1). That’s what makes them so useful. Under a normal distribution, about 95% of values fall within 2 standard deviations of the mean, and about 99.7% fall within 3. So if you see a standardized residual of 4.5, you know immediately that something unusual is going on with that data point.
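The calculation above can be sketched in a few lines of Python. This is a minimal illustration using the simple divide-by-root-MSE form (ignoring the leverage adjustment that full regression software applies), with made-up observed and predicted values:

```python
import math

# Hypothetical observed values and model predictions
observed = [88, 75, 91, 62, 70, 95]
predicted = [82, 78, 85, 65, 74, 80]

# Raw residuals: observed minus predicted
residuals = [o - p for o, p in zip(observed, predicted)]

# Mean square error: average squared residual
# (a real regression divides by n - p, the residual degrees of freedom)
mse = sum(e ** 2 for e in residuals) / len(residuals)

# Standardized residuals: raw residual divided by sqrt(MSE)
standardized = [e / math.sqrt(mse) for e in residuals]
```

Here the raw residual of 15 becomes a standardized residual just above 2, flagging it as worth a closer look even though the other residuals of similar-looking raw size are unremarkable.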
Using Standardized Residuals to Find Outliers
The most common use of standardized residuals is identifying outliers, data points that don’t fit the pattern your model describes. The general rule of thumb: any observation with a standardized residual larger than 2 in absolute value deserves a closer look. Many statistical software packages automatically flag these. A stricter threshold puts the cutoff at 3, meaning only observations more than 3 standard deviations from the predicted value get flagged.
Which threshold you use depends on context. With a large dataset, you’d expect a few standardized residuals above 2 just by chance, so a cutoff of 3 makes more sense. With a smaller dataset, the threshold of 2 is more practical. Either way, exceeding the threshold doesn’t automatically mean the data point is wrong or should be removed. It means the observation warrants investigation. Maybe there’s a data entry error, maybe that case is genuinely different from the rest, or maybe the model itself is missing something important.
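The rule of thumb above amounts to a simple filter. A sketch, assuming you already have standardized residuals in hand (the function name and data here are hypothetical):

```python
def flag_outliers(standardized_residuals, threshold=2.0):
    """Return indices of residuals exceeding the threshold in absolute value."""
    return [i for i, r in enumerate(standardized_residuals)
            if abs(r) > threshold]

# Hypothetical standardized residuals
resids = [0.4, -1.1, 2.3, 0.2, -3.5, 0.9]
flag_outliers(resids)       # lenient cutoff of 2 flags indices 2 and 4
flag_outliers(resids, 3.0)  # stricter cutoff of 3 flags only index 4
```

Note that the function only surfaces candidates; as the text says, each flagged observation still needs to be investigated rather than deleted outright.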
Standardized Residuals in Chi-Square Tests
Standardized residuals aren’t limited to regression. They play a critical role in chi-square tests, where you’re comparing observed counts in a table against what you’d expect if two categorical variables were independent. A significant chi-square result tells you that the overall pattern deviates from what you’d expect by chance, but it doesn’t tell you which specific cells in the table are driving that result. Standardized residuals fill that gap.
Each cell in the table gets its own standardized residual, calculated as the observed count minus the expected count, divided by the square root of the expected count. The adjusted version further divides by a factor based on the cell's row and column proportions, and it is this adjusted residual that follows a standard normal distribution, which means you can treat the values like z-scores. Cells with absolute values greater than 1.96 are statistically significant at the 0.05 level. In practice, many researchers use the simpler cutoff of 2 or 3 as a quick rule of thumb. A positive standardized residual means that cell has more observations than expected; a negative one means fewer.
For example, if you’re looking at a table of job type by gender and one cell has an adjusted standardized residual of 3.4, that tells you there are significantly more people in that particular combination than the model of independence would predict. This cell-by-cell breakdown is often more informative than the overall chi-square result alone.
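The cell-by-cell calculation can be sketched directly. This is a minimal illustration with a made-up 2x2 table; the function name is mine, and the formula for the adjustment factor is the usual one based on row and column proportions:

```python
import math

def adjusted_residuals(observed):
    """Adjusted standardized residuals for a table of counts.

    The basic residual is (O - E) / sqrt(E); the adjusted version further
    divides by sqrt((1 - row prop) * (1 - col prop)), so values are ~N(0, 1).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    result = []
    for i, row in enumerate(observed):
        out_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand  # expected count
            basic = (o - e) / math.sqrt(e)
            adj = basic / math.sqrt((1 - row_totals[i] / grand) *
                                    (1 - col_totals[j] / grand))
            out_row.append(adj)
        result.append(out_row)
    return result

# Hypothetical table (rows could be job type, columns gender)
table = [[30, 10],
         [20, 40]]
res = adjusted_residuals(table)
```

In a 2x2 table all four adjusted residuals have the same magnitude (here about 4.1), so every cell contributes equally to the departure from independence; in larger tables the residuals differ and pinpoint the driving cells.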
Reading Residual Plots
Plotting standardized residuals against predicted values is one of the most common diagnostic tools in regression. Because the residuals are on a standardized scale, you can immediately see whether points fall within the expected range (most should be between -2 and 2) without needing to think about the original units.
Specific patterns in these plots reveal specific problems with your model. If the residuals curve in a systematic way, say, positive for small predicted values, negative in the middle, and positive again for large values, that suggests the relationship between your variables isn't linear and a straight-line model isn't appropriate. If the residuals fan out as predicted values increase, growing more spread out from left to right, that's a sign of non-constant variance (sometimes called heteroscedasticity). A funneling effect, where residuals are wide on the left and narrow on the right, signals the same issue in reverse. What you want to see is a random cloud of points with no discernible pattern, scattered evenly around zero.
Standardized vs. Studentized Residuals
These two terms are easy to confuse, and some software uses them inconsistently. The key difference comes down to what goes in the denominator. A standardized residual divides the raw residual by an estimate of its standard deviation calculated from the full dataset, including the observation in question. A studentized residual (sometimes called an externally studentized residual) recalculates that estimate after removing the observation, then divides.
Why does that matter? If a data point is a true outlier, it inflates the overall error estimate when it’s included. That inflation makes the standardized residual for that point smaller than it should be, potentially hiding the very outlier you’re trying to detect. By removing the observation first and then calculating the error, studentized residuals avoid this masking effect. For routine diagnostics, standardized residuals work fine. When your primary goal is hunting for influential outliers, studentized residuals are the more reliable tool.
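The masking effect is easy to demonstrate numerically. The sketch below uses a deliberately simplified "predict the mean" model rather than a full regression (so the leverage terms real software uses are omitted), but the leave-one-out logic is the same:

```python
import math

def standardized(values):
    """Residuals around the mean, scaled by the full-sample SD."""
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]
    s = math.sqrt(sum(e ** 2 for e in resid) / (n - 1))
    return [e / s for e in resid]

def externally_studentized(values):
    """Each residual scaled by mean and SD computed with that point left out."""
    out = []
    for i, v in enumerate(values):
        rest = values[:i] + values[i + 1:]
        mean = sum(rest) / len(rest)
        s = math.sqrt(sum((r - mean) ** 2 for r in rest) / (len(rest) - 1))
        out.append((v - mean) / s)
    return out

# One wild point among otherwise tame data (hypothetical)
data = [10, 11, 9, 10, 12, 30]
st = standardized(data)
ext = externally_studentized(data)
```

The outlier at 30 inflates the full-sample error estimate so much that its standardized residual barely clears 2, while its externally studentized residual exceeds 10: exactly the masking effect described above.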
Working With Standardized Residuals in Software
Most statistical packages make standardized residuals easy to generate. In SPSS, they’re labeled ZRESID and can be saved directly to your dataset as a new variable (typically named ZRE_1) for further analysis or plotting. In R, the rstandard() function returns standardized residuals from a fitted regression model, while rstudent() gives you studentized residuals. Python’s statsmodels library includes both through the get_influence() method on regression results.
Once you have them saved, you can create residual-vs-fitted plots, Q-Q plots to check whether the residuals are normally distributed, or simply sort them to find the largest values. Q-Q plots are particularly useful for checking the tails of the distribution, where outliers live. If your standardized residuals fall along a straight diagonal line in a Q-Q plot, the normality assumption holds. Deviations at the ends of that line suggest heavy or light tails in the distribution of errors.
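If you want to see what a Q-Q plot computes under the hood, the pairing of sorted residuals with normal quantiles can be sketched with Python's standard library (the function name and sample residuals are hypothetical):

```python
from statistics import NormalDist

def qq_points(standardized_residuals):
    """Pair sorted residuals with standard-normal quantiles for a Q-Q plot."""
    n = len(standardized_residuals)
    ordered = sorted(standardized_residuals)
    # Theoretical quantiles at plotting positions (i + 0.5) / n
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical standardized residuals; points lying near the y = x line
# support the normality assumption
pts = qq_points([-1.3, -0.6, -0.1, 0.2, 0.7, 1.4])
```

Plotting the second element of each pair against the first gives the Q-Q plot; systematic departures at the first and last few points are the heavy or light tails the text describes.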