What Is an Outlier in Statistics? Definition & Examples

An outlier in statistics is a data point that falls far outside the pattern of the rest of your data set. If you measured the heights of 20 people in a room and 19 of them were between 5’2″ and 6’1″, but one measurement read 7’8″, that lone value would be an outlier. Outliers matter because even a single extreme value can distort your results, pulling averages away from what’s typical and inflating measures of spread.

How Outliers Are Defined

There’s no single universal cutoff that makes a data point an outlier. Instead, statisticians use rules of thumb based on how far a value sits from the center of the data. The two most common approaches rely on the interquartile range (IQR) and the z-score.

The IQR method, sometimes called Tukey’s fences, works by first finding the middle 50% of your data. You calculate the difference between the 75th percentile (Q3) and the 25th percentile (Q1). That difference is the IQR. Any value below Q1 minus 1.5 times the IQR, or above Q3 plus 1.5 times the IQR, is flagged as an outlier. This is the rule box plots use to decide which points get plotted as individual dots beyond the whiskers.
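The rule above is easy to compute by hand or in code. Here is a minimal sketch using Python's standard `statistics` module with made-up height data (note that different software computes quartiles with slightly different conventions; `method="inclusive"` matches the common textbook definition):

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Flag points outside Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

# Hypothetical heights in inches; one reading is far outside the pattern
heights = [62, 63, 64, 65, 65, 66, 66, 67, 68, 69, 70, 71, 73, 92]
print(iqr_outliers(heights))  # [92]
```

Here Q1 is 65 and Q3 is 69.75, so the upper fence sits at 69.75 + 1.5 × 4.75 = 76.875, and only the 92-inch reading falls outside it.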

The z-score method measures how many standard deviations a point sits from the mean. A z-score above 3 or below -3 is the typical threshold for calling something an outlier, meaning the value is more than three standard deviations from the average. This works well when your data follows a roughly bell-shaped distribution, but it can miss outliers in skewed data because the outlier itself pulls the mean and standard deviation in its direction.
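The z-score rule translates directly into code as well. A sketch with invented sensor readings, again using only the standard library:

```python
from statistics import mean, stdev

def zscore_outliers(data, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    m, s = mean(data), stdev(data)
    return [x for x in data if abs(x - m) / s > threshold]

# 19 ordinary readings plus one recording error
readings = [98, 99, 100, 100, 101, 102, 97, 103, 100, 99,
            101, 100, 98, 102, 100, 99, 101, 100, 103, 500]
print(zscore_outliers(readings))  # [500]
```

Notice that the 500 itself inflates the mean to about 120 and the standard deviation to about 89, which is exactly the self-masking effect described above; its z-score still clears 3 here, but a less extreme error might slip under the threshold for the same reason.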

Where Outliers Come From

Not all outliers have the same cause, and the cause matters a lot for deciding what to do about them. They generally fall into a few categories.

Some outliers are simply errors. A participant in a survey writes their age as 560 instead of 56, or a lab technician records a decimal point in the wrong place. These are data entry and measurement mistakes, and they’re the least controversial to remove once you confirm the error.

Others come from experimental problems. A sensor malfunctions during one reading, a sample gets contaminated, or a participant misunderstands the instructions. These outliers don’t reflect anything real about the thing you’re measuring.

The most interesting outliers are genuine extreme values. A dataset of household incomes in a small town might contain one billionaire. That value is a real observation, not an error, but it will dramatically skew results if you treat it the same as every other data point. These legitimate outliers often carry the most useful information. They can reveal rare events, subpopulations, or phenomena your study wasn’t expecting to find.

How Outliers Distort Your Results

The mean is the measure most vulnerable to outliers. Because it factors in every value equally, a single extreme number can drag it significantly higher or lower. If five employees earn $50,000 and their CEO earns $5,000,000, the mean salary is $875,000, a number that describes nobody in the group accurately. The median, by contrast, would be $50,000, which actually reflects what a typical employee earns. This is why income statistics almost always report medians rather than means.
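The salary example can be checked in two lines:

```python
from statistics import mean, median

salaries = [50_000] * 5 + [5_000_000]  # five employees plus the CEO
print(mean(salaries))    # 875000.0
print(median(salaries))  # 50000.0
```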

Outliers also inflate the variance and standard deviation of a dataset, making the data appear more spread out than it really is. This can ripple into other analyses. Confidence intervals get wider, hypothesis tests lose power, and correlations between variables can shift in misleading ways. In regression analysis, a single influential outlier can tilt the entire trend line toward itself, changing the slope and predictions for every other data point.
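The inflation of spread is easy to demonstrate. With hypothetical measurements clustered around 50, adding a single extreme point multiplies the sample standard deviation many times over:

```python
from statistics import stdev

clean = [48, 50, 52, 49, 51, 50]
with_outlier = clean + [150]

print(stdev(clean))         # about 1.41
print(stdev(with_outlier))  # roughly 38: one point, ~27x the spread
```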

Spotting Outliers Visually

The box plot is the most common visual tool for identifying outliers. The box itself represents the middle 50% of the data, with a line at the median. The whiskers extend to the most extreme non-outlier values, and any points beyond the whiskers (using the 1.5 × IQR rule) appear as individual dots. This makes outliers immediately visible. The CDC, Khan Academy, and most statistics textbooks treat box plots as the default way to display outliers in a single variable.
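In matplotlib (assumed here as the plotting library), `boxplot` applies the same 1.5 × IQR rule by default and returns the flagged points under the `"fliers"` key, so you can read them off programmatically as well as visually:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# Hypothetical heights in inches; one extreme reading
heights = [62, 63, 64, 65, 65, 66, 66, 67, 68, 69, 70, 71, 73, 92]
fig, ax = plt.subplots()
result = ax.boxplot(heights)  # whis=1.5 (the 1.5 x IQR rule) is the default
flagged = list(result["fliers"][0].get_ydata())
print(flagged)  # the points drawn as individual dots beyond the whiskers
fig.savefig("heights_boxplot.png")
```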

Scatter plots are more useful when you’re looking at the relationship between two variables. A point that sits far from the cluster or far from a trend line stands out easily. For datasets with more than two or three variables, visual detection gets harder. Multivariate outliers, points that are extreme across several variables at once, often look normal when you examine each variable individually. Detecting them requires calculating a distance measure that accounts for all variables simultaneously.
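The Mahalanobis distance is the standard such measure: it scales each point's distance from the center by the data's covariance, so correlated variables are judged jointly. A numpy sketch with toy data, where the last point looks ordinary on each axis but breaks the joint pattern:

```python
import numpy as np

def mahalanobis_distances(X):
    """Distance of each row of X from the data's center, scaled by the
    covariance matrix so all variables are accounted for simultaneously."""
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    diff = X - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

# Two strongly correlated variables; the last row violates the correlation
X = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [5, 1]], dtype=float)
d = mahalanobis_distances(X)
print(np.argmax(d))  # index 5: the off-pattern point scores highest
```

Neither coordinate of the flagged point is extreme on its own, yet its distance is the largest in the set, which is exactly why univariate checks miss this kind of outlier.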

Measuring an Outlier’s Influence in Regression

In regression analysis, not all outliers are equally disruptive. A data point can be extreme in its x-value, its y-value, or both, and only some of those combinations actually change the results. A metric called Cook’s distance captures this by measuring how much all of the model’s predictions would shift if you removed a single data point. It accounts for both how far the point is from the trend line (its residual) and how extreme its position is along the x-axis (its leverage). A data point with a high Cook’s distance is one that is single-handedly pulling the regression line toward itself. Points with low Cook’s distance, even if they look extreme on a graph, aren’t actually changing your conclusions much.
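For a simple one-predictor fit, Cook's distance can be computed by hand from the residuals and leverages. A numpy sketch with made-up data, where the last point has both high leverage (extreme x) and a large residual:

```python
import numpy as np

def cooks_distance(x, y):
    """Cook's D for each point of a linear fit y = a + b*x.
    Combines the squared residual with the point's leverage."""
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, p = X.shape
    leverage = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    mse = resid @ resid / (n - p)
    return resid**2 / (p * mse) * leverage / (1 - leverage) ** 2

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 20.0])
y = 2 * x          # points on the line y = 2x ...
y[-1] = 10.0       # ... except the high-leverage point, pulled far off it
d = cooks_distance(x, y)
print(np.argmax(d))  # index 9: the point tilting the whole line
```

A common rule of thumb flags points with D above 1, or above 4/n, for a closer look, though thresholds vary between sources.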

What to Do With Outliers

The worst thing you can do with an outlier is delete it without thinking. Removing real data points because they’re inconvenient can bias your results just as badly as leaving errors in. The right approach depends entirely on why the outlier exists.

If the outlier is clearly an error, such as a typo or equipment malfunction, correcting or removing it is straightforward. If you can verify the original value, fix it. If you can’t, removing it is usually justified as long as you document what you did and why.

For genuine extreme values, you have several options. One is to use robust statistical methods that aren’t as sensitive to extreme points. The median is the simplest example: it barely moves regardless of how extreme your outliers are. More sophisticated robust methods exist for estimation and regression that down-weight extreme observations automatically.

Two common techniques for taming outliers without removing them are trimming and Winsorizing. Trimming drops a fixed number of the most extreme values from both ends of your data before calculating a statistic like the mean. If you trim the two highest and two lowest values from a dataset of 100, you compute the mean from the remaining 96. Winsorizing takes a different approach: instead of deleting extreme values, it replaces them with the next most extreme value that isn’t being flagged. So the two highest values get replaced with the third-highest value, and the two lowest get replaced with the third-lowest. The full dataset stays the same size, but the extremes are pulled inward. Both methods produce an unbiased estimate of the true average when the data is symmetrically distributed.
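Both techniques are a few lines of standard-library Python. A sketch with invented data, trimming or Winsorizing one value from each end:

```python
from statistics import mean

def trimmed_mean(data, k):
    """Mean after dropping the k smallest and k largest values."""
    s = sorted(data)
    return mean(s[k:len(s) - k])

def winsorized(data, k):
    """Replace the k smallest values with the (k+1)-th smallest and the
    k largest with the (k+1)-th largest; the dataset keeps its size."""
    s = sorted(data)
    lo, hi = s[k], s[-k - 1]
    return [min(max(x, lo), hi) for x in data]

data = [1, 12, 13, 14, 15, 16, 17, 18, 19, 120]
print(trimmed_mean(data, 1))       # 15.5 — drops the 1 and the 120
print(mean(winsorized(data, 1)))   # 15.5 — clamps them to 12 and 19
```

Compare either result to the raw mean of 24.5, which is pulled well above every typical value by the two extremes.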

Formal Statistical Tests for Outliers

When you need a more rigorous answer than eyeballing a box plot, formal tests exist to evaluate whether a suspected outlier is statistically significant. Grubbs’ test is the most widely recommended for detecting a single outlier in a normally distributed dataset. It works with small samples, typically between 5 and 20 observations, and performs well across that range.

Dixon’s Q test covers similar territory but performs notably worse, especially as sample sizes grow. In Monte Carlo simulations comparing the two, Dixon’s test was consistently the weakest performer. At a sample size of 10, it was roughly 6 to 8 percent less efficient than Grubbs’ test at correctly identifying true outliers. By samples of 15 to 20, that gap widened to 10 to 15 percent. For most practical purposes, Grubbs’ test is the better choice when you suspect a single outlier in a small, normally distributed sample.
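Grubbs' test itself is short to implement. The statistic G is the largest absolute deviation from the mean divided by the sample standard deviation, compared against a critical value built from the t-distribution. A sketch using scipy for the t quantile, with made-up lab readings:

```python
import math
from scipy.stats import t

def grubbs_test(data, alpha=0.05):
    """Two-sided Grubbs' test for a single outlier in normal data.
    Returns (suspect value, G statistic, critical value)."""
    n = len(data)
    m = sum(data) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in data) / (n - 1))
    suspect = max(data, key=lambda x: abs(x - m))
    g = abs(suspect - m) / sd
    tcrit = t.ppf(1 - alpha / (2 * n), n - 2)
    gcrit = (n - 1) / math.sqrt(n) * math.sqrt(tcrit**2 / (n - 2 + tcrit**2))
    return suspect, g, gcrit

# Six consistent measurements and one suspect reading
value, g, gcrit = grubbs_test([19.8, 20.1, 20.3, 20.0, 19.9, 20.2, 27.5])
print(value, g > gcrit)  # the 27.5 is flagged at the 5% level
```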

Both tests share an important limitation: they assume the underlying data is normally distributed, and they’re designed to test for only one outlier at a time. If your data contains multiple outliers, one can mask another, making neither detectable. More advanced approaches exist for those situations, but they require more specialized software and statistical knowledge.