How to Check for Normality in Statistics: Methods Explained

Checking for normality involves a combination of visual methods and statistical tests, not just one approach. The best practice is to start with plots that let you see your data’s shape, then confirm what you see with a formal test like the Shapiro-Wilk. Relying on only one method can mislead you, especially with small or very large samples.

Why Normality Matters

Many common statistical tests, including the t-test and ANOVA, assume your data are approximately normally distributed. If that assumption is violated, the p-values those tests produce can be unreliable, potentially leading you to conclusions the data don’t actually support. Checking normality tells you whether you can trust a parametric test or whether you need to switch to a non-parametric alternative.

That said, normality doesn’t need to be perfect. These tests are reasonably robust to mild departures from normality, especially as sample size grows. A common rule of thumb, grounded in the central limit theorem, is that once your sample reaches roughly 30 observations, the sampling distribution of the mean is approximately normal regardless of the population’s shape. So normality checking is most critical when your sample is small or when your data look obviously skewed.
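This safety net is easy to see in a quick simulation. The sketch below (assuming NumPy; the exponential distribution and sample size of 30 are arbitrary illustrative choices) draws repeated small samples from a heavily right-skewed distribution and shows that the sample means still cluster around the population mean:

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw 5,000 samples of size 30 from a heavily right-skewed
# exponential distribution (population mean = 1.0) and record
# the mean of each sample.
sample_means = np.array(
    [rng.exponential(scale=1.0, size=30).mean() for _ in range(5000)]
)

# Individual exponential draws are strongly skewed, but the sample
# means cluster symmetrically around 1.0, as the CLT predicts.
print(f"mean of sample means: {sample_means.mean():.3f}")
```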

Start With Visual Methods

Before running any formal test, look at your data. Two plots do the heavy lifting here: histograms and Q-Q plots.

A histogram gives you the overall shape. You’re looking for the classic bell curve: roughly symmetric, with most values clustered around the center and tails that taper off evenly on both sides. If the bulk of your data bunches to the left or right with a long tail stretching the other direction, your data are skewed. If you see two humps instead of one, you may have a bimodal distribution, which is a different problem entirely.

A Q-Q (quantile-quantile) plot is more precise. It plots your data’s quantiles against the quantiles you’d expect from a perfect normal distribution. If your data are normal, the points fall along a roughly straight diagonal line. Here’s what deviations tell you:

  • Points curving away from the line in an arc: your data are skewed. A curve bowing upward suggests right skew, while one bowing downward suggests left skew.
  • Points following the line in the middle but curving off at the ends: your data have heavier tails than a normal distribution, meaning more extreme values than expected.
  • Points hugging the line closely throughout: normality is a safe assumption.

Q-Q plots are generally more informative than histograms because histograms change appearance depending on how many bins you choose. A Q-Q plot gives you a more stable, interpretable picture.
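Both plots take only a few lines with SciPy and Matplotlib. This is a minimal sketch; the simulated data and the output filename are placeholders for your own sample and destination:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: save to a file, not a window
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=10, size=200)  # placeholder for your own sample

fig, (ax_hist, ax_qq) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: look for a roughly symmetric bell shape.
ax_hist.hist(data, bins="auto", edgecolor="black")
ax_hist.set_title("Histogram")

# Q-Q plot: points should hug the diagonal reference line if the data are normal.
stats.probplot(data, dist="norm", plot=ax_qq)
ax_qq.set_title("Normal Q-Q plot")

fig.tight_layout()
fig.savefig("normality_check.png")
```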

Skewness and Kurtosis as Quick Checks

Most statistical software will calculate skewness and kurtosis for your data, and these two numbers give you a fast numerical read on normality. A perfectly normal distribution has a skewness of zero (perfectly symmetric) and an excess kurtosis of zero (neither too peaked nor too flat).

In practice, your values won’t be exactly zero. The question is how far off they are. For samples larger than 300, widely used reference thresholds treat an absolute skewness value above 2 or an absolute excess kurtosis value above 7 as signs of substantial non-normality. Values within those bounds generally indicate your data are close enough to normal for most purposes. For smaller samples, you can convert skewness and kurtosis into z-scores by dividing each by its standard error, then check whether those z-scores fall outside the range you’d expect from a standard normal distribution (roughly ±2 at the 0.05 level).
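With SciPy, both statistics and the small-sample z-score check are a few lines. This sketch uses the common approximate standard errors √(6/n) for skewness and √(24/n) for kurtosis; the simulated data are a placeholder:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(size=500)  # placeholder for your own sample
n = len(data)

# SciPy reports excess kurtosis by default (normal distribution = 0).
skew = stats.skew(data)
kurt = stats.kurtosis(data)

# Large-sample reference thresholds (n > 300):
# |skewness| > 2 or |excess kurtosis| > 7 signal substantial non-normality.
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")

# Small-sample z-score check, using the approximate standard errors.
z_skew = skew / np.sqrt(6 / n)
z_kurt = kurt / np.sqrt(24 / n)
```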

The Shapiro-Wilk Test

If you’re going to run one formal normality test, make it the Shapiro-Wilk. It consistently outperforms other options in detecting non-normality, and many statisticians consider it the best overall choice. The test works by measuring how well your data correlate with the values you’d expect from a normal distribution.

Interpreting the result is straightforward, but the logic runs opposite to what many people expect. The null hypothesis is that your data are normal. So a p-value above 0.05 means you don’t have enough evidence to reject normality, and you can proceed with parametric tests. A p-value below 0.05 means your data deviate significantly from a normal distribution.

The Shapiro-Wilk test is especially well-suited for small samples (fewer than 50 observations), where it has the strongest statistical power to catch non-normality. Most major software packages, including SPSS, R, and Python’s SciPy library, include it.
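In Python, the test is a single call to scipy.stats.shapiro. The simulated small sample below is a placeholder for your own data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = rng.normal(loc=100, scale=15, size=40)  # placeholder small sample

# Null hypothesis: the data come from a normal distribution.
stat, p_value = stats.shapiro(data)

if p_value > 0.05:
    print(f"W = {stat:.3f}, p = {p_value:.3f}: no evidence against normality")
else:
    print(f"W = {stat:.3f}, p = {p_value:.3f}: significant departure from normality")
```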

Other Formal Tests and Their Limitations

The Kolmogorov-Smirnov (K-S) test is probably the most widely known normality test, but its reputation exceeds its usefulness. It has low power compared to the Shapiro-Wilk, meaning it’s more likely to miss genuine non-normality. Its standard form also assumes the distribution parameters (mean and standard deviation) are specified in advance rather than estimated from the data; estimating them from the same sample, which is what almost every real-world analysis does, makes the test’s p-values unreliable.

The Lilliefors correction fixes this by adjusting the K-S test for estimated parameters. Software like SPSS applies this correction automatically when you run a K-S normality test. Even with the correction, though, the Shapiro-Wilk still provides better detection power. The Anderson-Darling test is another option that performs well, particularly for catching departures in the tails of the distribution, but it’s less commonly available in standard software menus.
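To illustrate with SciPy: the kstest call below plugs in parameters estimated from the same data, which is exactly the setup that motivates the Lilliefors correction (available in statsmodels rather than SciPy), and scipy.stats.anderson reports a statistic plus critical values at fixed significance levels instead of a single p-value. The simulated data are a placeholder:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.normal(size=100)  # placeholder sample

# Plain K-S test with estimated parameters: the problematic setup
# described above, shown here only for illustration.
ks_stat, ks_p = stats.kstest(data, "norm", args=(data.mean(), data.std(ddof=1)))

# Anderson-Darling: compare the statistic against the 5% critical value.
result = stats.anderson(data, dist="norm")
crit_5pct = result.critical_values[list(result.significance_level).index(5.0)]
reject_at_5pct = result.statistic > crit_5pct  # True means reject normality at 5%
```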

One important caveat applies to all formal tests: with very large samples (several hundred or more), they become oversensitive. They’ll flag tiny, practically meaningless deviations from normality as statistically significant. This is why visual methods and skewness/kurtosis values remain important even when you have a formal test result.

A Practical Checking Strategy

Combine methods rather than relying on any single one. A sensible workflow looks like this:

  • Plot first. Create a Q-Q plot and a histogram. If the Q-Q plot shows points tightly along the diagonal and the histogram looks roughly bell-shaped, you’re likely fine.
  • Check skewness and kurtosis. If absolute skewness is under 2 and absolute kurtosis is under 7, you’re within acceptable bounds for most analyses.
  • Run the Shapiro-Wilk test. Use it to confirm what the visuals suggest, especially with small samples. For samples under 50, this test carries the most weight.
  • Consider your sample size. With 30 or more observations, the central limit theorem provides a safety net for analyses based on the mean. With hundreds of observations, give more weight to the visual check and skewness/kurtosis than to a formal test that may be overly sensitive.
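The checklist above can be bundled into a small helper. This is an illustrative sketch only, not a standard function; the thresholds come straight from the rules of thumb discussed earlier:

```python
import numpy as np
from scipy import stats

def normality_report(data, alpha=0.05):
    """Summarize the normality checks for a 1-D sample.

    Illustrative helper; thresholds follow common rules of thumb,
    not any single authoritative standard.
    """
    data = np.asarray(data)
    n = len(data)
    skew = stats.skew(data)
    kurt = stats.kurtosis(data)  # excess kurtosis
    _, shapiro_p = stats.shapiro(data)
    return {
        "n": n,
        "skewness": skew,
        "excess_kurtosis": kurt,
        "shapiro_p": shapiro_p,
        "shape_ok": abs(skew) < 2 and abs(kurt) < 7,
        "shapiro_ok": shapiro_p > alpha,
        "clt_safety_net": n >= 30,              # mean-based analyses get CLT cover
        "formal_test_oversensitive": n >= 300,  # weight the plots more here
    }

rng = np.random.default_rng(5)
report = normality_report(rng.normal(size=60))
```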

What To Do When Data Aren’t Normal

If your checks reveal non-normality, you have two main options: transform the data or switch to a non-parametric test.

The most common transformation is the log transformation, which compresses right-skewed data by pulling in the long tail. It works well for many biological and financial datasets. However, log transformation doesn’t always produce normality. A square root transformation is another option for moderately skewed data. For a more flexible approach, the Box-Cox transformation searches for the optimal power transformation automatically, and its results can be back-transformed to the original scale afterward. Note that both the log and Box-Cox transformations require strictly positive data. Box-Cox is generally preferred when a simple log transform doesn’t do the job.
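SciPy implements Box-Cox directly: scipy.stats.boxcox finds the best power parameter, and scipy.special.inv_boxcox reverses it. A sketch with a simulated right-skewed sample standing in for real data:

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(11)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=300)  # strongly right-skewed

# Log and square root transforms (log requires strictly positive values).
log_data = np.log(skewed)
sqrt_data = np.sqrt(skewed)

# Box-Cox searches for the power parameter (lambda) that best
# normalizes the data; it also requires strictly positive values.
transformed, best_lambda = stats.boxcox(skewed)

# Back-transform to the original scale when reporting results.
restored = inv_boxcox(transformed, best_lambda)

print(f"skewness before: {stats.skew(skewed):.2f}, "
      f"after log: {stats.skew(log_data):.2f}, "
      f"after Box-Cox: {stats.skew(transformed):.2f}")
```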

If transforming feels forced or doesn’t solve the problem, non-parametric tests are the cleaner alternative. These tests don’t assume normality at all. The common swaps are:

  • Instead of a paired t-test: use the Wilcoxon signed-rank test
  • Instead of a two-sample t-test: use the Mann-Whitney U test
  • Instead of one-way ANOVA: use the Kruskal-Wallis test

Non-parametric tests operate on ranks rather than raw values, so they handle skewed data and outliers without requiring any transformation. The tradeoff is slightly less statistical power when data actually are normal, but with non-normal data, they’re the more trustworthy choice.
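All three replacements live in scipy.stats. A sketch using simulated skewed groups as placeholders for real data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Skewed samples where a t-test's normality assumption is doubtful.
group_a = rng.exponential(scale=1.0, size=40)
group_b = rng.exponential(scale=2.0, size=40)
group_c = rng.exponential(scale=1.0, size=40)

# Two independent samples: Mann-Whitney U instead of a two-sample t-test.
u_stat, u_p = stats.mannwhitneyu(group_a, group_b)

# Paired data: Wilcoxon signed-rank instead of a paired t-test.
before = group_a
after = group_a + rng.normal(loc=0.5, scale=0.2, size=40)
w_stat, w_p = stats.wilcoxon(before, after)

# Three or more groups: Kruskal-Wallis instead of one-way ANOVA.
h_stat, h_p = stats.kruskal(group_a, group_b, group_c)
```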