Is the T-Test Parametric or Nonparametric?

The t-test is a parametric test. This means it makes specific assumptions about the data it analyzes, most importantly that the data follows a roughly normal (bell-shaped) distribution. When those assumptions hold, the t-test is a powerful tool for comparing means between groups. When they don’t, you need a nonparametric alternative instead.

Understanding why the t-test is parametric, what that actually means in practice, and when to switch to a different test will help you choose the right approach for your data.

What Makes a Test “Parametric”?

A parametric test assumes your data comes from a population that can be described by a set of fixed parameters, like a mean and standard deviation. The t-test specifically assumes the underlying population has a normal distribution. It uses those parameters to calculate whether two groups are meaningfully different or whether a sample mean differs from a known value.

Nonparametric tests, by contrast, don’t assume any particular distribution shape. Instead of working with means and standard deviations, they typically rank the data from smallest to largest and analyze those ranks. This makes them more flexible but, under ideal conditions, slightly less precise.
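
To make the rank idea concrete, here is a minimal sketch using SciPy's `rankdata` (the values are invented for the example):

```python
from scipy.stats import rankdata

# Hypothetical raw measurements, pooled together before ranking.
values = [3.1, 47.0, 5.6, 4.9, 120.5, 6.2]

# Rank from smallest to largest; the extreme value 120.5 simply
# becomes the highest rank, so outliers lose their leverage.
ranks = rankdata(values)
print(ranks)  # -> [1. 5. 3. 2. 6. 4.]
```

A nonparametric test then analyzes these ranks instead of the raw values, which is why the shape of the original distribution stops mattering.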

The Three Assumptions Behind a T-Test

Because the t-test is parametric, your data needs to satisfy three conditions before the results are trustworthy:

  • Normality: The data in each group should be approximately normally distributed. This matters most with small samples. With larger samples (generally 50 or more), the math behind the t-test becomes reliable even when data isn’t perfectly normal, thanks to a principle called the Central Limit Theorem. Some statisticians recommend samples of at least 100 before trusting a t-test on clearly non-normal data.
  • Equal variance: The spread of data in each group should be roughly similar. If one group’s values are tightly clustered and the other’s are widely scattered, the standard t-test can give misleading results. There’s a modified version called the Welch t-test that handles unequal variances, which many software tools now use by default.
  • Independence: Each observation should be unrelated to the others. One person’s measurement shouldn’t influence another’s. The exception is the paired t-test, which is specifically designed for linked observations (like the same person measured before and after a treatment).

Your data also needs to be measured on a numerical scale, meaning actual quantities like weight, temperature, or test scores. You can’t run a t-test on categories or rankings.
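
As a concrete illustration, here is a minimal sketch of an independent-samples t-test in Python using SciPy (the data is randomly generated for the example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two hypothetical groups of numerical measurements (e.g., test scores),
# drawn from normal distributions so the parametric assumptions hold.
group_a = rng.normal(loc=70.0, scale=8.0, size=40)
group_b = rng.normal(loc=75.0, scale=8.0, size=40)

# Standard independent-samples t-test (assumes equal variances).
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

A small p-value here would indicate that the difference between the two group means is unlikely to be due to chance alone.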

How to Check if Your Data Qualifies

Before running a t-test, you should verify that your data is at least approximately normal. There are both visual and statistical ways to do this.

The quickest visual check is a histogram. If it’s roughly bell-shaped and symmetric around the center, your data is likely normal enough. Q-Q plots are another useful visual tool: they plot your data against what a perfect normal distribution would look like, and deviations from a straight diagonal line suggest non-normality. Box plots can also reveal skewness or extreme outliers that would violate the assumption.
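
One way to turn the Q-Q plot idea into a number is the correlation between your data's ordered values and the theoretical normal quantiles. SciPy's `probplot` computes this fit even without drawing anything (sample data is invented here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=50.0, scale=10.0, size=200)

# probplot pairs each ordered observation with the quantile a perfect
# normal distribution would predict, then fits a straight line.
(theoretical_q, ordered_data), (slope, intercept, r) = stats.probplot(sample)

# r near 1 means the points hug the diagonal, i.e. roughly normal.
print(f"Q-Q correlation r = {r:.4f}")
```

Passing `plot=plt` (with matplotlib imported) would draw the familiar Q-Q plot from the same call.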

For a more formal check, two statistical tests are widely used. The Shapiro-Wilk test is the go-to for smaller samples (under 50) because it has the most power to detect non-normality. For samples of 50 or more, the Kolmogorov-Smirnov test works well, as do checks based on skewness and kurtosis. In both tests, a significant result (typically p < 0.05) means your data departs from a normal distribution enough to reconsider using a t-test. A rough rule of thumb: if skewness and kurtosis values both fall between -1 and +1, the distribution is approximately normal.
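
Both formal checks, plus the skewness and kurtosis rule of thumb, are available in SciPy. A hedged sketch with invented data (note that comparing against a fixed standard normal in the Kolmogorov-Smirnov test is a simplification):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
skewed = rng.exponential(scale=2.0, size=40)       # clearly non-normal
normalish = rng.normal(loc=0.0, scale=1.0, size=60)

# Shapiro-Wilk: the usual choice for smaller samples (under ~50).
w_stat, p_shapiro = stats.shapiro(skewed)
print(f"Shapiro-Wilk on skewed data: p = {p_shapiro:.4f}")

# Kolmogorov-Smirnov for larger samples. Testing against a fixed
# standard normal is a simplification; estimating the mean and SD
# from the same data ideally calls for the Lilliefors correction.
d_stat, p_ks = stats.kstest(normalish, "norm")
print(f"Kolmogorov-Smirnov: p = {p_ks:.4f}")

# Rule of thumb: skewness and (excess) kurtosis both within [-1, +1]
# suggests the distribution is approximately normal.
print(f"skew = {stats.skew(skewed):.2f}, kurtosis = {stats.kurtosis(skewed):.2f}")
```
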

What Happens When Assumptions Are Violated

The t-test is reasonably robust, meaning it can tolerate minor departures from its assumptions without producing wildly wrong results. Small deviations from normality in samples of 30 or more rarely cause problems. Mild differences in variance between groups are similarly tolerable, especially when the groups are about the same size.

Severe violations are a different story. If your data is heavily skewed, contains major outliers, or the two groups have very different spreads combined with very different sample sizes, the t-test can produce unreliable p-values. In these cases, you’re better off switching to a nonparametric alternative.

For unequal variances specifically, you don’t necessarily need to abandon the t-test entirely. The Welch t-test adjusts both the standard error calculation and the degrees of freedom to account for the mismatch. It uses a more complex formula that factors in each group’s individual variance rather than pooling them together. When both groups have the same sample size, the Welch test and the standard t-test produce the same t statistic (though the adjusted degrees of freedom can shift the p-value slightly), and Welch gives up almost nothing when variances really are equal, so many analysts simply use the Welch version by default.
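
In SciPy, switching to the Welch version is a single keyword argument. A small sketch with invented data where the two groups have very different spreads:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
tight = rng.normal(loc=100.0, scale=2.0, size=30)    # tightly clustered
spread = rng.normal(loc=100.0, scale=15.0, size=30)  # widely scattered

# Standard (pooled-variance) t-test vs. Welch's t-test.
t_pooled, p_pooled = stats.ttest_ind(tight, spread)
t_welch, p_welch = stats.ttest_ind(tight, spread, equal_var=False)

# With equal group sizes the two t statistics coincide; the adjusted
# degrees of freedom are what change the Welch p-value.
print(f"pooled: t = {t_pooled:.3f}, p = {p_pooled:.4f}")
print(f"Welch:  t = {t_welch:.3f}, p = {p_welch:.4f}")
```
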

Nonparametric Alternatives to the T-Test

When your data clearly violates the normality assumption and your sample is too small for the Central Limit Theorem to help, two nonparametric tests serve as direct replacements:

  • Mann-Whitney U test (also called the Wilcoxon-Mann-Whitney test): This replaces the independent samples t-test. Instead of comparing the means of two groups, it ranks all the data points from lowest to highest and compares the sum of ranks between groups. Because it works with ranks rather than raw values, the data doesn’t need to follow any particular distribution.
  • Wilcoxon signed-rank test: This replaces the paired t-test. It’s designed for situations where you have two linked measurements (before and after, left and right, etc.) and the differences between pairs aren’t normally distributed.
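
Both alternatives are available in SciPy. A minimal sketch with invented, deliberately skewed data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Independent groups with skewed distributions: Mann-Whitney U.
group_a = rng.exponential(scale=1.0, size=25)
group_b = rng.exponential(scale=2.0, size=25)
u_stat, p_u = stats.mannwhitneyu(group_a, group_b)
print(f"Mann-Whitney U: p = {p_u:.4f}")

# Paired before/after measurements: Wilcoxon signed-rank.
before = rng.exponential(scale=1.0, size=20)
after = before + rng.exponential(scale=0.5, size=20)  # linked pairs
w_stat, p_w = stats.wilcoxon(before, after)
print(f"Wilcoxon signed-rank: p = {p_w:.4f}")
```
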

A common claim is that parametric tests are always more powerful than nonparametric ones. In practice, the advantage is small. Simulations comparing the t-test and its nonparametric counterparts show that when all assumptions are perfectly met, the t-test squeezes out only a tiny power advantage. And when assumptions are violated, the nonparametric alternative can actually be more powerful. Since real-world data frequently violates at least one assumption, the practical difference is often negligible.
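
A quick Monte Carlo sketch (seeded, with invented simulation settings) makes the power comparison concrete, estimating how often each test rejects at the 0.05 level when a real difference exists:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps, shift = 20, 500, 0.8  # invented simulation settings

def power_pair(draw):
    """Fraction of simulated experiments in which each test rejects at 0.05."""
    t_hits = mw_hits = 0
    for _ in range(reps):
        a, b = draw(), draw() + shift
        t_hits += stats.ttest_ind(a, b).pvalue < 0.05
        mw_hits += stats.mannwhitneyu(a, b).pvalue < 0.05
    return t_hits / reps, mw_hits / reps

# Normal data: the t-test's home turf, where its edge should be largest.
t_pow, mw_pow = power_pair(lambda: rng.normal(0.0, 1.0, n))
print(f"normal data: t-test power {t_pow:.2f}, Mann-Whitney power {mw_pow:.2f}")
```

Swapping the `draw` lambda for a skewed distribution (e.g., an exponential) lets you watch the gap shrink or reverse under violated assumptions.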

Choosing the Right Test for Your Data

The decision tree is straightforward. Start by determining whether your data is numerical and whether you’re comparing two independent groups or two paired measurements. Then check normality using the Shapiro-Wilk test (for samples under 50) or the Kolmogorov-Smirnov test (for larger samples), supplemented by a visual check of your histogram or Q-Q plot.

If the normality test comes back non-significant (meaning your data looks normal enough), use the t-test. If you’re comparing two independent groups and worried about unequal variances, use the Welch t-test. If the normality test is significant (meaning your data departs from normal), use the Mann-Whitney U test for independent groups or the Wilcoxon signed-rank test for paired data.
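
The logic in the last two paragraphs can be sketched as a small helper function (the function name is invented, and the thresholds follow the rules of thumb above):

```python
from scipy import stats

def choose_test(a, b, paired=False, alpha=0.05):
    """Rule-of-thumb sketch of the decision tree above.

    Pools the data (or takes paired differences) for a single
    normality check; checking each group separately would be
    more thorough.
    """
    data = [x - y for x, y in zip(a, b)] if paired else list(a) + list(b)
    if len(data) < 50:
        p_norm = stats.shapiro(data).pvalue  # small samples
    else:
        # Simplified: kstest against a fixed standard normal assumes
        # the data is already standardized.
        p_norm = stats.kstest(data, "norm").pvalue
    if p_norm >= alpha:  # no evidence of non-normality
        return "paired t-test" if paired else "Welch t-test"
    return "Wilcoxon signed-rank" if paired else "Mann-Whitney U"
```

A visual check of the histogram or Q-Q plot should still accompany the automated decision, since a single p-value can miss outliers or heavy tails.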

With large samples, you have more flexibility. The t-test handles mild to moderate non-normality well once you’re above 50 observations per group, though strongly skewed distributions with outliers can still cause trouble even at larger sizes. When in doubt, running both the parametric and nonparametric versions is a reasonable check. If they give you the same conclusion, you can report either with confidence.