What Is Levene’s Test? Equality of Variances Explained

Levene’s test is a statistical test that checks whether two or more groups have equal variances. It answers a simple but important question: is the spread of data similar across your groups? This matters because many common tests, especially ANOVA and the independent-samples t-test, assume equal variances before they can produce reliable results. Levene’s test is the standard way to verify that assumption before you proceed.

Why Equal Variances Matter

Variance is a measure of how spread out values are within a group. When you compare group averages using ANOVA or a t-test, those tests pool the variances from each group into a single estimate of variability. If one group’s data is tightly clustered while another’s is widely scattered, that pooled estimate becomes unreliable, and so do your results.

This assumption, called homogeneity of variance, doesn’t require that your groups have identical spreads. Small differences are fine. The concern is when variances differ enough to distort the comparison of means. Levene’s test gives you a formal, objective way to check rather than eyeballing your data and guessing.

How the Test Works

Levene’s test reframes a question about variances into a question about means, which makes it cleverer than it first appears. For each data point, the test calculates how far that point sits from its group’s center (typically the group mean or median). These distances are called absolute deviations. The test then runs what is essentially a one-way ANOVA on those deviation scores, asking: do the groups differ in their average distance from center?

If one group has a large average deviation, its data is more spread out. If all groups have similar average deviations, their variances are roughly equal. By converting the problem into a comparison of means, Levene’s test can use the well-understood F-distribution to produce a p-value.
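To make the mechanics concrete, here is a minimal sketch in Python using made-up data. It computes the absolute deviations by hand, runs a one-way ANOVA on them, and confirms that this matches what scipy.stats.levene() reports (the group names and data are invented for illustration):

```python
import numpy as np
from scipy import stats

# Made-up example data: two similar groups and one with a wider spread
rng = np.random.default_rng(0)
a = rng.normal(50, 5, 30)
b = rng.normal(50, 5, 30)
c = rng.normal(50, 12, 30)

# Step 1: absolute deviations from each group's center (the median here,
# matching the Brown-Forsythe variant that SciPy uses by default)
devs = [np.abs(g - np.median(g)) for g in (a, b, c)]

# Step 2: a one-way ANOVA on those deviation scores
f_manual, p_manual = stats.f_oneway(*devs)

# SciPy's levene() performs the same computation in one call
f_scipy, p_scipy = stats.levene(a, b, c)
```

The two F statistics agree, which is the point: Levene's test really is just an ANOVA on deviation scores.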

The Null and Alternative Hypotheses

The null hypothesis states that all group variances are equal. The alternative hypothesis states that at least one pair of groups has unequal variances. Note that a significant result doesn’t tell you which specific groups differ, only that the assumption of equal variances is violated somewhere.

Interpreting the P-Value

If the p-value is above your significance threshold (typically 0.05), you fail to reject the null hypothesis. This means you have insufficient evidence that variances differ, and you can proceed with tests that assume equal variances, like a standard ANOVA.

If the p-value is below 0.05, the variances are significantly unequal. At that point, you have a few options. For a t-test, most software automatically offers Welch’s t-test, which doesn’t require equal variances. For ANOVA, you can switch to Welch’s ANOVA or use a nonparametric alternative like the Kruskal-Wallis test. The key is not to ignore the result and run your analysis as originally planned.
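That decision flow can be sketched in Python with SciPy. The data here is invented, with one group deliberately given a much wider spread:

```python
import numpy as np
from scipy import stats

# Hypothetical data: the second group has four times the spread
rng = np.random.default_rng(1)
g1 = rng.normal(100, 5, 40)
g2 = rng.normal(104, 20, 40)

# Check the equal-variance assumption first
_, p_levene = stats.levene(g1, g2)

if p_levene < 0.05:
    # Welch's t-test: does not assume equal variances
    t_stat, p_val = stats.ttest_ind(g1, g2, equal_var=False)
else:
    # Standard pooled-variance t-test
    t_stat, p_val = stats.ttest_ind(g1, g2)
```

For three or more groups, stats.kruskal() plays the analogous fallback role.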

Three Versions of the Test

Levene’s original 1960 version calculates each data point’s distance from its group mean. This works well when your data follows a normal distribution, but it can give misleading results when the data is skewed or has heavy tails.

The Brown-Forsythe variation, introduced in 1974, uses the group median instead of the mean. Because the median is less sensitive to extreme values, this version performs better with skewed distributions. It’s the most commonly recommended version for general use, and many software packages now default to it.

A third variation uses a trimmed mean (cutting off a percentage of extreme values from each end before calculating the center). This offers a middle ground between the mean and median approaches. In practice, though, the median-based Brown-Forsythe version covers most situations well enough that the trimmed-mean variant sees limited use.
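All three versions are available through a single SciPy call by changing the center parameter. A minimal sketch, using invented skewed data where the choice of center can actually matter:

```python
import numpy as np
from scipy import stats

# Skewed (exponential) data with different scales -- invented for illustration
rng = np.random.default_rng(2)
a = rng.exponential(1.0, 50)
b = rng.exponential(2.5, 50)

# Original (1960) mean-centered version
w_mean, p_mean = stats.levene(a, b, center='mean')

# Brown-Forsythe (1974) median-centered version, SciPy's default
w_median, p_median = stats.levene(a, b, center='median')

# Trimmed-mean version, here trimming 10% from each tail
w_trim, p_trim = stats.levene(a, b, center='trimmed', proportiontocut=0.1)
```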

Levene’s Test vs. Bartlett’s Test

Bartlett’s test is the other widely used option for checking equal variances. The critical difference is sensitivity to non-normal data. Bartlett’s test assumes your data is normally distributed. If it isn’t, Bartlett’s will often reject the null hypothesis even when variances are actually equal, giving you a false alarm.

Levene’s test is far more robust to departures from normality. Because it works with absolute deviations rather than raw variance calculations, it tolerates skewed data, heavy-tailed data, and other non-normal shapes without losing accuracy. This robustness is the main reason Levene’s test has become the default choice in most applied research. If you know your data is normally distributed, Bartlett’s test has slightly more statistical power. If you’re unsure, or if your data is clearly non-normal, Levene’s test (or specifically the Brown-Forsythe median variant) is the safer choice.
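You can see the contrast directly by running both tests on the same non-normal data. In this sketch, both samples come from the same heavy-tailed distribution, so their true variances are equal; Bartlett's test is the one at risk of a false alarm here (the data is simulated for illustration, and any single run may or may not show a rejection):

```python
import numpy as np
from scipy import stats

# Heavy-tailed (Student's t, 3 df) samples from the SAME distribution,
# so the true variances are equal
rng = np.random.default_rng(3)
a = rng.standard_t(df=3, size=200)
b = rng.standard_t(df=3, size=200)

# Bartlett's test assumes normality; heavy tails can produce false alarms
bart_stat, bart_p = stats.bartlett(a, b)

# Levene's test (median-centered by default) tolerates the heavy tails
lev_stat, lev_p = stats.levene(a, b)
```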

When You’ll Encounter It

The most common scenario is as a preliminary check before ANOVA. Say you’re comparing test scores across three teaching methods. Before running a one-way ANOVA to compare the group means, you’d run Levene’s test to confirm the score variability is similar across all three groups. If it is, you proceed with ANOVA. If not, you adjust your approach.
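The teaching-methods workflow might look like this in Python. The scores are invented, and the Kruskal-Wallis fallback is one reasonable choice rather than the only one:

```python
import numpy as np
from scipy import stats

# Invented test scores under three teaching methods
rng = np.random.default_rng(4)
method_a = rng.normal(72, 8, 35)
method_b = rng.normal(75, 8, 35)
method_c = rng.normal(78, 8, 35)

# Step 1: check homogeneity of variance
_, p_levene = stats.levene(method_a, method_b, method_c)

# Step 2: pick the comparison of means accordingly
if p_levene >= 0.05:
    stat, p = stats.f_oneway(method_a, method_b, method_c)
else:
    # Kruskal-Wallis as one fallback; Welch's ANOVA (e.g., via the
    # pingouin package) is another
    stat, p = stats.kruskal(method_a, method_b, method_c)
```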

You’ll also see it before independent-samples t-tests. SPSS, for example, automatically includes Levene’s test in its t-test output and presents two rows of results: one assuming equal variances and one not assuming them. The Levene’s test result tells you which row to report.

In clinical trial research, verifying equal variances across treatment groups is a routine step before comparing outcomes. Any time you’re comparing means across groups and using a method that pools variance estimates, checking homogeneity of variance with Levene’s test is standard practice.

Running It in Software

In Python, the function scipy.stats.levene() takes your group arrays as arguments and returns the test statistic and p-value. By default, SciPy uses the median-based (Brown-Forsythe) variant; you can switch versions by setting the center parameter to 'mean' or 'trimmed'.
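A quick illustration of that call, with made-up group data:

```python
from scipy import stats

# Hypothetical scores for three groups
group1 = [82, 85, 88, 75, 79, 90, 84]
group2 = [70, 95, 60, 88, 99, 72, 81]
group3 = [84, 83, 86, 85, 82, 87, 84]

# Brown-Forsythe by default (center='median')
stat, p = stats.levene(group1, group2, group3)
print(f"W = {stat:.3f}, p = {p:.3f}")
```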

In R, the leveneTest() function from the car package is the standard tool. It also defaults to using the median. You pass it a formula specifying your response variable and grouping variable.

In SPSS, Levene’s test is built into the one-way ANOVA and independent-samples t-test procedures. For ANOVA, you’ll find it under the “Homogeneity of variance test” option. For the t-test, it appears automatically in the output. No additional setup is needed.

Common Mistakes to Avoid

The biggest mistake is treating a non-significant Levene’s test as proof that variances are equal. A non-significant result only means you lack evidence of a difference. With small sample sizes, Levene’s test has limited power to detect real variance differences, so a passing result with 10 observations per group is far less reassuring than one with 100 per group.

Another common error is running Levene’s test on data with very small groups. The test needs a reasonable amount of data in each group to work reliably. With fewer than about 10 observations per group, the test may fail to detect even large variance differences.
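A small simulation makes the power problem tangible. This sketch (with invented parameters: a true twofold difference in standard deviation) estimates how often Levene's test detects the difference at two sample sizes:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def rejection_rate(n, n_sims=2000):
    """Fraction of simulations in which Levene's test flags a real
    2x difference in standard deviation with n observations per group."""
    hits = 0
    for _ in range(n_sims):
        a = rng.normal(0, 1, n)
        b = rng.normal(0, 2, n)  # twice the spread
        if stats.levene(a, b).pvalue < 0.05:
            hits += 1
    return hits / n_sims

small = rejection_rate(10)   # often misses the difference
large = rejection_rate(100)  # nearly always detects it
```

With 10 observations per group the test misses this sizable difference much of the time; with 100 per group it catches it almost every time.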

Finally, some researchers run Levene’s test and then ignore a significant result because it complicates their analysis plan. If the test tells you variances are unequal, that information matters. Using a method that assumes equal variances anyway can inflate your false positive rate, meaning you might claim a difference between groups that doesn’t actually exist.