What Is Levene’s Test for Equality of Variances?

Levene’s test is a statistical test that checks whether two or more groups have equal variances. It answers a simple question: is the spread of data in each group roughly the same, or is one group much more variable than another? This matters because many common statistical methods, including t-tests and ANOVA, assume that group variances are equal. Running Levene’s test first tells you whether that assumption holds.

Why Equal Variances Matter

Many statistical tests compare group means. A t-test might compare average test scores between two classrooms, and an ANOVA might compare blood pressure across three treatment groups. These tests assume that even if the group averages differ, the spread of values within each group is similar. This property is called homogeneity of variances (or homoscedasticity).

When variances are unequal, the p-values from a standard t-test or ANOVA can be misleading. You might conclude there’s a real difference between groups when there isn’t one, or miss a difference that’s actually there. Levene’s test gives you a way to check this assumption before running your main analysis, so you can adjust your approach if needed.

How the Test Works

The core idea behind Levene’s test is straightforward. For each data point, the test calculates how far that point falls from the center of its group. It then runs what is essentially a one-way ANOVA on those distances. If one group’s data points tend to be farther from their group center than another group’s, the variances are likely unequal.

The test produces an F statistic with degrees of freedom based on the number of groups and the total sample size. You compare this F value to an F distribution to get a p-value. The hypotheses are:

  • Null hypothesis: All groups have equal variances.
  • Alternative hypothesis: At least one group has a variance different from the others.
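The mechanics described above can be verified directly: computing each point's absolute distance from its group mean and running a one-way ANOVA on those distances reproduces the F statistic that scipy's built-in function reports. This is a minimal sketch using made-up data; the group values are purely illustrative.

```python
import numpy as np
from scipy import stats

# Two hypothetical groups of measurements (made-up data for illustration)
group_a = np.array([12.1, 11.8, 12.5, 13.0, 11.6, 12.3, 12.9, 12.0])
group_b = np.array([10.2, 14.1, 9.8, 13.7, 15.2, 8.9, 14.8, 10.5])

# Step 1: absolute distance of each point from its group center (the mean,
# matching the original form of Levene's test)
z_a = np.abs(group_a - group_a.mean())
z_b = np.abs(group_b - group_b.mean())

# Step 2: a one-way ANOVA on those distances yields Levene's F statistic
f_manual, p_manual = stats.f_oneway(z_a, z_b)

# scipy's built-in version; center='mean' matches the original test
f_scipy, p_scipy = stats.levene(group_a, group_b, center='mean')
```

The two F values agree to floating-point precision, which makes the "ANOVA on distances" description concrete rather than metaphorical.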

Interpreting the Results

The output of Levene’s test gives you an F statistic, degrees of freedom, and a p-value. If the p-value is above your significance threshold (typically 0.05), you fail to reject the null hypothesis, meaning there’s no strong evidence that the variances differ. You can proceed with your planned t-test or ANOVA as usual.

If the p-value is below 0.05, the test is flagging that group variances are significantly different. In that case, you’d switch to a method that doesn’t assume equal variances. For a two-sample comparison, that means using Welch’s t-test instead of Student’s t-test. For an ANOVA, you could use Welch’s ANOVA or the Brown-Forsythe F-test.

A typical way to report the result in academic writing looks like this: “Levene’s test indicated that the variances were homogeneous, F(1, 14) = 0.47, p = .506.” The numbers in parentheses are the degrees of freedom, followed by the F value and p-value.
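The check-then-choose workflow described above can be sketched in a few lines. This example uses simulated data with deliberately different spreads, so the variable names and parameters are illustrative, not a prescription.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical samples: same mean, very different spreads
scores_1 = rng.normal(loc=50, scale=5, size=30)
scores_2 = rng.normal(loc=50, scale=15, size=30)

# Preliminary check of the equal-variance assumption
f_stat, p_levene = stats.levene(scores_1, scores_2)

if p_levene < 0.05:
    # Evidence of unequal variances: use Welch's t-test
    t_stat, p_value = stats.ttest_ind(scores_1, scores_2, equal_var=False)
else:
    # No strong evidence against equal variances: Student's t-test is fine
    t_stat, p_value = stats.ttest_ind(scores_1, scores_2, equal_var=True)
```

With a threefold difference in standard deviation and 30 observations per group, the Levene p-value comes out well below 0.05 and the Welch branch is taken.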

Three Versions of the Test

Levene’s original test measures each data point’s distance from its group mean. But there are two common variations that use different measures of center, and the choice matters depending on your data’s shape.

  • Mean-based (original Levene’s): Uses each group’s mean as the center point. Works well when your data are approximately normally distributed.
  • Median-based (Brown-Forsythe): Uses each group’s median instead. This version is more robust when data are skewed or have heavy tails, because the median is less affected by extreme values.
  • Trimmed mean-based: Uses a 10% trimmed mean, which drops the highest and lowest 10% of values before averaging. This offers a middle ground between the mean and median approaches.

Most statistical software defaults to either the mean-based or median-based version. SPSS uses the mean-based version by default, while R’s leveneTest() function in the car package defaults to the median. If you’re unsure which to pick and your data might not be perfectly normal, the median-based version is the safer choice.
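In scipy, all three variants are available through the `center` argument of `stats.levene`; the trimmed version additionally takes a `proportiontocut` argument. The skewed example data below are simulated and purely illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Skewed hypothetical data: exponential draws with different scales
group_a = rng.exponential(scale=1.0, size=40)
group_b = rng.exponential(scale=2.0, size=40)

# Original Levene's test: deviations from each group's mean
f_mean, p_mean = stats.levene(group_a, group_b, center='mean')

# Brown-Forsythe variant: deviations from each group's median
f_median, p_median = stats.levene(group_a, group_b, center='median')

# Trimmed-mean variant: drop 10% from each tail before averaging
f_trim, p_trim = stats.levene(group_a, group_b, center='trimmed',
                              proportiontocut=0.1)
```

Note that scipy's own default is `center='median'`, consistent with the advice that the median-based version is the safer choice for possibly non-normal data.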

Levene’s Test vs. Bartlett’s Test

Bartlett’s test checks the same thing as Levene’s test: whether group variances are equal. The key difference is sensitivity to non-normal data. Bartlett’s test performs well when your data are truly normally distributed, but it falls apart quickly when they’re not. Simulation studies show that with skewed distributions, Bartlett’s test produces false positive rates near 10% across most sample sizes, meaning it incorrectly flags variance differences about twice as often as it should at a 5% significance level.

Levene’s test is far more forgiving. Even with skewed or heavy-tailed data, its error rates stay closer to the expected level, around 6% at high sample sizes. This robustness is the main reason Levene’s test has become the default choice in most applied research. If you know your data are normally distributed, Bartlett’s test is a valid (and slightly more powerful) option. If there’s any doubt about normality, Levene’s test is the better pick.
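The robustness difference is easy to demonstrate with a small simulation: draw skewed groups whose variances are truly equal, so every rejection is a false positive, and count how often each test rejects. This sketch uses exponential data and a modest number of replications; the exact rates depend on the distribution and sample size chosen, so treat the setup as illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, alpha = 2000, 0.05
bartlett_fp = levene_fp = 0

for _ in range(n_sims):
    # Two skewed groups with EQUAL variances: any rejection is a false positive
    a = rng.exponential(scale=1.0, size=25)
    b = rng.exponential(scale=1.0, size=25)
    if stats.bartlett(a, b).pvalue < alpha:
        bartlett_fp += 1
    if stats.levene(a, b, center='median').pvalue < alpha:
        levene_fp += 1

print("Bartlett false positive rate:", bartlett_fp / n_sims)
print("Levene false positive rate:  ", levene_fp / n_sims)
```

Bartlett's rate lands well above the nominal 5% on this skewed distribution, while the median-based Levene's test stays close to it.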

When to Use Levene’s Test

You’ll most commonly run Levene’s test as a preliminary step before an independent-samples t-test or a one-way ANOVA. It’s appropriate whenever you’re comparing a continuous outcome across two or more independent groups and your planned analysis assumes equal variances.

There are a few situations where you can skip it. If you’re already planning to use Welch’s t-test or Welch’s ANOVA, those methods don’t assume equal variances, so testing for it is unnecessary. Some statisticians argue you should always use Welch’s versions by default and skip the preliminary test altogether, since the Welch correction has minimal cost when variances happen to be equal but protects you when they’re not.
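The skip-the-pretest approach amounts to a one-argument change: with `equal_var=False`, scipy's `ttest_ind` performs Welch's t-test and no preliminary Levene's test is needed. The simulated data below, with equal true variances, also illustrate the "minimal cost" point, since the two tests give nearly identical answers in that case.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical groups with equal true variances
a = rng.normal(loc=100, scale=10, size=25)
b = rng.normal(loc=105, scale=10, size=25)

# Welch's t-test: no equal-variance assumption, no pretest required
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)

# Student's t-test for comparison
t_student, p_student = stats.ttest_ind(a, b, equal_var=True)
```

With equal group sizes the two t statistics coincide exactly; only the degrees of freedom differ slightly, so the p-values are almost identical when variances happen to be equal.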

Levene’s test also requires that observations within each group are independent of one another. It does not require perfectly normal data (especially the median-based version), but it does assume that each data point is a separate, unrelated measurement. Repeated measures from the same participants, for example, would violate this assumption.