In ANOVA (analysis of variance), “sum of squares” refers to a way of measuring how spread out your data is, broken into meaningful pieces. The total variability in your dataset gets split into two parts: the variability caused by differences between your groups, and the variability left over within each group. This partitioning is the core logic of ANOVA. It lets you ask a precise question: is the spread between group averages large enough, relative to the natural spread inside the groups, to conclude that your groups are genuinely different?
How Total Variability Gets Split in Two
Imagine you’ve measured something (test scores, blood pressure, plant height) across several groups. Every individual data point sits some distance from the overall average of the entire dataset. If you take each of those distances, square them, and add them all up, you get the total sum of squares, or SS Total. It captures the entire amount of variation in your data, without caring which group any observation belongs to.
The key insight of ANOVA is that this total can be cleanly separated into two components:
- Sum of squares between groups (SS Between): How far each group’s average is from the overall average, squared, and weighted by the number of observations in that group. This measures whether the groups themselves differ from one another.
- Sum of squares within groups (SS Within): How far each individual observation is from its own group’s average, squared and summed. This measures the natural scatter of individuals inside each group, the variation that has nothing to do with which group they’re in.
The relationship is simple addition: SS Total = SS Between + SS Within. Every bit of variation in your data is accounted for by one piece or the other. There’s no overlap and nothing left out.
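The partition can be verified directly. Here is a minimal sketch in Python using made-up scores for three hypothetical groups; the group labels and numbers are invented for illustration:

```python
# Illustrative sketch: partitioning total variability for three
# hypothetical groups of scores (made-up numbers).
groups = {
    "A": [82, 85, 88, 90],
    "B": [75, 78, 80, 77],
    "C": [88, 92, 91, 89],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# SS Total: squared distance of every observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# SS Between: squared distance of each group mean from the grand mean,
# weighted by group size.
ss_between = sum(
    len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups.values()
)

# SS Within: squared distance of each observation from its own group mean.
ss_within = sum(
    (x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g
)

# The identity SS Total = SS Between + SS Within holds (up to float error).
print(abs(ss_total - (ss_between + ss_within)) < 1e-9)  # True
```

Whatever numbers you substitute, the identity holds exactly; it follows algebraically from how the three quantities are defined.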
What SS Between Tells You
SS Between captures the variation driven by group membership. If you’re comparing three teaching methods and the average scores for each method are very different from the overall average, SS Between will be large. If all three methods produce nearly identical averages, SS Between will be close to zero.
Think of it this way: SS Between answers, “How much of the total spread in my data can I attribute to the groups being different?” A large SS Between, relative to the total, suggests the grouping variable (teaching method, drug dosage, fertilizer type) is doing something meaningful.
What SS Within Tells You
SS Within (also called SS Error) captures everything that SS Between doesn’t. It’s the variation among individuals who are in the same group. Even people given the same teaching method will score differently because of personal ability, motivation, luck on test day, and countless other factors. All of that individual-level noise lands in SS Within.
If every person in a group scored identically, SS Within would be zero. In practice, it’s never zero. The size of SS Within sets the baseline for how much random variation you should expect. This is what makes it so important: it’s the yardstick against which ANOVA judges whether the between-group differences are impressive or unremarkable.
From Sum of Squares to the F-Statistic
Sum of squares values on their own aren’t directly comparable because they depend on how many groups you have and how many observations are in each group. To make a fair comparison, ANOVA divides each sum of squares by its degrees of freedom to produce a “mean square.”
For SS Between, the degrees of freedom equal the number of groups minus one (k – 1). For SS Within, the degrees of freedom equal the total number of observations minus the number of groups (N – k). Dividing gives you Mean Square Between and Mean Square Within.
The F-statistic is simply the ratio of these two: Mean Square Between divided by Mean Square Within. If the groups are truly no different from each other, you’d expect this ratio to hover around 1, because both mean squares would be estimating the same underlying variability. When F climbs well above 1, it suggests the between-group differences are too large to chalk up to random noise. ANOVA then converts this F value into a p-value to tell you how unlikely such a result would be if the groups were actually identical.
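The arithmetic is short enough to sketch directly. The sums of squares, group count, and sample size below are hypothetical values chosen to keep the numbers round:

```python
# Sketch: turning sums of squares into an F-statistic.
# Hypothetical values for k = 3 groups and N = 30 observations.
ss_between, ss_within = 520.0, 1350.0
k, N = 3, 30

df_between = k - 1            # 2
df_within = N - k             # 27

ms_between = ss_between / df_between   # 260.0
ms_within = ss_within / df_within      # 50.0
f_stat = ms_between / ms_within

print(f_stat)  # 5.2
```

An F of 5.2 is well above 1, so in this toy case the between-group variability clearly outpaces the within-group baseline; the p-value would then quantify how surprising that ratio is under the no-difference assumption.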
In a published example comparing dental materials across three groups, the between-group sum of squares was 273.9 and the within-group sum of squares was 3,282.8. With 2 and 87 degrees of freedom respectively, the resulting mean squares produced an F-statistic of 3.63 and a p-value of 0.031, enough to conclude the materials performed differently.
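You can re-derive the F-statistic in that example from the reported sums of squares alone:

```python
# Re-deriving the F-statistic from the published sums of squares.
ss_between, ss_within = 273.9, 3282.8
df_between, df_within = 2, 87

ms_between = ss_between / df_between   # 136.95
ms_within = ss_within / df_within      # ~37.73
f_stat = ms_between / ms_within

print(round(f_stat, 2))  # 3.63
# The reported p-value of 0.031 comes from the F distribution with
# (2, 87) degrees of freedom, e.g. via scipy.stats.f.sf(f_stat, 2, 87).
```

Reproducing the F-statistic this way is a useful sanity check when reading any published ANOVA table.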
Measuring How Much Your Groups Explain
Once you have your sums of squares, you can calculate how much of the total variation your grouping variable accounts for. The most common measure is eta-squared (η²), which is simply SS Between divided by SS Total. The result is a proportion between 0 and 1, interpreted the same way as R² in regression.
For example, in a study comparing material types, SS Between was 1,997.0 and SS Total was 5,863.7. Dividing gives an eta-squared of 0.341, meaning the material variable explained 34.1% of the variability in the outcome. The remaining 65.9% was within-group variation, driven by factors the model didn’t capture. This kind of calculation turns an abstract sum of squares into a practical statement about how important your grouping variable really is.
Reading an ANOVA Table
Most statistical software outputs an ANOVA table with a standard layout. Knowing what each column means makes the results much easier to interpret:
- Source: Lists the rows, typically “Between” (or “Treatment”) and “Within” (or “Error”), plus sometimes a “Total” row.
- SS: The sum of squares for each source.
- df: Degrees of freedom. For between, it’s k – 1. For within, it’s N – k.
- MS: Mean square, which is SS divided by df.
- F: The ratio of Mean Square Between to Mean Square Within.
- p-value: The probability of seeing an F this large if there were truly no group differences.
When you look at this table, the sum of squares column is where the story starts. A large proportion of SS sitting in the “Between” row relative to the “Total” row means your groups explain a meaningful chunk of the data’s variability. A small proportion means the group differences are minor compared to the noise within groups. The F-statistic and p-value then formalize whether that proportion is statistically significant, but the sums of squares give you the raw material to judge the practical size of the effect yourself.
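Putting all of the above together, the table can be assembled by hand from raw data. The sketch below uses three made-up treatment groups and prints the standard Source / SS / df / MS / F layout described above:

```python
# Sketch: assembling a one-way ANOVA table from made-up data,
# mirroring the Source / SS / df / MS / F columns described above.
groups = [
    [23.0, 25.0, 21.0, 24.0],   # hypothetical treatment A
    [30.0, 28.0, 31.0, 29.0],   # hypothetical treatment B
    [26.0, 27.0, 25.0, 28.0],   # hypothetical treatment C
]

all_obs = [x for g in groups for x in g]
N, k = len(all_obs), len(groups)
grand_mean = sum(all_obs) / N

ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
ss_total = ss_between + ss_within

df_between, df_within = k - 1, N - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"{'Source':<8}{'SS':>10}{'df':>5}{'MS':>10}{'F':>8}")
print(f"{'Between':<8}{ss_between:>10.2f}{df_between:>5}{ms_between:>10.2f}{f_stat:>8.2f}")
print(f"{'Within':<8}{ss_within:>10.2f}{df_within:>5}{ms_within:>10.2f}")
print(f"{'Total':<8}{ss_total:>10.2f}{df_between + df_within:>5}")
```

A p-value column would normally follow the F column; computing it requires the F distribution (statistical software, or a library such as SciPy, handles that step).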