What Is the Difference Between r and R-Squared?

The correlation coefficient (r) measures the direction and strength of a linear relationship between two variables, while r-squared tells you what proportion of one variable’s variation is explained by the other. They’re related mathematically (in simple linear regression, r-squared is literally r multiplied by itself), but they answer different questions and can lead to very different interpretations of the same data.

What r Tells You

The correlation coefficient, r, is a single number between -1 and +1 that captures two things at once: how strong a linear relationship is and which direction it goes. An r of +1 means a perfect positive relationship (as one variable increases, the other increases in lockstep). An r of -1 means a perfect negative relationship (as one goes up, the other goes down proportionally). An r of 0 means no linear relationship at all.

The sign matters. If you’re looking at the relationship between hours of exercise and resting heart rate, you’d expect a negative r, because more exercise tends to correspond with a lower heart rate. If you’re looking at height and weight, you’d expect a positive r. The farther r sits from zero in either direction, the tighter the data points cluster around a straight line.
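To make the calculation concrete, here is a minimal pure-Python sketch of the Pearson correlation coefficient, built directly from means and deviations (the function and variable names are my own, chosen for illustration):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance term: how the two variables move together
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Spread of each variable on its own
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# A perfectly linear negative relationship gives r = -1
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0
```

Dividing the covariance by both spreads is what pins r between -1 and +1 regardless of the variables’ units.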

In practice, perfect correlations almost never appear in real data. A commonly used rule of thumb for interpreting r values looks like this:

  • 0.00 to 0.30 (or 0 to -0.30): Negligible correlation
  • 0.30 to 0.50 (or -0.30 to -0.50): Low correlation
  • 0.50 to 0.70 (or -0.50 to -0.70): Moderate correlation
  • 0.70 to 0.90 (or -0.70 to -0.90): High correlation
  • 0.90 to 1.00 (or -0.90 to -1.00): Very high correlation

These thresholds aren’t rigid laws. In some fields, an r of 0.30 is considered meaningful. A large study in aging research found that the typical “medium” effect size for individual differences was actually around r = 0.20, well below the classic benchmark of 0.30 proposed by the statistician Jacob Cohen. Context determines whether a given r value is impressive or trivial.
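The rule of thumb above is simple enough to encode as a lookup. The helper below is an illustrative sketch, not a standard function; since the table’s bands share their boundary values, I’ve made the assumption of assigning each boundary to the higher band:

```python
def describe_r(r):
    """Label a correlation coefficient using the rule-of-thumb bands above.

    Boundary values (0.30, 0.50, 0.70, 0.90) are assigned to the higher
    band; this is an arbitrary choice, since the table overlaps at edges.
    """
    a = abs(r)  # the label depends only on magnitude, not direction
    if a < 0.30:
        return "negligible"
    if a < 0.50:
        return "low"
    if a < 0.70:
        return "moderate"
    if a < 0.90:
        return "high"
    return "very high"

print(describe_r(-0.62))  # moderate
print(describe_r(0.95))   # very high
```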

What R-Squared Tells You

R-squared (r²) is the coefficient of determination. It tells you the proportion of the variation in one variable that can be explained by the other. While r describes the relationship’s strength and direction, r-squared strips away the direction and answers a more specific question: how much of the outcome’s variability does this predictor actually account for?

The interpretation is straightforward. If r² = 0.64, that means 64% of the variation in your outcome variable is explained by the predictor. The remaining 36% comes from other factors the model doesn’t capture. You can always convert r² into a percentage by multiplying by 100, which makes it one of the more intuitive statistics to communicate.

Because r-squared is a squared value, it’s always between 0 and 1. It can never be negative. This is one of the key differences from r: you lose the information about direction. An r of -0.80 and an r of +0.80 both produce an r-squared of 0.64. If you only see r-squared reported, you can’t tell whether the relationship is positive or negative without additional context.
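The direction loss is easy to see by squaring both signs of the same magnitude:

```python
# An r of +0.80 and an r of -0.80 explain the same share of variance:
for r in (0.80, -0.80):
    print(f"r = {r:+.2f}  ->  r^2 = {r * r:.2f}")  # both give r^2 = 0.64
```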

The Math Connecting Them

In simple linear regression (one predictor, one outcome), the relationship is exactly what it sounds like: r-squared equals r times r. If the correlation between two variables is r = 0.70, then r² = 0.49, meaning about 49% of the variation in the outcome is explained by the predictor. If r = -0.843, then r² = 0.71, or 71%.

This squaring has a compressing effect that catches people off guard. A correlation of r = 0.50, which sounds moderate, produces an r² of only 0.25. That means the predictor explains just 25% of the variation. Three-quarters of what’s happening remains unexplained. Conversely, to get an r² of 0.90 (explaining 90% of the variance), you need an r of about 0.95, which is extraordinarily rare outside of physics or engineering data.

This compression is worth internalizing, because it changes how you evaluate claims. A study reporting r = 0.40 might sound like a solid relationship. Squared, that’s r² = 0.16, meaning only 16% of the variation is explained. Whether that’s meaningful depends on the field, but it’s a much more modest claim than r = 0.40 might initially suggest.

When Each One Is More Useful

Use r when you want to describe the direction and strength of a relationship. It’s the natural choice when comparing two variables and you care about whether the association is positive or negative. Medical research, for example, commonly reports r values when exploring whether two measurements are related, such as whether maternal age correlates with number of pregnancies (one study found r = 0.80, a high positive correlation) or whether hemoglobin levels relate to parity (r = 0.20 to 0.30, negligible to low).

Use r-squared when you want to communicate how well a model predicts or explains an outcome. It’s more common in regression analysis, where the goal is to quantify how much of the variation in a result your predictor captures. R-squared is also easier for non-technical audiences to grasp, because “this model explains 45% of the variation” is a more concrete statement than “the correlation is 0.67.”

Adjusted R-Squared in Multiple Regression

The simple relationship between r and r-squared applies cleanly when there’s a single predictor. Once you add more predictors to a regression model, standard r-squared (sometimes called “multiple R-squared”) has a built-in problem: it never decreases, and in practice almost always increases, when you add another variable, even if that variable is essentially random noise. A predictor will always explain some tiny portion of the variance by chance alone, so more predictors automatically inflate r-squared.

Adjusted r-squared corrects for this by penalizing the addition of predictors that don’t meaningfully improve the model. If you see a large gap between multiple r-squared and adjusted r-squared, that’s a signal the model may be overfitting, meaning it’s capturing noise in the specific dataset rather than real patterns that would hold up in new data. When evaluating models with several predictors, adjusted r-squared gives a more honest picture of how well the model actually performs.
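The standard adjustment is 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. A minimal sketch, with example numbers invented purely for illustration:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same raw R^2 of 0.50 on 30 observations: the penalty grows with
# the number of predictors, so the adjusted value drops.
print(adjusted_r2(0.50, 30, 1))   # ~0.48 with one predictor
print(adjusted_r2(0.50, 30, 10))  # ~0.24 with ten predictors
```

The second model “explains” the same share of variance, but the adjustment reveals that much of that could be chance given how many predictors it used.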

Where Both Metrics Can Mislead

Both r and r-squared assume a linear relationship. If two variables have a strong curved relationship (imagine a U-shape), r can be close to zero even though the variables are clearly related. R-squared will similarly understate the connection. Whenever you calculate either value, it’s worth plotting the data first to check whether a straight line is a reasonable fit.

Outliers are another vulnerability. Even a single extreme data point can dramatically shift the results of a least-squares calculation, which is the math underlying both r and r-squared. One study demonstrated that excluding outliers changed a correlation from 0.2 (negligible) to 0.3 (low positive), enough to alter the statistical conclusion. In datasets with extreme values, both metrics can paint a misleading picture of the typical relationship between variables.

Finally, neither r nor r-squared tells you anything about causation. A high correlation between ice cream sales and drowning deaths doesn’t mean ice cream causes drowning. Both increase in summer. This point gets repeated often, but it’s especially important with r-squared, because saying “X explains 60% of the variance in Y” sounds causal even when it isn’t.

Quick Reference

  • Range: r goes from -1 to +1. R-squared goes from 0 to 1.
  • Direction: r tells you whether the relationship is positive or negative. R-squared does not.
  • Interpretation: r describes strength and direction. R-squared describes the percentage of variance explained.
  • Conversion: In simple linear regression, r-squared = r × r.
  • Scale effect: Squaring compresses values. An r of 0.50 becomes an r-squared of only 0.25.
  • Multiple predictors: Use adjusted r-squared instead of standard r-squared to avoid inflated results.