The correlation coefficient r is a single number that tells you how strongly two variables are linearly related and in which direction. It ranges from -1.0 to +1.0, where 0 means no linear relationship at all and values closer to -1 or +1 indicate a tighter, more predictable pattern between the two variables. Understanding what your specific r value actually means requires looking at its sign, its size, and the context of your data.
What the Sign Tells You
The sign of r reveals the direction of the relationship. A positive r means that as one variable increases, the other tends to increase too. Think of height and weight: taller people generally weigh more, so that relationship produces a positive r. A negative r means the variables move in opposite directions. As one goes up, the other tends to go down. The number of hours you spend studying and the number of errors on a test would typically show a negative correlation.
The sign alone doesn’t say anything about how strong the relationship is. An r of -0.8 represents a stronger relationship than an r of +0.3, even though the first is negative. Strength comes from how close the value is to -1 or +1, regardless of direction.
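You can see both sign and strength directly by computing r. Here is a minimal sketch in pure Python with a hand-rolled `pearson_r` helper and made-up study data (in practice you would use `scipy.stats.pearsonr` or `numpy.corrcoef`):

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

study_hours = [1, 2, 3, 4, 5]
test_errors = [9, 7, 6, 4, 2]        # errors drop as study time rises
r = pearson_r(study_hours, test_errors)
print(round(r, 3))                   # negative and close to -1
```

Note that swapping the two variables leaves r unchanged: correlation is symmetric and says nothing about which variable drives which.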
How Strong Is Your Correlation?
The most widely used guidelines for interpreting the size of r come from Jacob Cohen’s 1988 benchmarks, which are still the standard across social and behavioral sciences:
- Small: r = 0.10. There’s a relationship, but it’s slight. You’d have a hard time spotting it by eye in a scatterplot.
- Medium: r = 0.30. A moderate relationship that’s noticeable and often meaningful in practice.
- Large: r = 0.50. A strong relationship where the pattern between variables is clearly visible.
These thresholds apply to the absolute value of r, so they work the same way for negative correlations. An r of -0.45 would fall between medium and large. Keep in mind that these are conventions, not laws. In some fields, like physics or engineering, researchers routinely see correlations above 0.90. In psychology or education, an r of 0.30 can be a genuinely important finding. What counts as “strong” depends on what you’re measuring and how much noise is typical in that kind of data.
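The benchmarks above are easy to encode. This sketch treats each threshold as a lower bound, which is a common convention rather than part of Cohen's original text, and applies it to the absolute value so negative correlations are handled the same way:

```python
def cohen_strength(r):
    """Label |r| using Cohen's (1988) small/medium/large benchmarks."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"

print(cohen_strength(-0.45))   # medium: between the 0.30 and 0.50 benchmarks
print(cohen_strength(0.72))    # large
```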
R-Squared: The Percentage That Matters
One of the most useful things you can do with r is square it. The result, called R-squared or the coefficient of determination, tells you the proportion of variance in one variable that is predictable from the other. It converts an abstract number into something concrete.
If r = 0.50, then R-squared = 0.25. That means 25% of the variation in one variable can be explained by its relationship with the other. The remaining 75% is driven by other factors you haven’t measured. This is where many people get a reality check. A correlation that sounds impressive at r = 0.50 actually accounts for only a quarter of what’s going on. An r of 0.30 gives you an R-squared of just 0.09, meaning roughly 9% of the variance is shared between the two variables.
R-squared ranges from 0 to 1 (when derived from a simple correlation). A value of 0 means the relationship explains none of the variability in the data. A value of 1 would mean perfect prediction, which almost never happens with real-world measurements. Squaring r is a quick way to keep your interpretation grounded, because it reveals how much of the picture your correlation actually captures.
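The reality check is just arithmetic. Squaring each of the three benchmark values shows how quickly shared variance shrinks:

```python
for r in (0.10, 0.30, 0.50):
    shared = r ** 2
    print(f"r = {r:.2f}  ->  R^2 = {shared:.2f}  ({shared:.0%} of variance shared)")
```

Even the "large" benchmark of r = 0.50 leaves three quarters of the variance unexplained.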
What It Looks Like on a Scatterplot
If you plot your two variables on a scatterplot, the value of r corresponds to how tightly the dots cluster around a straight line. When r is near +1 or -1, the dots form a narrow, elongated oval that hugs a clear diagonal. When r is near 0, the dots spread into a roughly circular cloud with no obvious trend. Values in between produce ovals of varying width. The closer r gets to zero, the wider and rounder the cloud becomes.
Looking at a scatterplot is more than a nice visual exercise. It’s the only way to verify that r is actually telling you something meaningful, because r measures only linear relationships. If your data follows a curved pattern (a U-shape, for instance), r might come back near zero even though the two variables are clearly related. The number alone can mislead you if you skip the plot.
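The U-shape failure is easy to demonstrate with synthetic data. Below, y is completely determined by x, yet r comes out at zero because the relationship is curved, not linear (`pearson_r` is the same hand-rolled helper used above):

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]            # perfect U-shape: y is fully determined by x
print(pearson_r(x, y))             # 0.0 -- r misses the curved relationship entirely
```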
When R Can Mislead You
Pearson’s r assumes a few things about your data, and when those assumptions break down, the number becomes unreliable.
First, the relationship has to be linear. If two variables are related in a curved or nonlinear way, r will underestimate the true strength of their connection. A scatterplot catches this immediately. Second, outliers can distort r dramatically. A single extreme data point can pull the correlation up or down in ways that don’t reflect the overall pattern. Third, the spread of data points should be roughly even across the range of values (a property called homoscedasticity). If the scatter fans out like a trumpet as values increase, r may not accurately summarize the relationship.
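The outlier problem is worth seeing numerically. In this sketch (made-up data, same hand-rolled `pearson_r` as above), five nearly patternless points produce a negligible r, and adding a single extreme point inflates it past 0.9:

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 5, 1, 4, 3]                    # essentially no pattern
r_clean = pearson_r(x, y)
print(round(r_clean, 2))               # 0.1: negligible

r_outlier = pearson_r(x + [20], y + [20])   # one extreme point added
print(round(r_outlier, 2))             # jumps above 0.9
```

Nothing about the underlying pattern changed; one point did all of that. This is exactly the distortion a scatterplot would catch at a glance.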
When your data is not normally distributed or the relationship isn’t strictly linear but still follows a consistent upward or downward trend, Spearman’s rank correlation (often written as rho) is a better choice. It measures any monotonic relationship, meaning it works as long as one variable consistently increases or decreases with the other, even if the pattern isn’t a straight line. Spearman’s rho uses the same -1 to +1 scale and is interpreted similarly. If you find that the Pearson r is weak but the Spearman correlation is strong, the relationship likely exists but isn’t linear.
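Spearman's rho is simply Pearson's r computed on the ranks of the data. The sketch below shows the contrast on a perfectly monotonic but strongly curved relationship; the `ranks` helper assumes no ties for simplicity, and in practice you would use `scipy.stats.spearmanr`:

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Rank positions 1..n (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out

x = [1, 2, 3, 4, 5]
y = [1, 10, 100, 1000, 10000]          # perfectly monotonic but strongly curved
r_pearson = pearson_r(x, y)
rho = pearson_r(ranks(x), ranks(y))    # Spearman's rho = Pearson on ranks
print(round(r_pearson, 2))             # well below 1: linearity is violated
print(round(rho, 6))                   # 1.0: the monotonic pattern is perfect
```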
Correlation Does Not Mean Causation
This is the most important interpretive rule in all of statistics, and it applies directly to r. A strong correlation between two variables does not mean one causes the other. Both variables might be driven by a third factor you haven’t measured. Ice cream sales and sunscreen sales are highly correlated, but neither one causes the other. Hot weather drives both. Smoking is correlated with heavy alcohol use, but smoking doesn’t cause alcoholism.
By examining r, you can conclude that two variables are related, but the value alone cannot tell you whether one variable caused the change in the other. Establishing causation requires a controlled experiment where you can isolate the effect of one variable while holding everything else constant. Whenever you report or read a correlation, treat it as a description of a pattern, not an explanation of why the pattern exists.
Putting It All Together
When you get an r value, run through a quick checklist. Look at the sign to determine direction. Check the absolute value against the small (0.10), medium (0.30), and large (0.50) benchmarks to gauge strength. Square it to see what percentage of variance is actually shared. Then look at your scatterplot to confirm that the relationship is genuinely linear and that no outliers are warping the result.
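The numeric parts of that checklist can be bundled into one small helper (an illustrative sketch, not a standard API; the threshold labels follow the Cohen conventions discussed earlier):

```python
def interpret_r(r):
    """Checklist summary: direction, Cohen-style strength label, shared variance."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    size = abs(r)
    strength = ("large" if size >= 0.50 else
                "medium" if size >= 0.30 else
                "small" if size >= 0.10 else "negligible")
    return {"direction": direction,
            "strength": strength,
            "r_squared": round(size ** 2, 4)}

print(interpret_r(-0.45))
# {'direction': 'negative', 'strength': 'medium', 'r_squared': 0.2025}
```

The one step no function can do for you is the last one: only the scatterplot can confirm linearity and expose outliers.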
Context matters more than any single cutoff. An r of 0.25 in a noisy field like social psychology, where dozens of unmeasured variables influence behavior, can represent a meaningful and replicable finding. The same r of 0.25 in a controlled laboratory experiment measuring physical properties might signal a problem. Always interpret the number relative to what’s typical for your subject area and what practical difference the relationship makes in the real world.