Measures of association are statistics that estimate the direction and strength of a relationship between two variables. They tell you whether an exposure and an outcome are linked, how strongly, and in which direction. The specific measure you use depends on the type of data you’re working with and how the study was designed. The major categories include ratio measures (like relative risk and odds ratios), difference measures (like absolute risk reduction), correlation coefficients, and effect sizes for categorical data.
Relative Measures: Risk Ratios and Odds Ratios
The two most common measures of association in health research are relative risk and the odds ratio. Both compare an exposed group to an unexposed group, but they do it differently and suit different study designs.
Relative risk (also called the risk ratio) is the ratio of how often an event occurs in the exposed group compared to the unexposed group. If a group exposed to a chemical develops lung disease at twice the rate of the unexposed group, the relative risk is 2.0. A value of 1.0 means there’s no difference between groups, so no association. Values above 1.0 indicate increased risk, and values below 1.0 indicate reduced risk. Relative risk is used in cohort studies, where researchers follow groups forward in time and can directly count how many people in each group develop the outcome.
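The arithmetic is simple enough to sketch directly. The counts below are hypothetical cohort-study numbers chosen to reproduce the relative risk of 2.0 described above:

```python
# Relative risk from a 2x2 cohort table (hypothetical counts).
def relative_risk(a, b, c, d):
    """a = exposed with outcome, b = exposed without,
    c = unexposed with outcome, d = unexposed without."""
    risk_exposed = a / (a + b)      # incidence in the exposed group
    risk_unexposed = c / (c + d)    # incidence in the unexposed group
    return risk_exposed / risk_unexposed

# Exposed: 20 of 100 develop the disease; unexposed: 10 of 100.
rr = relative_risk(20, 80, 10, 90)
print(rr)  # 2.0 — the exposed group has twice the risk
```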
The odds ratio compares the odds of an event in one group to the odds in another. It’s the go-to measure for case-control studies, where researchers start with people who already have a disease and look backward at their exposures. In that design, you don’t know the total number of people who were exposed, so you can’t calculate a true risk. You can only compare odds. Odds ratios are also produced by logistic regression, one of the most widely used statistical techniques in clinical research, which means they show up constantly even in studies that aren’t case-control designs.
An important nuance: the odds ratio tends to exaggerate the strength of an association compared to relative risk. When the relative risk is above 1.0, the odds ratio will be even higher. When it’s below 1.0, the odds ratio will be even lower. The two values are close enough to use interchangeably only when the outcome is rare, typically occurring in less than 10% of the population. As event rates climb, the gap between them widens substantially.
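A quick numerical sketch makes the rare-outcome caveat concrete. Both tables below are hypothetical and constructed so the relative risk is exactly 2.0; only the event rate changes:

```python
def relative_risk(a, b, c, d):
    # a, c = events; b, d = non-events in exposed / unexposed groups
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    # Cross-product ratio of a 2x2 table
    return (a * d) / (b * c)

# Rare outcome (1-2% event rate): OR closely approximates RR.
print(relative_risk(2, 98, 1, 99))   # 2.0
print(odds_ratio(2, 98, 1, 99))      # ~2.02

# Common outcome (20-40% event rate): OR overstates the association.
print(relative_risk(40, 60, 20, 80)) # 2.0
print(odds_ratio(40, 60, 20, 80))    # ~2.67
```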
Absolute Measures: Risk Difference and NNT
Relative measures tell you how many times more likely something is, but they can be misleading on their own. If a treatment cuts your risk in half, that sounds impressive. But cutting risk from 2 in 10,000 to 1 in 10,000 is very different from cutting it from 20 in 100 to 10 in 100, even though both represent a 50% relative reduction. Absolute measures fill this gap.
The risk difference (also called absolute risk reduction) is simply the difference in event rates between two groups. If 15% of an untreated group develops a disease and 10% of a treated group does, the absolute risk reduction is 5 percentage points. This tells you the actual excess risk associated with an exposure, or the actual benefit of a treatment, in concrete terms.
The number needed to treat (NNT) takes this one step further. It’s the reciprocal of the absolute risk reduction: 1 divided by the risk difference. In the example above, 1 divided by 0.05 gives an NNT of 20, meaning you’d need to treat 20 people for one additional person to benefit. NNT tends to decrease as follow-up time increases, because the longer you observe people, the more events accumulate and the larger the absolute difference becomes. When a treatment effect isn’t statistically significant, the confidence interval for the risk difference includes 0, so the corresponding NNT interval passes through infinity and extends into negative values (a negative NNT is a number needed to harm), which signals the data can’t confirm the treatment helps at all.
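The two calculations can be written out in a few lines, using the 15% vs 10% figures from the example above:

```python
def absolute_risk_reduction(risk_control, risk_treated):
    # Difference in event rates between the two groups
    return risk_control - risk_treated

def number_needed_to_treat(risk_control, risk_treated):
    # Reciprocal of the absolute risk reduction
    return 1 / absolute_risk_reduction(risk_control, risk_treated)

arr = absolute_risk_reduction(0.15, 0.10)
nnt = number_needed_to_treat(0.15, 0.10)
print(round(arr, 2))  # 0.05 → 5 percentage points
print(round(nnt))     # 20 → treat 20 people for one to benefit
```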
Hazard Ratios in Survival Analysis
When studies track how long it takes for an event to happen, not just whether it happens, they use hazard ratios. The “hazard” is an instantaneous risk at any given moment, essentially an event rate. It describes the chance of something occurring at time point t, given that it hasn’t happened yet. A hazard ratio of 0.7 in a cancer trial means the treatment group experiences the event (often death) at 70% of the rate of the control group at any given moment.
Hazard ratios are routinely reported as reductions in “risk of death,” but this phrasing is misleading. A hazard ratio of 0.7 is commonly described as a “30% reduction in risk,” but the actual cumulative risk reduction up to any time point is always smaller than what the hazard ratio implies. The hazard ratio describes a reduction in the instantaneous rate of an event, not in the overall probability of experiencing it. This distinction matters when patients and clinicians try to understand the real-world size of a treatment benefit.
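A toy calculation illustrates the gap. Assume, for simplicity, constant hazards over time (so survival follows an exponential curve); the baseline hazard and follow-up period below are hypothetical, not from any real trial:

```python
import math

# Under a constant hazard h, survival is S(t) = exp(-h * t)
# and cumulative risk up to time t is 1 - S(t).
hazard_control = 0.5   # hypothetical events per person-year
hr = 0.7               # hazard ratio for the treatment group
t = 2.0                # years of follow-up

risk_control = 1 - math.exp(-hazard_control * t)
risk_treated = 1 - math.exp(-hazard_control * hr * t)

relative_reduction = 1 - risk_treated / risk_control
print(round(relative_reduction, 3))  # noticeably less than 0.30
```

Even though the hazard ratio implies a “30% reduction,” the cumulative risk reduction at two years comes out around 20% in this sketch, and the gap grows as events accumulate.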
Correlation Coefficients for Continuous Data
When both variables are continuous, like height and weight or blood pressure and age, measures of association take the form of correlation coefficients. These range from -1 to +1, where 0 means no association, +1 means a perfect positive relationship, and -1 means a perfect inverse relationship.
The Pearson correlation coefficient (abbreviated “r”) is the most familiar. It measures the strength of a linear relationship between two variables and assumes both are normally distributed. If data points cluster tightly around a straight line on a scatter plot, the Pearson correlation will be close to 1 or -1. It’s sensitive to outliers, though. A few extreme values can dramatically inflate or deflate the result.
The Spearman rank correlation (abbreviated “rho” or rs) works differently. Instead of using actual values, it ranks them and then calculates a correlation on those ranks. This makes it useful for data that isn’t normally distributed, for ordinal data (like survey responses ranked from “strongly disagree” to “strongly agree”), and for relationships that are consistently increasing or decreasing but not perfectly linear. Because it operates on ranks, Spearman’s correlation is also much more resistant to the influence of outliers.
Measures for Categorical Data
When both variables are categories rather than numbers (like treatment group and recovery status), different tools apply. For a simple 2-by-2 table, the phi coefficient works like a correlation coefficient, ranging from 0 (no association) to 1 (perfect association). It’s calculated from the chi-squared statistic and gives a quick sense of effect size.
For larger tables, where one or both variables have more than two categories, phi can produce values above 1 and loses its usefulness. Cramér’s V solves this by adjusting for table size. It’s calculated by dividing the chi-squared value by the sample size and the smaller dimension of the table minus one, then taking the square root. A V of 0 means complete independence between the variables, and a V of 1 means one variable perfectly predicts the other. For a 2-by-2 table, Cramér’s V reduces to the same value as phi.
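The formula translates directly into code. The 2-by-2 table below uses hypothetical treatment-vs-recovery counts; for this table size, the result equals phi:

```python
import math

def chi_squared(table):
    # Pearson chi-squared for an r x c table of counts.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    chi2, n = chi_squared(table)
    k = min(len(table), len(table[0]))  # smaller dimension of the table
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical 2x2 table: treated / untreated vs recovered / not recovered.
table = [[30, 20],
         [15, 35]]
print(round(cramers_v(table), 2))  # ~0.30 here
```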
How to Interpret the Numbers
Every measure of association has a “null value,” the number that means no relationship exists. For ratio measures like relative risk, odds ratios, and hazard ratios, the null value is 1.0. For difference measures like risk difference, and for correlation coefficients, it’s 0. Knowing the null value is the first step in reading any result: is the number above, below, or right at the null?
The confidence interval around a measure tells you how precise the estimate is. A narrow interval means the study had enough data to pin down the association tightly. A wide interval signals uncertainty. For ratios, if the 95% confidence interval doesn’t include 1.0, the result is statistically significant at the 0.05 level. For differences and correlations, significance requires that the interval excludes 0. But statistical significance alone doesn’t tell you whether an association is meaningful in practice.
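For ratio measures, the interval is conventionally built on the log scale. The sketch below uses the standard large-sample formula for a risk ratio’s 95% confidence interval, with hypothetical cohort counts:

```python
import math

def rr_confidence_interval(a, b, c, d, z=1.96):
    """Risk ratio with a large-sample 95% CI (log-scale method).
    a/b = exposed with/without outcome, c/d = unexposed with/without."""
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR)
    se_log = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
    lo = math.exp(math.log(rr) - z * se_log)
    hi = math.exp(math.log(rr) + z * se_log)
    return rr, lo, hi

# Hypothetical cohort: 40/100 exposed vs 20/100 unexposed develop the outcome.
rr, lo, hi = rr_confidence_interval(40, 60, 20, 80)
print(round(rr, 2), round(lo, 2), round(hi, 2))
print(lo > 1.0)  # interval excludes 1.0 here, so significant at 0.05
```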
That’s where effect size benchmarks come in. Jacob Cohen proposed widely used thresholds: small (0.2), medium (0.5), and large (0.8 or higher). As he described it, a medium effect is “visible to the naked eye of a careful observer,” a small effect is noticeably less but not trivial, and a large effect is unmistakable. These thresholds apply most directly to standardized mean differences, but the underlying principle, that a statistically significant result can still represent a trivially small association, applies across all measures.
Why Study Design and Confounding Matter
A measure of association is only as trustworthy as the study that produced it. The validity of any result depends heavily on whether the right measure was used for the right design. Using relative risk in a case-control study, for instance, would give a misleading answer because the study design doesn’t provide the denominator needed to calculate true risk.
Confounding is the other major threat. A confounding variable is a third factor that influences both the exposure and the outcome, creating a distorted association between them. The classic example: ice cream sales and drowning rates both increase in summer, but ice cream doesn’t cause drowning. Temperature is the confounder driving both. Confounders can inflate an association, making it look stronger than it is, or they can mask a real association entirely. A simple comparison between two variables can be “quite unrepresentative of the true causal connection,” as researchers at the University of Michigan’s data consortium put it. Adjusting for confounders through techniques like stratification or regression modeling is what separates a crude association from one that reflects something closer to reality.
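Stratification can be sketched numerically. The counts below are invented to show the mechanism: within each stratum of the confounder there is no exposure-outcome association at all, yet the pooled (crude) comparison manufactures one because exposure is unevenly distributed across strata:

```python
# Hypothetical (events, total) counts per confounder stratum.
strata = {
    "low":  {"exposed": (5, 100),  "unexposed": (10, 200)},
    "high": {"exposed": (60, 200), "unexposed": (30, 100)},
}

def risk(events, total):
    return events / total

# Within each stratum, exposed and unexposed risks are identical (RR = 1).
for name, s in strata.items():
    print(name, risk(*s["exposed"]), risk(*s["unexposed"]))

# Pooling across strata ignores the confounder and distorts the picture.
ev_e = sum(s["exposed"][0] for s in strata.values())
n_e  = sum(s["exposed"][1] for s in strata.values())
ev_u = sum(s["unexposed"][0] for s in strata.values())
n_u  = sum(s["unexposed"][1] for s in strata.values())
crude_rr = risk(ev_e, n_e) / risk(ev_u, n_u)
print(round(crude_rr, 3))  # well above 1.0 despite no within-stratum effect
```

Reporting the stratum-specific (or a properly weighted) estimate instead of the crude ratio is exactly what adjustment techniques like stratification and regression accomplish.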

