In statistics, OR stands for odds ratio. It’s a number that tells you how strongly an exposure or characteristic is associated with an outcome. An OR of 1 means there’s no association. An OR greater than 1 means the exposure is linked to higher odds of the outcome, and an OR less than 1 means it’s linked to lower odds. You’ll encounter odds ratios constantly in medical research, epidemiology, and any study that uses logistic regression.
How the Odds Ratio Works
To understand the odds ratio, you first need to understand odds. Odds aren’t the same as probability. If 20 out of 100 people develop a condition, the probability is 20/100, or 20%. The odds are 20/80, or 0.25, because you’re comparing the number who got the outcome to the number who didn’t.
The odds ratio compares the odds of an outcome in one group to the odds in another. If the odds of lung disease among smokers are 0.30 and the odds among non-smokers are 0.05, the odds ratio is 0.30 divided by 0.05, which equals 6.0. That means smokers have six times the odds of developing lung disease compared to non-smokers.
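The two calculations above can be sketched in a few lines of Python (the counts and odds are the same ones used in the text):

```python
# Odds vs. probability, and an odds ratio computed from two odds.

def odds_from_counts(events, non_events):
    """Odds compare events to non-events, not events to the total."""
    return events / non_events

# 20 of 100 people develop the condition:
probability = 20 / 100            # 0.20 (events / total)
odds = odds_from_counts(20, 80)   # 0.25 (events / non-events)

# Odds of lung disease: 0.30 among smokers, 0.05 among non-smokers
or_smoking = 0.30 / 0.05          # 6.0
```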
Calculating OR From a 2×2 Table
Most odds ratios are calculated from a simple four-cell table. Imagine you’re studying whether an exposure (like eating a certain food) is linked to an outcome (like getting sick). You sort people into four groups:
- a: Exposed and got the outcome
- b: Exposed and did not get the outcome
- c: Not exposed and got the outcome
- d: Not exposed and did not get the outcome
The formula is: OR = (a × d) / (b × c). A CDC epidemiology example illustrates this neatly: with values of a=100, b=1,900, c=80, and d=7,920, the odds ratio comes out to (100 × 7,920) / (1,900 × 80) ≈ 5.2. People with the exposure had about five times the odds of the outcome.
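The cross-product formula translates directly into code. A minimal sketch, using the cell values from the CDC example above:

```python
# OR from a 2x2 table: OR = (a * d) / (b * c)

def odds_ratio(a, b, c, d):
    """a: exposed with outcome, b: exposed without,
    c: unexposed with outcome, d: unexposed without."""
    return (a * d) / (b * c)

or_value = odds_ratio(100, 1_900, 80, 7_920)
print(round(or_value, 1))  # 5.2
```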
Interpreting the Number
The value of the odds ratio tells you three things at a glance. An OR of exactly 1.0 means the exposure makes no difference to the odds of the outcome. An OR above 1.0 means the exposure is associated with increased odds. An OR below 1.0 means it’s associated with decreased odds, which could suggest a protective effect.
The further the OR is from 1.0 in either direction, the stronger the association. An OR of 3.5 represents a much stronger link than an OR of 1.2. Likewise, an OR of 0.3 indicates stronger protection than an OR of 0.8.
Confidence Intervals and Statistical Significance
An odds ratio by itself doesn’t tell you whether the association is statistically meaningful. That’s where the confidence interval comes in. A 95% confidence interval gives you a range of plausible values for the true OR. The critical question is whether that range crosses 1.0.
If the confidence interval includes 1.0 (for example, OR = 1.4 with a 95% CI of 0.9 to 2.1), you can’t confidently say the association is real. The true value might be 1.0, meaning no association at all. If the entire interval stays above 1.0 or entirely below it, the result is statistically significant at the 0.05 level.
For those working with the math directly, the standard error of the odds ratio is calculated on the log scale. You take the natural log of the OR, then estimate the standard error as the square root of the sum of the reciprocals of all four cell frequencies (1/a + 1/b + 1/c + 1/d). From there, you can build the confidence interval and convert it back to the regular scale.
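Putting those steps together, here is a sketch of the log-scale confidence interval, again using the 2×2 cells from the CDC example (z = 1.96 for a 95% interval):

```python
import math

def or_confidence_interval(a, b, c, d, z=1.96):
    """Odds ratio with a 95% CI built on the log scale."""
    or_value = (a * d) / (b * c)
    log_or = math.log(or_value)
    # SE of ln(OR): sqrt of the summed reciprocals of the four cells
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)
    lower = math.exp(log_or - z * se)   # convert back off the log scale
    upper = math.exp(log_or + z * se)
    return or_value, lower, upper

or_value, lo, hi = or_confidence_interval(100, 1_900, 80, 7_920)
# OR ≈ 5.21, 95% CI roughly (3.87, 7.02): the interval stays above 1.0,
# so this association is significant at the 0.05 level.
```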
Odds Ratio vs. Relative Risk
This is where many people get confused, and where real mistakes happen in published research. The relative risk (also called the risk ratio) compares the probability of an outcome between groups. The odds ratio compares the odds. When the outcome is rare, these two numbers are nearly identical. The CDC example above had an OR of 5.2 and a risk ratio of 5.0, because the outcome was uncommon.
But as the outcome becomes more common, the two diverge, and the odds ratio always lies further from 1.0 than the relative risk. When the relative risk is above 1.0, the OR will be even higher; when the relative risk is below 1.0, the OR will be even lower. A general rule is that when the outcome occurs in fewer than 10% of the unexposed group, the OR is a reasonable stand-in for relative risk. Above that threshold, it’s not.
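The divergence is easy to see numerically. The counts below are hypothetical, chosen purely to illustrate a rare outcome (2% in the unexposed group) versus a common one (40%):

```python
# OR vs. RR as the outcome becomes more common.
# Cells: a = exposed w/ outcome, b = exposed w/o,
#        c = unexposed w/ outcome, d = unexposed w/o.

def risk_ratio(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

rare = (40, 960, 20, 980)       # outcome in 2% of unexposed
common = (600, 400, 400, 600)   # outcome in 40% of unexposed

for label, cells in (("rare", rare), ("common", common)):
    print(label, round(risk_ratio(*cells), 2), round(odds_ratio(*cells), 2))
# rare:   RR = 2.0, OR ≈ 2.04  -> nearly identical
# common: RR = 1.5, OR = 2.25  -> OR overstates the association
```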
One real-world example from published research showed an adjusted risk ratio of 1.5 (a 50% increase in risk), while the corresponding odds ratio was 2.3. If someone read that OR as a risk ratio, they’d think the risk more than doubled, when it actually only went up by half. In an even more extreme case, a stratified odds ratio of 6.26 corresponded to a risk ratio of just 1.48.
Why This Misinterpretation Is So Common
The problem often starts with logistic regression, one of the most widely used statistical tools in medical research. Logistic regression produces odds ratios, not risk ratios. Researchers use it because it handles multiple confounding variables well, but the output can be misleading when the outcome isn’t rare. About one-third of cohort studies in the medical literature use logistic regression to adjust for confounders, and 40% of those produce odds ratios that deviate more than 20% from the underlying risk ratio. Among randomized controlled trials that use logistic regression, roughly two-thirds produce odds ratios with that level of distortion.
This isn’t just an academic concern. When odds ratios are mistaken for risk ratios, it can influence treatment decisions and health policy. Both numbers are technically correct, but they answer slightly different questions, and conflating them overstates the size of the effect.
Where You’ll See Odds Ratios
Odds ratios are the standard measure in case-control studies, where researchers compare people who already have a condition to a similar group who don’t, then look backward at exposures. In this study design, you can’t calculate risk directly because you selected participants based on the outcome, so the odds ratio is the natural choice.
They also appear in any analysis that uses logistic regression, including cohort studies and clinical trials. When you read a medical paper reporting that a certain factor is “associated with 2.5 times the odds” of some outcome, that 2.5 is an odds ratio. If logistic regression was used in a study with a common outcome, keep in mind that the actual increase in risk is likely smaller than the OR suggests.
The Rare Disease Assumption
Epidemiologists refer to the “rare disease assumption” when using odds ratios as substitutes for risk ratios. The logic is straightforward: when very few people in the unexposed group experience the outcome, the odds and the probability are nearly the same number. If 3 out of 100 people get sick, the probability is 3/100 (0.03) and the odds are 3/97 (0.031). Those values are almost identical, so the odds ratio closely tracks the risk ratio.
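The arithmetic in that example is worth checking directly; the near-equality is the whole basis of the assumption:

```python
# Rare outcome: odds and probability are nearly the same number.
cases, total = 3, 100
probability = cases / total      # 3/100 = 0.030
odds = cases / (total - cases)   # 3/97  ≈ 0.031
```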
In case-control studies specifically, the odds ratio can approximate the risk ratio without this rare disease assumption, but only when controls are selected in certain ways. If controls are sampled from the general population regardless of whether they experienced the outcome, the OR equals the risk ratio no matter how common the outcome is.