An interaction variable is a term in a regression model that captures how the effect of one predictor on an outcome depends on the value of another predictor. It’s created by multiplying two variables together and adding that product as its own term in the equation. Without it, a regression model assumes each predictor’s effect on the outcome is the same no matter what values the other predictors take. With it, the model can detect that two variables work together in ways that neither one reveals on its own.
How an Interaction Variable Works
A standard regression equation with two predictors looks like this: Y = b0 + b1(X1) + b2(X2). Each variable has its own coefficient, and those coefficients don’t change regardless of what the other variable is doing. Adding an interaction variable creates a new term: Y = b0 + b1(X1) + b2(X2) + b3(X1 × X2). That product term, X1 × X2, is the interaction variable, and b3 is its coefficient.
The coefficient b3 tells you something specific: how much the relationship between X1 and the outcome changes for every one-unit increase in X2. If b3 is zero, the two variables don’t interact, and the simpler model without the product term would have been fine. If b3 is meaningfully different from zero, the effect of one variable genuinely shifts depending on the level of the other.
Here’s a useful way to think about it. In the interaction model, the slope for X1 is no longer a single fixed number. It becomes b1 + b3(X2). So the effect of X1 on the outcome is literally a function of X2. That’s what statisticians mean when they say “the marginal effect of X1 on Y depends on the value of X2.”
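A minimal sketch in Python makes this concrete. The data and coefficients below are synthetic, chosen so the fit can be checked against known values; the interaction variable is nothing more than an extra product column in the design matrix:

```python
import numpy as np

# Synthetic data with known coefficients, so the fit can be checked.
# True model: Y = 1.0 + 2.0*X1 + 0.5*X2 + 1.5*(X1*X2) + noise
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + 1.5 * x1 * x2 + rng.normal(scale=0.1, size=n)

# The interaction variable is just an extra column: the elementwise product.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# The slope of X1 is no longer fixed; it is b1 + b3*X2.
def slope_of_x1(x2_value):
    return b1 + b3 * x2_value

print(round(b3, 2))              # should land near the true 1.5
print(round(slope_of_x1(0), 2))  # near 2.0: b1 alone, the effect at X2 = 0
print(round(slope_of_x1(1), 2))  # near 3.5: the effect of X1 has shifted
```

The `slope_of_x1` function is the point of the exercise: in the interaction model, the effect of X1 is a function of X2, not a single number.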
Interpreting the Coefficients
When an interaction term is in the model, the meaning of b1 and b2 changes in a way that trips up many people. The coefficient b1 no longer represents the overall effect of X1. Instead, it represents the effect of X1 specifically when X2 equals zero. The same applies to b2: it’s the effect of X2 only when X1 equals zero.
This matters because X2 = 0 might not be a meaningful value in your data. If X2 is age, for example, “age equals zero” is nonsensical. This is one reason researchers often center their variables (subtracting the mean) before creating the interaction term. Centering shifts the zero point to the average value, so b1 would then represent the effect of X1 at the average level of X2, which is far more interpretable.
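A sketch with invented dose/age variables shows exactly what centering does and does not change: the interaction coefficient is identical in both fits, while the main effect shifts to the new zero point:

```python
import numpy as np

# Invented dose/age data; age is never zero, so the uncentered b1 is the
# dose effect at an age that does not exist in the data.
rng = np.random.default_rng(1)
n = 400
age = rng.uniform(30, 70, size=n)
dose = rng.uniform(0, 10, size=n)
y = 5 + 0.8 * dose + 0.1 * age + 0.05 * dose * age + rng.normal(scale=0.5, size=n)

def fit(x1, x2):
    X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_raw = fit(dose, age)                # b1 = dose effect at age == 0
b_cen = fit(dose, age - age.mean())   # b1 = dose effect at the average age

# Centering leaves the interaction coefficient untouched; it only moves
# the zero point at which the main effect is evaluated:
#   b_cen[3] == b_raw[3]
#   b_cen[1] == b_raw[1] + b_raw[3] * age.mean()
```

The two identities in the final comment hold exactly (up to floating-point precision), because centering is just a reparameterization of the same model.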
The interaction coefficient b3 itself represents the difference in the effect of X1 on Y when comparing groups that differ by one unit on X2. If X2 is binary (like a yes/no variable), b3 captures the full difference in X1’s slope between the two groups.
A Medical Example
A clinical trial called REGRESS tested whether a cholesterol-lowering drug (pravastatin) slowed the narrowing of coronary arteries over two years. Overall, the drug worked: patients on pravastatin lost 0.060 mm less artery diameter than those on placebo. But that average masked a striking interaction.
Among patients who were also taking a calcium channel blocker (a common blood pressure medication), pravastatin’s benefit was large: 0.095 mm of preserved artery diameter. Among patients not taking a calcium channel blocker, the benefit was nearly zero at just 0.009 mm. The calcium channel blocker wasn’t effective on its own, and it wasn’t confounding the results (roughly equal numbers of patients in each group were taking it). But it dramatically changed how well pravastatin worked. That’s an interaction. Without testing for it, researchers would have reported only the modest average benefit and missed the fact that the drug was highly effective in a specific subgroup.
Categorical and Continuous Interactions
Interaction variables aren’t limited to two continuous predictors. Some of the most intuitive examples involve a categorical variable interacting with a continuous one. When that happens, the interaction tells you the slope of the continuous variable is different across groups.
Suppose you’re predicting writing test scores using a social studies score (continuous) and gender (categorical, coded as 0 or 1). Without an interaction, you’d get a single slope for how social studies scores relate to writing scores, shifted up or down by a constant amount for each gender. Adding the interaction lets each gender have its own slope. Maybe a one-point increase in social studies score is associated with a 0.5-point increase in writing scores for one group but a 0.3-point increase for the other. The interaction coefficient captures that 0.2-point difference.
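As a sketch, here is that scenario with simulated scores. The 0.5 and 0.3 slopes are built into the data so the recovered coefficients can be checked against them:

```python
import numpy as np

# Simulated scores with the slopes built in: 0.5 for the group coded 0
# and 0.3 for the group coded 1.
rng = np.random.default_rng(2)
n = 300
social = rng.uniform(30, 70, size=n)
group = rng.integers(0, 2, size=n)
true_slope = np.where(group == 0, 0.5, 0.3)
writing = 20 + true_slope * social + 5 * group + rng.normal(scale=1.0, size=n)

X = np.column_stack([np.ones(n), social, group, social * group])
b0, b1, b2, b3 = np.linalg.lstsq(X, writing, rcond=None)[0]

# b1 is the slope for the group coded 0, b1 + b3 the slope for the group
# coded 1, and b3 the difference between them (about -0.2 here).
print(round(b1, 2), round(b1 + b3, 2), round(b3, 2))
```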
When the categorical variable has more than two levels (say, three treatment groups), the interaction produces multiple coefficients, one for each comparison against the reference group. Each coefficient tells you how the slope of the continuous variable in that group differs from the slope in the reference group.
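Extending the sketch to three hypothetical treatment groups, dummy-coded against a reference group "A", shows where those multiple coefficients come from:

```python
import numpy as np

# Three hypothetical groups with different built-in slopes (1.0, 1.5, 0.5).
rng = np.random.default_rng(5)
n = 300
x = rng.uniform(0, 10, size=n)
grp = rng.choice(["A", "B", "C"], size=n)
true_slope = np.where(grp == "A", 1.0, np.where(grp == "B", 1.5, 0.5))
y = 2 + true_slope * x + rng.normal(scale=0.3, size=n)

d_b = (grp == "B").astype(float)
d_c = (grp == "C").astype(float)
# Two dummy columns plus two interaction columns: one slope comparison
# per non-reference group.
X = np.column_stack([np.ones(n), x, d_b, d_c, x * d_b, x * d_c])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# b[1] is the slope in group A; b[4] and b[5] say how the slopes in
# groups B and C differ from it (about +0.5 and -0.5 here).
```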
The Marginality Principle
There’s an important modeling rule when using interaction variables: always include the main effects of both variables alongside the interaction term. This is called the marginality principle, and it has been a standard guideline in statistics for decades. Statistician John Nelder described models that include interactions without their main effects as “of no practical interest.”
The reason is both mathematical and interpretive. Dropping a main effect forces part of that variable’s influence into the interaction term, distorting the coefficient. It also makes the interaction coefficient sensitive to arbitrary choices like how your variables are scaled or coded. Keeping both main effects in the model ensures the interaction term captures only the genuine joint effect of the two variables working together.
Visualizing Interactions
Interaction plots are the most common way to see what’s happening. You plot the outcome on the vertical axis and one predictor on the horizontal axis, then draw separate lines for different values of the second predictor (often high, medium, and low, or the levels of a categorical variable).
If the lines are parallel, there’s no interaction. The effect of the horizontal-axis variable is the same at every level of the other variable. If the lines have different slopes, an interaction is present. The two main patterns have specific names. In an ordinal interaction, the lines diverge or converge but don’t cross within the range of your data. One group always has higher values than the other, but the gap widens or narrows. In a disordinal (or crossover) interaction, the lines actually intersect within the observed data range, meaning the direction of the effect reverses. The REGRESS trial above is an ordinal interaction pushed near its limit: pravastatin’s benefit shrinks almost to zero among patients not taking a calcium channel blocker, but it never actually reverses direction within the data.
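The same classification can be done numerically rather than visually. This sketch fits one line per group and labels the pattern by whether the fitted lines intersect inside the observed range of the data (a simplified rule of thumb, assuming a binary 0/1 group variable):

```python
import numpy as np

def classify_interaction(x, y, group):
    """Fit one line per group (assumed coded 0/1) and classify the pattern
    by whether the lines intersect inside the observed range of x."""
    s0, i0 = np.polyfit(x[group == 0], y[group == 0], 1)
    s1, i1 = np.polyfit(x[group == 1], y[group == 1], 1)
    if np.isclose(s0, s1):
        return "parallel: no interaction"
    x_cross = (i0 - i1) / (s1 - s0)   # where the two fitted lines meet
    if x.min() <= x_cross <= x.max():
        return "disordinal (crossover) interaction"
    return "ordinal interaction"

x = np.concatenate([np.linspace(0, 10, 50)] * 2)
group = np.repeat([0, 1], 50)
y = np.where(group == 0, 1 + 1.0 * x, 5 + 0.2 * x)   # lines meet at x = 5
print(classify_interaction(x, y, group))  # disordinal (crossover) interaction
```

Shifting the second line up (say, to `20 + 0.5 * x`) pushes the crossing point outside the observed range, and the same function reports an ordinal interaction instead.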
Testing Whether an Interaction Is Real
Adding an interaction variable to a model doesn’t mean the interaction is statistically meaningful. You need to test whether the interaction coefficient is significantly different from zero. This is typically done with a p-value for the interaction term. The null hypothesis is straightforward: the effect of one predictor is the same at every level of the other predictor. A small p-value provides evidence that the interaction is real and not just noise in the data.
In practice, interaction effects are harder to detect than main effects. They require larger sample sizes because you’re essentially looking for a difference in differences rather than a simple difference. Studies that are well-powered to detect main effects are often underpowered for interactions. If you suspect an interaction based on theory or prior evidence, plan for a larger sample than you’d need otherwise.
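As a sketch of the test itself, here is a Wald-style check of the interaction coefficient on simulated data. It uses a large-sample normal approximation for the p-value rather than the exact t distribution, which is a reasonable shortcut at this sample size:

```python
import math
import numpy as np

# Simulated data with a real but modest interaction (true b3 = 0.3).
rng = np.random.default_rng(3)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1 + 0.5 * x1 + 0.5 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
# Standard errors from the diagonal of sigma^2 * (X'X)^-1.
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))

t_b3 = beta[3] / se[3]
# Two-sided p-value via a large-sample normal approximation.
p_b3 = math.erfc(abs(t_b3) / math.sqrt(2))
# With n = 1000 and a true b3 of 0.3, p_b3 comes out far below 0.05.
```

The null hypothesis being tested is exactly the one stated above: b3 = 0, i.e. the effect of one predictor is the same at every level of the other.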
Common Pitfalls
Multicollinearity is a frequent concern. Because the interaction term is literally the product of two variables already in your model, it can be highly correlated with one or both of them. This inflates the uncertainty around your coefficient estimates. Centering variables before creating the interaction term has been recommended for nearly 30 years as a fix, and it often helps. But it doesn’t always reduce the problem, and in some cases it can actually make multicollinearity worse. The effectiveness depends on the specific structure of your data, so checking correlation diagnostics after centering is still important.
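A quick way to see both the problem and the (partial) fix is to compare the correlation between a predictor and its own product term before and after centering. The predictors below are synthetic, with large positive means, which is the typical trouble case; as the text notes, centering is not guaranteed to help this much on real data:

```python
import numpy as np

# Two independent predictors with large positive means: the typical case
# where the product term is nearly collinear with its components.
rng = np.random.default_rng(4)
n = 1000
x1 = rng.normal(loc=50, scale=10, size=n)
x2 = rng.normal(loc=100, scale=5, size=n)

def corr_with_product(a, b):
    """Correlation between a predictor and the product column built from it."""
    return np.corrcoef(a, a * b)[0, 1]

print(round(corr_with_product(x1, x2), 2))   # high: severe collinearity
x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
print(round(corr_with_product(x1c, x2c), 2)) # near zero after centering
```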
Another common mistake is testing many possible interactions without a clear hypothesis. With enough combinations, you’ll find spurious interactions by chance. The most trustworthy interaction effects are ones you predicted before looking at the data, grounded in a plausible reason why two variables would modify each other’s effects.