What Is Causal? Causation vs. Correlation Explained

Causal means that one event or factor directly produces another. When scientists or statisticians describe a relationship as causal, they’re saying that changing one thing actually brings about a change in something else, not just that the two things happen to move together. This distinction between true cause-and-effect and mere coincidence sits at the heart of how we understand everything from medical treatments to economic policy.

Causal vs. Correlated

Two things are correlated when they tend to rise or fall together in a measurable, predictable way. Correlation is expressed as a single number (called a correlation coefficient, which ranges from −1 to 1) that captures how closely two variables track each other. But that number says nothing about whether one variable actually caused the other to change.
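To make the coefficient concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The numbers are made up for illustration (ice-cream sales and drownings, a classic pair that correlates strongly because both track summer weather, without either causing the other):

```python
# Made-up monthly figures (illustrative only): ice-cream sales and
# drowning incidents both rise with summer weather, so they correlate
# strongly even though neither causes the other.
ice_cream = [12, 18, 25, 31, 38, 44]
drownings = [1, 2, 2, 3, 4, 5]

def pearson(xs, ys):
    """Pearson correlation coefficient, always between -1 and 1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(round(pearson(ice_cream, drownings), 2))  # -> 0.98
```

A coefficient of 0.98 is about as strong as correlations get, yet banning ice cream would not prevent a single drowning. That is the gap the rest of this article is about.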

Smoking and heavy drinking, for example, are correlated: people who smoke are more likely to drink. But smoking doesn’t cause alcoholism. On the other hand, smoking does cause an increased risk of lung cancer. The data patterns can look similar on a spreadsheet, but the causal reality behind them is completely different. A causal relationship requires that removing or changing the cause would actually change the outcome. A correlation only tells you the two things showed up together in the data.

How Scientists Determine Causation

Proving that something is truly causal, rather than just associated, takes more than one study or one clever chart. In the 1960s, epidemiologist Austin Bradford Hill laid out a set of criteria that researchers still use to evaluate whether a link is likely causal. The most important of these include:

  • Temporality: The cause has to come before the effect. This sounds obvious, but with slow-developing diseases it can be surprisingly hard to sort out which came first.
  • Strength of association: The bigger the effect, the harder it is to explain away as coincidence. Lung cancer rates among heavy smokers aren’t just slightly elevated; they’re dramatically higher.
  • Dose-response: If more of the cause leads to more of the effect, that’s strong evidence. Lung cancer death rates rise in a near-straight line with the number of cigarettes smoked per day.
  • Consistency: The same result shows up across different studies, in different populations, using different methods.
  • Plausibility: There’s a reasonable biological or mechanical explanation for how the cause could produce the effect.
  • Experiment: When you remove the suspected cause, does the effect go away? If workers are exposed to a harmful dust and you reduce that dust, do disease rates fall?
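The dose-response criterion in particular lends itself to a quick numerical check: does the effect rise consistently with the dose, and at roughly what rate? A minimal sketch with invented figures (not real epidemiological data):

```python
# Invented dose-response figures (illustrative only): cigarettes per day
# vs. relative lung-cancer death rate.
doses = [0, 10, 20, 30, 40]
rates = [1.0, 5.5, 10.8, 16.2, 21.6]

def is_monotone_increasing(ys):
    """True if every value is strictly higher than the one before it."""
    return all(a < b for a, b in zip(ys, ys[1:]))

def slope(xs, ys):
    """Least-squares slope: extra relative risk per additional cigarette."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

print(is_monotone_increasing(rates))      # risk rises with every dose step
print(round(slope(doses, rates), 2))      # roughly constant extra risk per cigarette
```

A steadily rising, near-linear pattern like this is much harder to explain away as coincidence than a single elevated number would be.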

No single criterion is enough on its own. Researchers weigh them together. The smoking-lung cancer link, for instance, satisfied virtually all of these criteria across decades of studies in animals and humans, in laboratories and communities, and in countries around the world. That’s what made it one of the most thoroughly established causal relationships in medicine.

The Counterfactual Test

At its core, causal thinking relies on a simple mental experiment: what would have happened if things had been different? This is called counterfactual reasoning. If a patient takes a medication and recovers, the causal question is whether they would have recovered without it. You can never observe both scenarios for the same person at the same time, which is exactly what makes causation so tricky to pin down.
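The counterfactual problem can be made concrete with the potential-outcomes framework: imagine each patient carries two outcomes, one under treatment and one without, of which reality reveals only one. A sketch with four hypothetical patients:

```python
# Potential-outcomes sketch (hypothetical patients): y1 is the outcome if
# treated, y0 the outcome if untreated. In reality we only ever see one.
patients = [
    {"y1": 1, "y0": 0},  # recovers only with the drug
    {"y1": 1, "y0": 1},  # recovers either way
    {"y1": 0, "y0": 0},  # recovers neither way
    {"y1": 1, "y0": 0},  # recovers only with the drug
]

# With godlike access to BOTH outcomes, the true average effect is visible:
true_effect = sum(p["y1"] - p["y0"] for p in patients) / len(patients)
print(true_effect)  # -> 0.5 (the drug helps 2 of 4 patients)

def observed(patient, treated):
    """The real world reveals only one of the two potential outcomes."""
    return patient["y1"] if treated else patient["y0"]
```

The whole difficulty of causal inference is that `true_effect` is computed from a column of data (`y0` for the treated, `y1` for the untreated) that no study can ever observe directly.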

This idea, imagining what would have happened under different conditions, actually inspired the invention of randomized experiments in the 1920s. By randomly assigning some people to get a treatment and others to get a placebo, researchers create two groups that are as similar as possible. Any difference in outcomes can then be attributed to the treatment itself rather than to some other factor. That’s why randomized controlled trials are considered the strongest tool for establishing causal links: random assignment balances out both known and unknown differences between groups, so the only systematic difference left is the thing being tested.
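Why random assignment works can be shown in a small simulation. In this sketch (with invented parameters), every subject has an unknown baseline recovery chance, and the treatment adds a fixed boost of 0.2; a coin-flip assignment recovers that true effect from the group averages:

```python
import random
random.seed(0)

# Hypothetical trial: each subject has an unknown baseline recovery
# probability, and the treatment adds a true boost of 0.2 on top of it.
def simulate_trial(n=10_000):
    treated_outcomes, control_outcomes = [], []
    for _ in range(n):
        baseline = random.uniform(0.2, 0.6)  # unknown individual differences
        treated = random.random() < 0.5      # coin-flip assignment
        p = baseline + (0.2 if treated else 0.0)
        outcome = 1 if random.random() < p else 0
        (treated_outcomes if treated else control_outcomes).append(outcome)
    return (sum(treated_outcomes) / len(treated_outcomes)
            - sum(control_outcomes) / len(control_outcomes))

effect = simulate_trial()
print(round(effect, 2))  # close to the true boost of 0.2
```

Because assignment is random, the hidden `baseline` values average out to the same level in both groups, so the difference in recovery rates isolates the treatment effect. No measurement of the baselines was needed.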

Three Levels of Causal Thinking

Computer scientist Judea Pearl, one of the most influential thinkers on causation, organized causal reasoning into three levels that build on each other.

The first level is association: simply observing patterns in data. A store notices that customers who buy toothpaste also tend to buy floss. This is pure correlation, and it requires no understanding of why.

The second level is intervention: asking what would happen if you actively changed something. What happens to headaches if you take aspirin? What happens to smoking rates if you ban cigarettes? This goes beyond observation because it involves predicting the consequences of an action. You can’t answer intervention questions from raw data alone, because people’s behavior changes when conditions change.

The third level is counterfactual: asking about events that didn’t happen. Was it the aspirin that stopped your headache, or would it have gone away on its own? Would a historical figure still be alive if a specific event hadn’t occurred? These questions require the deepest causal understanding because they involve reasoning backward about alternative realities. Each level requires more information than the one below it, and questions at a higher level genuinely cannot be answered using only lower-level data.
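The gap between the first two levels, association versus intervention, can be demonstrated directly. In this invented simulation, healthy people both take aspirin more often and recover more often, so the observed recovery gap between takers and non-takers overstates what aspirin actually does; forcing the choice (Pearl's "do" operation) reveals the true effect:

```python
import random
random.seed(1)

# Invented confounded world: being healthy raises both the chance of
# taking aspirin (self-selection) and the chance of recovering.
def sample(intervene=None):
    healthy = random.random() < 0.5
    if intervene is None:
        aspirin = random.random() < (0.8 if healthy else 0.2)  # self-selected
    else:
        aspirin = intervene                                    # forced by experimenter
    p_recover = 0.3 + (0.4 if healthy else 0.0) + (0.1 if aspirin else 0.0)
    return aspirin, random.random() < p_recover

def observational_gap(n=50_000):
    """Level 1: compare recovery among people who happened to take aspirin."""
    took, skipped = [], []
    for _ in range(n):
        aspirin, recovered = sample()
        (took if aspirin else skipped).append(recovered)
    return sum(took) / len(took) - sum(skipped) / len(skipped)

def interventional_gap(n=50_000):
    """Level 2: force aspirin on or off, breaking the link to health."""
    on = [sample(intervene=True)[1] for _ in range(n)]
    off = [sample(intervene=False)[1] for _ in range(n)]
    return sum(on) / len(on) - sum(off) / len(off)

print(round(observational_gap(), 2))   # large gap, inflated by health
print(round(interventional_gap(), 2))  # close to aspirin's true effect of 0.1
```

The observed association (around 0.34 here) is more than triple the true causal effect (0.1), which is exactly why level-2 questions cannot be answered from level-1 data alone.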

Common Mistakes in Causal Reasoning

The most famous error is called “post hoc ergo propter hoc,” a Latin phrase meaning “after this, therefore because of this.” It’s the assumption that because event B followed event A, A must have caused B. Aristotle identified this mistake over two thousand years ago, noting that politicians were especially prone to it: a leader would claim their opponent’s policy caused a war simply because the war came afterward.

This fallacy shows up in three common forms. The first is concluding that one specific event caused another just because it came first. The second is seeing two types of events happen in sequence repeatedly and assuming one causes the other. The third, and weakest, is seeing a single sequence (A happened, then B happened) and generalizing that A always causes B. All three mistakes confuse time order with causation.

In everyday life, these errors are easy to fall into. A new diet is followed by weight loss, so the diet “worked,” even though seasonal changes or reduced stress might explain it. A city installs speed cameras and accident rates drop, but they might have dropped anyway due to unrelated road improvements. Recognizing that sequence alone doesn’t prove causation is one of the most practically useful things about understanding what “causal” really means.

How Causal Evidence Is Used Today

In medicine and public health, establishing a causal link changes everything. Once smoking was shown to cause lung cancer (an estimated 90% of certain lung cancer types in smokers are attributable to smoking), governments had the justification to regulate tobacco. Causal evidence is what separates a health recommendation from a health hunch.

When randomized trials aren’t possible, because you can’t ethically assign people to smoke for 30 years, researchers increasingly use a framework called “target trial emulation.” The idea is to design an observational study so that it mimics a randomized trial as closely as possible. This approach, recently updated in the Annals of Internal Medicine, helps prevent the design flaws that creep in when analyzing existing health data and forces researchers to state their causal question clearly. As the authors put it, you can’t provide valid answers if you don’t know what the question is.

Whether you’re evaluating a news headline about a health risk, a marketing claim about a supplement, or a policy argument about crime rates, the question to ask is always the same: is this actually causal, or are two things just moving together? The answer depends on whether there’s a plausible mechanism, whether the evidence comes from studies designed to isolate cause and effect, and whether the relationship holds up across different conditions. That’s the difference between knowing that something is happening and knowing why.