What Is a Correlational Study? Definition and Types

A correlational study is a type of research that measures whether two variables are related, and how strongly, without manipulating either one. Researchers observe things as they naturally occur and then use statistics to determine if changes in one variable tend to accompany changes in another. It’s one of the most common research designs in psychology, medicine, and social science because it allows researchers to study relationships that would be unethical or impractical to test through experiments.

How Correlational Studies Work

The basic structure is straightforward: a researcher picks two or more variables, collects data on each, and then calculates whether they move together in a predictable pattern. Nobody is assigned to a group. Nobody receives a treatment. The researcher simply measures what already exists in the real world and looks for associations.

For example, a study of 333 male students examined the correlation between time spent playing computer games and aggressive behavior. Researchers didn’t tell some students to play more games and others to play fewer. They measured how much each student already played, measured their behavior, and checked whether the two variables were linked. Students who spent more time gaming did show more aggressive behaviors, but the study design alone couldn’t prove that gaming caused the aggression.

Data for correlational studies typically comes from surveys, questionnaires, existing records, or direct observation. The key feature is that researchers are recording what’s already happening rather than introducing a change and watching what follows.

Positive, Negative, and Zero Correlations

The relationship between two variables is expressed as a number called a correlation coefficient, represented by the letter “r.” This value ranges from −1 to +1, and it tells you two things: how strong the relationship is and which direction it goes.

  • Positive correlation (r = 0 to +1): Both variables move in the same direction. As one increases, the other tends to increase too. Height and shoe size are a classic example.
  • Negative correlation (r = 0 to −1): The variables move in opposite directions. As one increases, the other tends to decrease. Think of hours spent exercising and resting heart rate.
  • Zero correlation (r = 0): No predictable relationship exists between the two variables.

A perfect correlation of +1 or −1 almost never shows up in real-world research. Values closer to those extremes indicate a stronger, more reliable relationship, while values near zero suggest a weak one or no relationship at all. A correlation of 0.8 means the two variables track each other closely. A correlation of 0.2 means there’s a faint pattern, but plenty of individual variation.

Why Correlation Does Not Equal Causation

This is the single most important thing to understand about correlational research. Finding that two variables are related does not tell you that one causes the other. There are two specific problems that make causal claims impossible from correlational data alone.

The Directionality Problem

When two variables are correlated, you can’t tell which one is influencing the other. If a study finds a link between poor sleep and anxiety, it could mean poor sleep increases anxiety, or that anxiety disrupts sleep. The data looks identical in both cases. This is known as the directionality problem: you know the variables are related, but you don’t know which is the cause and which is the effect.

The Third Variable Problem

Sometimes two variables appear connected only because a hidden factor is driving both of them. The classic classroom example involves ice cream sales and shark attacks, which are positively correlated. Ice cream doesn’t attract sharks. Hot weather is the third variable: it sends more people to the beach (where sharks are) and also makes them buy ice cream.

Another good illustration: cities with more fire hydrants tend to have more dogs. The third variable is simply city size. Bigger cities have more of both. Without accounting for that hidden factor, you might draw a completely wrong conclusion about dogs and fire hydrants being meaningfully connected.

These alternative explanations are exactly what correlational studies cannot rule out. That’s not a flaw in the design so much as a trade-off. Correlational studies are useful for identifying patterns, but confirming cause and effect requires a different approach.

How Correlational Studies Differ From Experiments

The defining difference is manipulation. In a correlational study, a researcher observes naturally occurring variables. In an experiment, the researcher deliberately changes something and measures the result.

Say you want to know if watching violent television increases aggression in children. A correlational approach would survey children about their viewing habits and measure their aggressive behavior, then look for a statistical relationship. An experimental approach would randomly assign children to two groups, have one group watch violent content and the other watch neutral content, and then compare aggression between the groups.

That randomization step is critical. By randomly assigning participants, experiments ensure both groups are roughly equivalent at the start. Any difference that emerges afterward can be attributed to the variable the researcher changed. Correlational studies skip randomization entirely, which is why they can identify associations but can’t pin down causes.
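The balancing effect of random assignment can be shown with a small simulation (illustrative numbers only): shuffling a pool of hypothetical participants and splitting it in half leaves both groups with nearly identical averages on a pre-existing trait, before any treatment is applied.

```python
import random
from statistics import mean

random.seed(0)

# 1,000 hypothetical participants, each with a made-up pre-existing
# "baseline aggression" score (population mean 50).
baseline_scores = [random.gauss(50, 10) for _ in range(1000)]

# Random assignment: shuffle, then split into two equal groups.
random.shuffle(baseline_scores)
treatment, control = baseline_scores[:500], baseline_scores[500:]

# Both group means land close to 50 and close to each other.
print(round(mean(treatment), 1), round(mean(control), 1))
```

Because the groups start out roughly equivalent, any post-treatment difference can be attributed to the manipulated variable rather than to pre-existing differences.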

This doesn’t make correlational research inferior. Many important questions can’t be studied experimentally. You can’t randomly assign people to smoke for 20 years. You can’t randomly assign income levels or childhood experiences. In these cases, correlational studies provide the only ethical path to understanding. Large-scale correlational research on smoking and lung cancer, for instance, was instrumental in establishing that link long before experimental mechanisms were fully understood.

When Researchers Use This Design

Correlational studies are especially valuable in three situations. First, when manipulating a variable would be unethical. Researchers studying the relationship between childhood trauma and adult mental health can’t expose children to trauma on purpose, so they measure both variables as they naturally occur. Second, when the research question involves variables that can’t realistically be controlled, like personality traits, genetic factors, or socioeconomic status. Third, when the goal is exploratory. Before investing in a costly experiment, researchers often run correlational studies to check whether a relationship even exists. If two variables show no correlation, there’s little reason to design an experiment testing whether one causes the other.

Correlational research also works well as a starting point that generates hypotheses for later experimental testing. A correlation between social media use and depression might prompt researchers to design an experiment where one group reduces screen time and another doesn’t, allowing a stronger causal claim.

How to Judge the Strength of a Correlation

Beyond the correlation coefficient itself, researchers look at statistical significance to determine whether a result is meaningful or likely due to chance. The standard threshold is a p-value below 0.05, meaning there’s less than a 5% chance of observing a correlation that strong if no true relationship existed. Some fields set stricter thresholds. Genome-wide studies in genetics, for instance, typically require p-values below 0.00000005 (5 × 10⁻⁸) to account for the massive number of comparisons being made.

But statistical significance doesn’t automatically mean practical significance. A correlation can be statistically significant (unlikely to be random) while still being so small that it has little real-world meaning. A correlation of r = 0.08 between two variables might reach statistical significance in a study of 10,000 people simply because of the large sample size, yet the actual relationship between the variables is negligibly weak. Both the strength of the correlation and its significance matter when interpreting results.

When reading about correlational findings, the most useful questions to ask are: how strong is the correlation, how large was the sample, and what third variables might explain the relationship? Those three checks will help you distinguish between a meaningful pattern and a statistical coincidence.