A true experiment is a research design where the researcher deliberately changes one variable, randomly assigns participants to groups, and compares outcomes against a control group. These three elements (manipulation, random assignment, and control) are what separate true experiments from every other type of study. If any one of them is missing, the study falls into a different category.
The Three Core Elements
Every true experiment starts with manipulation: the researcher purposefully changes something in the environment to see what happens. The thing being changed is called the independent variable. The thing being measured as a result is the dependent variable. If you’re testing whether a new study technique improves test scores, the study technique is the independent variable and the test score is the dependent variable.
Random assignment means every participant has an equal chance of ending up in any group. This is the single most important feature that distinguishes a true experiment from other designs. It works because personal differences between participants (things like motivation, health, age, or background) get distributed roughly evenly across groups. No group is stacked with people who were already more likely to succeed or fail. As long as each group ends up with a similar mix of these characteristics, no pre-existing difference between the groups can explain away the results.
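The mechanics of random assignment take only a few lines of code: shuffle the participant list, then deal people into groups. A sketch in Python (the function name and participant labels are hypothetical):

```python
import random

def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle the participant list, then deal members into groups
    round-robin, so every person has an equal chance of any group."""
    rng = random.Random(seed)
    shuffled = participants[:]  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = [[] for _ in range(n_groups)]
    for i, person in enumerate(shuffled):
        groups[i % n_groups].append(person)
    return groups

# Hypothetical participant pool of 20 people
people = [f"P{i:02d}" for i in range(20)]
treatment, control = randomly_assign(people, n_groups=2, seed=42)
print(len(treatment), len(control))  # 10 10
```

Because the shuffle is uniform, any trait a participant brings along is as likely to land in one group as the other, which is exactly the balancing property the text describes.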
Control means the researcher prevents outside factors from influencing the outcome. At minimum, this involves a control group that doesn’t receive the experimental treatment. The control group acts as a baseline: if the experimental group improves but the control group doesn’t, you have stronger evidence that the treatment caused the change, not some unrelated event happening at the same time.
Why Random Assignment Matters So Much
Random assignment is what gives true experiments their power to show cause and effect. Without it, you can only show that two things are related, not that one caused the other. Consider a study comparing two teaching methods where students pick which class to join. Students who choose the more intensive method might already be more motivated, so any improvement in their scores could reflect motivation rather than the method itself. Random assignment eliminates that problem by making the groups equivalent before the experiment even begins.
This is the key distinction between a true experiment and a quasi-experiment. Quasi-experiments still involve manipulation and measurement, but they lack random assignment. A researcher studying the effect of a new hospital policy, for example, can’t randomly assign patients to hospitals. They have to compare hospitals that adopted the policy with those that didn’t, and any pre-existing differences between those hospitals become potential alternate explanations for the results.
How True Experiments Handle Bias
Even with random assignment, bias can creep in through the people running the study or the participants themselves. If participants know they’re receiving the experimental treatment, their expectations alone can change their behavior or how they report symptoms. This is the placebo effect. If researchers know which group a participant belongs to, they might unconsciously treat that person differently or interpret ambiguous results more favorably.
Blinding is the standard solution. In a single-blind study, participants don’t know which group they’re in. In a double-blind study, neither the participants nor the researchers collecting data know who received the treatment. Double-blinding minimizes observer bias (where researchers see what they expect to see) and confirmation bias (where they interpret data in ways that support their hypothesis). It also reduces the placebo effect by keeping participants genuinely uncertain about whether they received the real treatment.
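In practice, double-blinding is often implemented with an allocation schedule held by someone outside the study team: participants and data collectors see only opaque codes, and the table linking codes to treatments stays sealed until analysis. A minimal sketch of that idea (the function name, code format, and arm labels are all illustrative, and it assumes an even number of participants):

```python
import random

def make_blinded_schedule(n_participants, seed=None):
    """Build a double-blind allocation schedule. Each participant ID maps
    to an opaque kit code; only a sealed lookup table (held by a third
    party, not the researchers collecting data) links kit codes to arms.
    Assumes an even number of participants for a 50/50 split."""
    rng = random.Random(seed)
    arms = ["treatment", "placebo"] * (n_participants // 2)
    rng.shuffle(arms)
    # Unique opaque codes, so the label itself reveals nothing about the arm.
    kit_codes = [f"KIT-{n}" for n in rng.sample(range(10000, 100000), n_participants)]
    visible = {f"ID-{i:03d}": code for i, code in enumerate(kit_codes)}  # staff see this
    sealed = dict(zip(kit_codes, arms))  # opened only at analysis time
    return visible, sealed

visible, sealed = make_blinded_schedule(20, seed=1)
print(len(visible), list(sealed.values()).count("treatment"))  # 20 10
```

Everyone running the study works from the `visible` table alone, which is what keeps both observer bias and the placebo effect in check.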
What True Experiments Protect Against
Researchers worry about “threats to internal validity,” which is a formal way of asking: could something other than the independent variable explain the results? In 1957, psychologist Donald Campbell identified seven classic threats, and the design of a true experiment addresses most of them.
History refers to outside events that occur during the study. If a public health campaign launches in the middle of your drug trial, it could affect outcomes independently. A control group experiencing the same event helps you spot this. Maturation means participants naturally change over time: they get older, more tired, or simply more familiar with the testing process. Again, the control group matures at the same rate, so any difference between groups still points to the treatment.
Selection bias occurs when groups differ from the start. Random assignment directly neutralizes this. Attrition (sometimes called mortality in research terms) happens when participants drop out, and if dropouts are concentrated in one group, the remaining participants may no longer be comparable. Statistical regression is the tendency for extreme scores to move toward the average on retesting, which can mimic a treatment effect if you selected participants based on extreme scores. True experimental designs, by comparing equivalent groups over the same timeframe, control for all of these simultaneously.
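Statistical regression is easy to see in simulation: generate noisy test and retest scores, select the people with the most extreme first scores, and watch their group average drift back toward the population mean on the retest with no treatment at all. A small sketch (all numbers are made up for illustration):

```python
import random
import statistics

rng = random.Random(0)

# Stable true ability; each observed score adds independent measurement noise.
abilities = [rng.gauss(100, 10) for _ in range(1000)]
test1 = [a + rng.gauss(0, 15) for a in abilities]
test2 = [a + rng.gauss(0, 15) for a in abilities]

# Enroll the 50 lowest first-test scorers, as if selecting for a remedial
# program based on extreme scores, with no treatment at all.
worst = sorted(range(1000), key=lambda i: test1[i])[:50]
mean1 = statistics.mean(test1[i] for i in worst)
mean2 = statistics.mean(test2[i] for i in worst)

# The selected group's retest average moves back toward the population
# mean of 100, which could be mistaken for a treatment effect.
print(round(mean1, 1), "->", round(mean2, 1))
```

A randomly assigned control group drawn from the same extreme scorers would regress by the same amount, which is how the true experimental design separates this artifact from a real effect.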
A True Experiment in Practice
A study by researchers McCoy and Major illustrates how a true experiment works in psychology. They wanted to test whether reading about prejudice against your own racial group affects depression levels. Participants were randomly assigned to one of two groups. The experimental group read an article describing severe, pervasive prejudice against their own racial group. The control group read an article about prejudice against a different racial group.
After reading, both groups completed a depression measure. Those who read about prejudice targeting their own group reported higher depression scores than those in the control group. Because participants were randomly assigned (not self-selected based on existing sensitivity to prejudice), and because the only difference between groups was which article they read, the researchers could reasonably conclude that the content of the article caused the change in depression levels.
The Trade-Off With Real-World Relevance
True experiments are the gold standard for establishing cause and effect, but that precision comes at a cost. The tightly controlled conditions that make results trustworthy can also make them less applicable to everyday life. This tension is known as the trade-off between internal validity (confidence that the treatment caused the effect) and external validity (confidence that the finding applies outside the lab).
Laboratory studies of how drugs impair psychomotor performance, for instance, typically test relaxed, rested, healthy volunteers in a controlled environment. The demands that stressed patients face in daily life are very different, which means the lab results may not translate directly. A drug that causes mild impairment in a calm testing room might cause much more noticeable problems for someone juggling work, fatigue, and other medications.
There are also situations where true experiments simply aren’t possible. You can’t randomly assign people to smoke for 20 years, live in poverty, or experience childhood trauma. Ethical constraints mean that many of the most important questions in health and social science must be studied with observational or quasi-experimental methods instead. True experiments remain the strongest tool for causal claims, but they’re one tool among several, and recognizing what they can and can’t do is part of understanding how research works.
How Results Are Analyzed
Once a true experiment is complete, researchers use statistical tests to determine whether the difference between groups is meaningful or could have happened by chance. When comparing two groups (treatment vs. control), the most common approach is a t-test, which evaluates whether the average outcomes are far enough apart to be considered statistically significant. When an experiment involves three or more groups, researchers use a method called analysis of variance, or ANOVA, which compares all group averages simultaneously rather than testing each pair separately.
Both methods produce a p-value: the probability of seeing a difference at least as large as the one observed if the treatment actually had no effect. A p-value below 0.05 is the conventional threshold, meaning that if the treatment did nothing, a difference this big would turn up less than 5% of the time by chance. This doesn't guarantee the treatment works in every case, but it provides a standardized way to distinguish real effects from random noise in the data.
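The logic behind these tests can be illustrated with a permutation test, a stdlib-only stand-in for the t-test described above: if the treatment truly did nothing, shuffling the group labels should produce differences as large as the observed one fairly often, and the p-value is simply how often that happens. A sketch with made-up scores:

```python
import random
import statistics

def permutation_p_value(treatment, control, n_perm=10_000, seed=0):
    """Two-sided permutation test: how often does a random relabeling of
    the pooled scores produce a group difference at least as large as
    the one actually observed?"""
    rng = random.Random(seed)
    observed = abs(statistics.mean(treatment) - statistics.mean(control))
    pooled = treatment + control  # new list, safe to shuffle in place
    n = len(treatment)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n]) - statistics.mean(pooled[n:]))
        if diff >= observed:
            hits += 1
    return hits / n_perm

# Made-up scores: the treatment group clearly outscores the control group.
treatment = [78, 85, 82, 88, 90, 84, 86, 83]
control = [70, 72, 68, 75, 71, 69, 74, 73]
print(permutation_p_value(treatment, control))  # well below 0.05
```

With a gap this clean, almost no random relabeling reproduces the observed difference, so the p-value comes out near zero; two identical groups would give a p-value of 1.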

