What a Cluster Randomized Trial Is and How It Works

A cluster randomized trial is a type of experiment where groups of people, rather than individuals, are randomly assigned to receive different treatments or interventions. These groups, called clusters, can be hospitals, schools, villages, families, or medical practices. Outcomes are then measured either for the entire group or for a sample of individuals within each group.

How It Differs From a Standard Randomized Trial

In a traditional randomized trial, each person is individually assigned to either the treatment group or the control group. In a cluster randomized trial, the assignment happens one level up. An entire hospital might be assigned to use a new infection-prevention protocol, while another hospital continues with standard care. Everyone within that hospital receives the same approach.

This distinction matters because people within the same cluster tend to be more alike than people in different clusters. Patients at the same clinic share the same doctors, the same building, and often similar demographics. Students at the same school share teachers and resources. This built-in similarity has major consequences for how the trial is designed, powered, and analyzed.

Why Researchers Use Cluster Randomization

The most common reason is to prevent contamination, which happens when people in the control group accidentally receive elements of the intervention. If you’re testing whether a new hand-hygiene training program reduces infections in a hospital ward, you can’t train half the nurses on a ward and expect the other half to behave as if nothing changed. The trained nurses would influence untrained colleagues simply by working alongside them. Randomizing the entire ward solves this problem.

Logistics also play a role. It’s often simpler to deliver an intervention the same way to everyone in a location than to customize it person by person within the same setting. Some interventions only work at the group level in the first place: a new school curriculum, a community water fluoridation program, or a hospital-wide electronic health record system can’t meaningfully be given to some individuals and withheld from others in the same place.

Researchers also choose cluster designs to avoid what are called “disappointment effects.” If a trial is testing something participants clearly want, like a cash transfer program, individually randomizing people within the same community can create resentment or behavioral changes in the control group that distort results.

The Statistical Trade-Off

Cluster randomization comes with a cost: you need more participants to detect the same effect. Because people within the same cluster resemble each other, each additional person in a cluster provides less new information than a completely independent participant would. A trial of 500 people spread across 50 clinics gives you more statistical power than 500 people spread across 5 clinics, even though the total number of participants is identical.
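The clinic comparison above can be made concrete with the "effective sample size": the total N divided by 1 + (m − 1) × ICC, where m is the average cluster size (this quantity, the design effect, is discussed below). The sketch assumes an illustrative ICC of 0.05; the numbers are for demonstration only.

```python
# A quick sketch (illustrative numbers only) of why 500 people in 50
# clinics carry more information than 500 people in 5 clinics.
def effective_n(total_n, clusters, icc):
    """Effective number of independent participants: N / (1 + (m - 1) * ICC)."""
    m = total_n / clusters                  # average cluster size
    return total_n / (1 + (m - 1) * icc)

print(round(effective_n(500, 50, 0.05)))    # 345 -- clusters of 10
print(round(effective_n(500, 5, 0.05)))     # 84  -- clusters of 100
```

Same 500 participants, but the many-small-clusters design behaves like a trial four times larger.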

Researchers quantify this similarity using something called the intraclass correlation coefficient, or ICC. The ICC ranges from 0 (people within a cluster are no more alike than people in different clusters) to 1 (everyone in a cluster has the exact same outcome). Even a small ICC, like 0.05, can substantially inflate the required sample size.
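As an illustration of what the ICC measures, a common way to estimate it from balanced cluster data is the one-way ANOVA estimator. The sketch below (all numbers invented for the example) simulates clusters that share a common effect, so the true ICC is about 0.06, and then recovers an estimate from the data.

```python
# A minimal sketch (simulated data, not from the article) of estimating
# the ICC for a balanced design with the one-way ANOVA estimator.
import numpy as np

def anova_icc(data):
    """data: 2-D array, one row per cluster, one column per member."""
    k, m = data.shape                                   # k clusters of m members
    cluster_means = data.mean(axis=1)
    grand_mean = data.mean()
    # Mean squares between and within clusters
    msb = m * np.sum((cluster_means - grand_mean) ** 2) / (k - 1)
    msw = np.sum((data - cluster_means[:, None]) ** 2) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

rng = np.random.default_rng(0)
# 50 clusters of 20: a shared cluster effect plus individual noise,
# giving a true ICC of 0.25**2 / (0.25**2 + 1) ~ 0.059
cluster_effect = rng.normal(0, 0.25, size=(50, 1))
outcomes = cluster_effect + rng.normal(0, 1.0, size=(50, 20))
print(round(anova_icc(outcomes), 3))
```

The estimate will land near the true value; with fewer or smaller clusters it becomes much noisier, which is one reason planning-stage ICC values are so uncertain.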

The key calculation is the design effect, which tells you how much larger your sample needs to be compared to a standard individually randomized trial. The formula is straightforward: design effect = 1 + (m − 1) × ICC, where m is the average cluster size. So if your clusters average 50 people and the ICC is 0.05, the design effect is 1 + (49 × 0.05) = 3.45. You’d need roughly 3.5 times as many participants as you would in an individual trial. Ignoring this inflation leads to underpowered studies that may miss real effects.
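The arithmetic above fits in a few lines. In this sketch, the target of 200 participants for the individually randomized comparison is a hypothetical figure chosen for illustration.

```python
# A minimal sketch of the sample-size inflation described above.
def design_effect(avg_cluster_size, icc):
    """DEFF = 1 + (m - 1) * ICC for average cluster size m."""
    return 1 + (avg_cluster_size - 1) * icc

# The article's example: clusters averaging 50 people, ICC = 0.05
deff = design_effect(50, 0.05)
print(round(deff, 2))                 # 3.45

# If a standard power calculation called for 200 individually
# randomized participants (hypothetical), the cluster trial needs:
n_individual = 200
print(round(n_individual * deff))     # 690
```

Multiplying the individually randomized sample size by the design effect is the standard first-pass correction; a full calculation would also fix the number of clusters, since adding clusters helps far more than enlarging them.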

Common Cluster Units

The choice of cluster depends on the research question and the level at which the intervention operates. Common units include:

  • Hospitals or hospital wards for infection control, safety protocols, or system-level changes
  • Primary care clinics for testing new care delivery models or screening programs
  • Schools or classrooms for educational or behavioral interventions in children
  • Villages or communities for public health campaigns, sanitation projects, or vaccination strategies
  • Families or households for interventions targeting home environments

Trials can also have multiple levels of clustering. A cardiovascular health trial, for example, might measure outcomes on individual patients who are nested within providers, who work in clinics, which sit inside hospitals. Researchers must decide which level to randomize at, and each choice brings different trade-offs in statistical power, cost, and risk of bias.

The Stepped-Wedge Variation

In a standard parallel cluster trial, some clusters get the intervention and others serve as controls for the entire study. A stepped-wedge design works differently: all clusters start in the control condition, and then groups of clusters cross over to the intervention at staggered time points until every cluster has received it.

This design has several practical advantages. Because every cluster eventually gets the intervention, it’s easier to recruit sites when stakeholders believe the treatment is beneficial. The staggered rollout also makes resource management more feasible, since you don’t need to launch everywhere at once. Each cluster serves as its own control (before versus after crossing over), which can improve statistical efficiency. Stepped-wedge designs are especially popular for evaluating health system changes that organizations plan to implement anyway.
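The staggered rollout can be written down as a cluster-by-period schedule. This is a toy sketch, not a real trial-design tool: six hypothetical clusters cross over in three steps, with 0 meaning control and 1 meaning intervention.

```python
# A toy sketch of a stepped-wedge schedule: every cluster starts in
# control (0) and switches to intervention (1) at a staggered period.
def stepped_wedge_schedule(n_clusters, n_steps):
    """One baseline period, then n_clusters // n_steps clusters cross over per step."""
    per_step = n_clusters // n_steps
    periods = n_steps + 1                      # baseline + one period per step
    schedule = []
    for c in range(n_clusters):
        crossover = 1 + c // per_step          # period at which cluster c switches
        schedule.append([1 if t >= crossover else 0 for t in range(periods)])
    return schedule

for row in stepped_wedge_schedule(6, 3):
    print(row)
# [0, 1, 1, 1]
# [0, 1, 1, 1]
# [0, 0, 1, 1]
# [0, 0, 1, 1]
# [0, 0, 0, 1]
# [0, 0, 0, 1]
```

Reading down the columns shows the "wedge": at any interior period some clusters are under the intervention and some are not, which is what lets the analysis separate the treatment effect from time trends.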

Ethical Considerations

Cluster trials raise unique consent questions. In an individual trial, each person agrees to participate before being randomized. In a cluster trial, randomization often happens before individual participants are even identified. A hospital is assigned to a new protocol, and then patients show up over the following months.

The Ottawa Statement, a widely referenced set of ethical guidelines for cluster trials, addresses this directly. When consent can’t be obtained before randomization, researchers should seek it as soon as a participant is identified but before any study procedures or data collection begin. Importantly, a gatekeeper, such as a school principal or hospital administrator, can grant permission for their institution to participate, but this does not replace the need for individual informed consent from participants. Gatekeepers cannot provide proxy consent on behalf of the people in their cluster.

Known Risks of Bias

Cluster trials are more vulnerable to certain biases than individual trials. The most significant is recruitment bias: if researchers or clinicians know which clusters are assigned to which group (and blinding is often impossible when entire sites receive different protocols), they may consciously or unconsciously recruit different types of participants into intervention versus control clusters. A review of 24 cluster trials published in leading medical journals found that, among the trials where recruitment bias was possible, more than half showed evidence of differential recruitment rates between groups.

The same review found that seven studies could have recruited participants before cluster randomization but chose not to, unnecessarily increasing their risk of selection bias. Better design choices, like identifying and enrolling participants before revealing which cluster gets which treatment, can reduce this problem substantially.

Reporting Standards

Because cluster trials have features that standard trial reports don’t capture, the CONSORT guidelines (the international standard for reporting randomized trials) include a specific extension for cluster designs. Researchers are expected to report the rationale for choosing a cluster design, how clustering was accounted for in sample size calculations, how clustering was handled in the statistical analysis, and the flow of both clusters and individuals through the trial from assignment to final analysis. Reports should also specify the ICC used in planning and the number of clusters, not just the number of individual participants.