When Cluster Sampling Is and Isn’t Appropriate

Cluster sampling is appropriate when your target population is large and spread across a wide area, and you either lack a complete list of every individual or can’t practically reach a random selection of them. It’s the go-to method when traveling to every corner of a population would be too expensive or logistically impossible, but you can identify natural groupings (schools, neighborhoods, clinics) that collectively represent the whole.

Understanding when this approach makes sense, and when it doesn’t, comes down to a few practical and statistical factors.

No Complete List of Individuals Exists

One of the strongest reasons to use cluster sampling is that you don’t have a sampling frame listing every person in your population. Imagine trying to survey all seventh-graders in a large city. There’s no master roster of every student, but there is a list of every school. Cluster sampling lets you randomly select schools, then study the students within them. You’re working from the list you actually have (schools) rather than the one you don’t (individual students).

This situation is common in public health, education research, and social science. A complete list of primary sampling units, like villages, clinics, or census blocks, is often available even when a complete list of individuals is not. Cluster sampling turns that limitation into a workable design.

The Population Is Geographically Spread Out

Cost and logistics are the most common practical reasons to choose cluster sampling. If your population spans hundreds of locations, sending data collectors to a random sample of individuals scattered across all of them is expensive and slow. Cluster sampling concentrates your fieldwork in a manageable number of locations.

The World Health Organization uses exactly this logic for vaccination coverage surveys worldwide. Their standard methodology, outlined in the 2018 WHO Vaccination Coverage Cluster Survey manual, selects clusters of communities rather than individual children scattered across entire countries. This makes large-scale health measurement feasible in settings where resources and infrastructure are limited.

Cluster sampling is also appropriate when the research question itself is geographical. If you’re studying health disparities between neighborhoods or how people in the same area influence each other’s behavior, selecting specific places as clusters aligns the sampling method with what you’re trying to learn.

How Clusters Should Look

Cluster sampling works best when each cluster is a miniature version of the whole population. Ideally, the people within any single cluster are diverse, reflecting the same range of characteristics you’d find across the entire group. Schools in a city, for instance, each contain students from various backgrounds, ability levels, and socioeconomic situations.

This is the opposite of what you’d want in stratified sampling, where you create groups of similar people. In cluster sampling, you want each group to be internally varied. The more alike people within a cluster are, the less new information each additional person adds, and the less precise your results become.

The Statistical Tradeoff You Accept

Cluster sampling is less statistically precise than simple random sampling with the same number of participants. The reason is straightforward: people who live in the same neighborhood, attend the same school, or visit the same clinic tend to resemble each other. That similarity means your data points aren’t fully independent, so each additional person from the same cluster contributes less information than an independently sampled person would, which shrinks your effective sample size.

This similarity is measured by the intracluster correlation coefficient (ICC). In practice it falls between 0 and 1. An ICC of 0 means people within clusters are no more alike than people across the entire population, and your cluster design loses nothing compared to random sampling. An ICC of 1 means everyone in a cluster gives identical responses, and your effective sample size shrinks to just the number of clusters, no matter how many people you interview within each one.
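To make the two extremes concrete, here is a minimal sketch of one common way to estimate the ICC from data, the one-way ANOVA estimator. The article does not say which estimator to use, so this choice is an assumption, as are the function name anova_icc and the toy data. The sketch also assumes every cluster has the same number of respondents.

```python
from statistics import mean

def anova_icc(clusters):
    """One-way ANOVA estimate of the ICC (assumes equal-sized clusters)."""
    k = len(clusters)        # number of clusters
    n = len(clusters[0])     # respondents per cluster
    if k < 2 or n < 2 or any(len(c) != n for c in clusters):
        raise ValueError("need at least 2 clusters, all of the same size >= 2")

    all_values = [x for c in clusters for x in c]
    grand_mean = mean(all_values)
    cluster_means = [mean(c) for c in clusters]

    # Between-cluster mean square: how far cluster means sit from the grand mean.
    msb = n * sum((m - grand_mean) ** 2 for m in cluster_means) / (k - 1)
    # Within-cluster mean square: how far individuals sit from their cluster mean.
    msw = sum(
        (x - m) ** 2 for c, m in zip(clusters, cluster_means) for x in c
    ) / (k * (n - 1))

    denominator = msb + (n - 1) * msw
    if denominator == 0:
        raise ValueError("all responses are identical; ICC is undefined")
    return (msb - msw) / denominator

# Toy data: everyone in a cluster gives the same answer.
identical_within = [[1, 1, 1], [5, 5, 5], [9, 9, 9]]
print(anova_icc(identical_within))  # 1.0
```

Sample estimates from this formula can come out slightly below 0 when clusters happen to be more internally varied than chance alone would predict. Analysts commonly treat such values as 0 for planning purposes.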

In most real-world studies, the ICC is small but not zero. Even a small ICC, though, can meaningfully reduce your statistical power when clusters are large. This is captured by the design effect formula:

Design Effect = 1 + (n – 1) × ICC

Here, n is the number of individuals sampled per cluster. If you sample 50 people per cluster and the ICC is 0.05, your design effect is 1 + 49 × 0.05 = 3.45. That means you’d need roughly 3.5 times as many total participants as you would with simple random sampling to achieve the same precision. You’re trading statistical efficiency for practical feasibility.
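The arithmetic above can be checked with a short sketch. The function names design_effect and effective_sample_size are illustrative, and the total of 1,000 participants is a made-up figure, not one from the article.

```python
def design_effect(n_per_cluster, icc):
    """Design effect = 1 + (n - 1) * ICC, with n individuals per cluster."""
    return 1 + (n_per_cluster - 1) * icc

def effective_sample_size(total_n, n_per_cluster, icc):
    """Size of the simple random sample that would give the same precision."""
    return total_n / design_effect(n_per_cluster, icc)

# The worked example from the text: 50 people per cluster, ICC of 0.05.
print(round(design_effect(50, 0.05), 2))  # 3.45

# Hypothetical survey: 20 clusters of 50 people, 1,000 participants in total.
print(round(effective_sample_size(1000, 50, 0.05), 1))  # 289.9

# At ICC = 1 the effective sample size collapses to the number of clusters.
print(effective_sample_size(1000, 50, 1))  # 20.0
```

Read in the other direction, multiplying the sample size a simple random design would need by the design effect gives the total a cluster design needs for comparable precision.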

When Cluster Sampling Is Not Appropriate

If you already have a complete list of your population and can easily reach a random selection of individuals, simple random or stratified sampling will give you better precision for the same sample size. Cluster sampling introduces unnecessary imprecision when the logistical problems it solves don’t actually exist.

It’s also a poor fit when clusters are very homogeneous. If everyone within a given school, clinic, or village is similar on the trait you’re measuring, each cluster adds little new information. You’d need a very large number of clusters to compensate, which may erase the cost savings that motivated the design in the first place.

Cluster sizes that vary dramatically also create problems. When some clusters are much larger than others, using the average cluster size to plan your sample will underestimate how many participants you need. Researchers generally recommend adjusting for this when the variation in cluster size is large, specifically when the coefficient of variation of cluster size (the standard deviation divided by the mean) exceeds 0.23.
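The article does not spell out the adjustment itself. One widely used version, assumed here, replaces the cluster size in the design effect formula with the mean cluster size inflated by (1 + CV²), where CV is the coefficient of variation of cluster size. The sketch below uses that form; the function names and the example sizes are hypothetical, and the use of the population standard deviation (pstdev) rather than the sample version is also an assumption.

```python
from statistics import mean, pstdev

def coefficient_of_variation(sizes):
    """Standard deviation of cluster size divided by mean cluster size."""
    return pstdev(sizes) / mean(sizes)

def adjusted_design_effect(sizes, icc):
    """Design effect with the mean cluster size inflated by (1 + CV^2).

    Assumed form: 1 + ((1 + CV^2) * mean_size - 1) * ICC.
    """
    mean_size = mean(sizes)
    cv = coefficient_of_variation(sizes)
    return 1 + ((1 + cv ** 2) * mean_size - 1) * icc

# Hypothetical plans with the same mean cluster size (50) and ICC (0.05).
equal_sizes = [50, 50, 50, 50]
unequal_sizes = [10, 90, 10, 90]

print(round(coefficient_of_variation(unequal_sizes), 2))      # 0.8
print(round(adjusted_design_effect(equal_sizes, 0.05), 2))    # 3.45
print(round(adjusted_design_effect(unequal_sizes, 0.05), 2))  # 5.05
```

With equal cluster sizes the CV is 0 and the result matches the unadjusted design effect. With highly unequal sizes the same mean cluster size produces a noticeably larger design effect, which is why planning on the average alone understates the sample you need.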

One-Stage vs. Two-Stage Designs

In one-stage cluster sampling, you randomly select clusters and then include every individual within each selected cluster. This is simpler to execute but can result in very large samples if clusters contain many people.

In two-stage cluster sampling, you randomly select clusters first, then randomly sample individuals within each chosen cluster. This gives you more control over your total sample size and is the more common approach in large surveys. Multi-stage designs extend this further, selecting progressively smaller units: first regions, then districts within regions, then households within districts.

The WHO vaccination surveys, for example, use a two-stage approach: first selecting clusters of communities, then sampling eligible children within those communities. This balances coverage across a country with a manageable workload at each site.
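The difference between the one-stage and two-stage designs can be sketched in a few lines. Everything here is illustrative: the sampling frame of schools and student IDs is invented, the function names are hypothetical, and clusters are drawn with equal probability for simplicity, whereas many large surveys select clusters with probability proportional to size.

```python
import random

def one_stage_sample(frame, n_clusters, rng):
    """Randomly select clusters, then keep every individual in each one."""
    chosen = rng.sample(sorted(frame), n_clusters)
    return {cluster: list(frame[cluster]) for cluster in chosen}

def two_stage_sample(frame, n_clusters, n_per_cluster, rng):
    """Randomly select clusters, then randomly sample individuals within each."""
    chosen = rng.sample(sorted(frame), n_clusters)
    return {
        cluster: rng.sample(frame[cluster], min(n_per_cluster, len(frame[cluster])))
        for cluster in chosen
    }

# Hypothetical frame: 8 schools with 30 to 100 students each.
frame = {
    f"school_{i}": [f"student_{i}_{j}" for j in range(30 + 10 * i)]
    for i in range(8)
}

rng = random.Random(42)  # fixed seed so the draw is reproducible
one_stage = one_stage_sample(frame, 3, rng)
two_stage = two_stage_sample(frame, 3, 10, rng)

# One-stage totals depend on which schools were drawn; two-stage totals are fixed.
print(sum(len(students) for students in one_stage.values()))
print(sum(len(students) for students in two_stage.values()))  # 30
```

The two-stage total is always the number of clusters times the per-cluster quota (as long as every cluster is at least that large), which is the control over total sample size described above.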

Practical Checklist

Cluster sampling is most appropriate when several of these conditions are true:

- No individual-level sampling frame exists, but a list of groups or locations is available
- The population is geographically dispersed, making it expensive to reach randomly selected individuals
- Budget or time constraints require concentrating data collection in fewer locations
- Natural groupings exist (schools, clinics, villages, city blocks) that are internally diverse
- You can increase your total sample size to compensate for the design effect

When these conditions align, cluster sampling isn’t just appropriate. It’s often the only realistic way to study a large population with limited resources.