Nonresponse bias occurs when people who don’t participate in a survey differ in meaningful ways from people who do, skewing the results. If a health survey gets responses mostly from people who are already engaged with the healthcare system, the findings will overrepresent their experiences and miss the perspectives of those who aren’t. The bias isn’t caused by having fewer responses. It’s caused by having the wrong mix of responses.
How Nonresponse Bias Works
Every survey starts with a target population: the full group of people you want to learn about. A sample is drawn from that population, and ideally everyone in the sample responds. In practice, some people refuse, can’t be reached, or simply ignore the request. That’s nonresponse. It only becomes nonresponse bias when the people who skip the survey are systematically different from those who complete it on the very things the survey is trying to measure.
The size of the bias depends on two factors working together: how many people didn’t respond, and how different those non-responders are from responders on key variables. You can express this relationship simply. The bias equals the nonresponse rate multiplied by the average difference between what responders reported and what non-responders would have reported. A survey with a 40% nonresponse rate but almost no difference between the two groups produces little bias. A survey with a 15% nonresponse rate where non-responders hold sharply different views can be substantially biased. This is why chasing a high response rate alone doesn’t guarantee accurate results.
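To make that arithmetic concrete, here is a minimal sketch in Python; the function name and the numbers are invented for illustration, not drawn from any cited study:

```python
# Minimal sketch of the relationship described above. All numbers
# are illustrative, not from any real survey.

def nonresponse_bias(nonresponse_rate, responder_mean, nonresponder_mean):
    """Bias of the respondent mean: the nonresponse rate times the
    responder/non-responder difference on the variable of interest."""
    return nonresponse_rate * (responder_mean - nonresponder_mean)

# 40% nonresponse, but responders and non-responders nearly agree:
print(nonresponse_bias(0.40, 72.0, 71.5))   # 0.2 -> little bias

# 15% nonresponse, but non-responders differ sharply:
print(nonresponse_bias(0.15, 72.0, 50.0))   # 3.3 -> much larger bias
```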
Unit Nonresponse vs. Item Nonresponse
Nonresponse comes in two forms. Unit nonresponse happens when a person doesn’t participate in the survey at all. They never return the questionnaire, never pick up the phone, never click the link. The entire person is missing from the dataset.
Item nonresponse is more selective. A person completes most of the survey but skips certain questions. Income questions are a classic example: people participate willingly but leave the salary field blank. This creates gaps in specific variables rather than removing the person entirely. Both types can introduce bias if the missing data follows a pattern. If wealthier respondents skip income questions at higher rates, the income data left behind will underrepresent high earners.
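A small hypothetical example of how the two forms show up in a dataset (pandas, with all values invented):

```python
import numpy as np
import pandas as pd

# Hypothetical five-person sample. Person 3 is unit nonresponse (no row
# at all); person 5 is item nonresponse (participated but skipped the
# income question).
responses = pd.DataFrame({
    "person_id": [1, 2, 4, 5],                       # person 3 never responded
    "age":       [34, 51, 29, 62],
    "income":    [48_000, 91_000, 55_000, np.nan],   # person 5 left income blank
})

# Share of respondents missing the income item:
print(responses["income"].isna().mean())  # 0.25
```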
Why Response Rates Are Misleading
It’s tempting to judge a survey’s quality by its response rate. A 70% response rate sounds more trustworthy than a 30% one. But decades of research show the relationship between response rates and actual bias is weak and inconsistent. A 20-year analysis of pediatrician surveys found that response rates declined steadily over time (averaging 56.2% overall), and while lower response rates were associated with age-related bias, the connection wasn’t straightforward. Offering a $2 incentive boosted response rates by nearly 9 percentage points but simultaneously shifted the respondent pool toward older pediatricians and away from female ones, introducing new demographic imbalances.
The lesson: a higher response rate doesn’t automatically mean less bias. What matters is whether the people who respond look like the people who don’t on the characteristics that matter most to the study.
A Real Example From Health Research
The Health Information National Trends Survey (HINTS), a major U.S. survey tracking how people seek and use health information, illustrates how nonresponse bias plays out in practice. When researchers compared HINTS estimates to benchmarks from other national surveys, they found that HINTS consistently overestimated how often people searched for health information. The reason: people who were already interested in health topics were more likely to complete a survey about health topics.
HINTS also reported that the percentage of people in “good” or “excellent” health was 11 percentage points lower than the same estimate from the National Health Interview Survey. Older individuals, less healthy individuals, and people actively looking for cancer information were overrepresented among HINTS respondents. A “level of effort” analysis, which tracks how answers change as harder-to-reach people are brought in, confirmed the pattern. Later responders were less likely to have searched for health information in the past year, suggesting the final published estimate was inflated by two to three percentage points.
In a COVID-19 prevalence study in England, researchers found that their survey was under-representing unvaccinated people and those in areas where COVID-19 was most common. The result: population prevalence was likely being underestimated, even after statistical adjustments.
How Researchers Detect It
You can’t directly measure what non-responders would have said, so researchers use indirect methods to look for warning signs.
- Wave analysis. This approach divides respondents into groups based on how many reminders they needed before participating. The theory is that people who respond only after several nudges are more similar to non-responders than people who answered right away. If answers shift meaningfully across waves (for instance, if later waves are increasingly male or increasingly unvaccinated), that’s a signal the survey may be missing a specific type of person. A minimal sketch of this check appears after the list.
- Benchmark comparisons. Researchers compare survey estimates against known population values from census data or other large, well-established surveys. If a health survey’s respondents report college degrees at a rate 15 points higher than the census, nonresponse bias is a likely explanation.
- R-indicators. These are statistical measures that quantify how much respondents and non-respondents differ from each other based on characteristics available for both groups (like age, geography, or other administrative data). R-indicators go beyond raw response rates to give a more nuanced picture of whether the responding sample actually represents the target population.
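Here is the wave-analysis sketch referenced above, in Python with entirely made-up data; the column names `reminders_needed` and `searched_health_info` are assumptions for illustration:

```python
import pandas as pd

# Minimal wave-analysis sketch. `reminders_needed` counts the nudges
# each person received before responding; `searched_health_info` is
# the survey outcome of interest. All values are invented.
respondents = pd.DataFrame({
    "reminders_needed":     [0, 0, 0, 1, 1, 2, 2, 3],
    "searched_health_info": [1, 1, 1, 1, 0, 1, 0, 0],
})

# Mean outcome by wave: early responders vs. those who needed chasing.
waves = respondents.groupby("reminders_needed")["searched_health_info"].mean()
print(waves)
# A steady drift across waves (here, later responders searching less
# often) suggests non-responders would pull the estimate lower still.
```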
Reducing Bias Before Data Collection
The most effective time to address nonresponse bias is before it happens, during survey design. A large series of experiments conducted during the UK’s REACT-1 COVID-19 study tested multiple strategies head to head. Changes to letter wording, timing, and number of reminders made only limited differences. Financial incentives, on the other hand, increased participation by up to 22.3 percentage points.
More importantly, incentives didn’t just boost the overall count. They disproportionately pulled in the people most likely to be missing: younger adults, people in more deprived areas, and unvaccinated individuals. After incentives were introduced, vaccination rates among respondents more closely matched the general population. This suggests that targeted incentives, offered selectively to groups with low response propensity, can directly reduce the systematic differences that cause nonresponse bias rather than simply inflating the response rate.
Correcting for Bias After the Fact
When prevention isn’t enough, researchers adjust the data statistically. The core idea behind most corrections is weighting: giving more influence to responses from underrepresented groups and less to overrepresented ones.
The simplest version divides respondents into subgroups based on characteristics like age, sex, or region, then weights each person’s response by the inverse of their subgroup’s response rate. If young men responded at half the rate of older women, each young man’s response counts twice as much in the final estimate. More sophisticated approaches use logistic regression to estimate each person’s probability of responding based on multiple characteristics simultaneously, then weight by the inverse of that probability. Other techniques, like calibration and raking, adjust the weights so that the final weighted sample matches known population totals across several dimensions at once.
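As a concrete illustration of the simplest version, here is a minimal Python sketch of subgroup weighting with invented data; it reproduces the count-twice logic from the example above:

```python
import pandas as pd

# Sketch of the subgroup weighting adjustment just described, using
# invented data that mirrors the young-men example. `sample` lists
# everyone drawn into the sample, with a 0/1 `responded` flag; the
# column names are illustrative.
sample = pd.DataFrame({
    "group":     ["young_men"] * 4 + ["older_women"] * 4,
    "responded": [1, 0, 0, 1, 1, 1, 1, 1],
    "outcome":   [3.0, None, None, 5.0, 4.0, 4.5, 5.0, 4.5],
})

# Response rate per subgroup: young men 50%, older women 100%.
rates = sample.groupby("group")["responded"].mean()

# Weight each respondent by the inverse of their subgroup's response
# rate, so each young man's answer counts twice as much.
respondents = sample[sample["responded"] == 1].copy()
respondents["weight"] = 1.0 / respondents["group"].map(rates)

weighted = (respondents["outcome"] * respondents["weight"]).sum() / respondents["weight"].sum()
print(weighted)  # 4.25, vs. an unweighted respondent mean of about 4.33
```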
These corrections help, but they have limits. They can only adjust for characteristics the researcher knows about and has data on. If non-responders differ from responders on unmeasured factors (like attitudes, health behaviors, or experiences the survey was specifically designed to capture), no amount of demographic weighting will fully fix the problem. This is why nonresponse bias remains one of the most persistent challenges in survey research, alongside measurement error, coverage error, and processing errors.