An underpowered study has not gathered enough data to reliably detect a true biological or medical effect. In practice, this usually means the sample size is too small for the research question. Such a study lacks the sensitivity to find an effect, even if that effect genuinely exists in the broader population. Statistical power is therefore central to judging the trustworthiness of any scientific claim. When a study is underpowered, its findings, whether positive or negative, must be viewed with suspicion because they do not provide a firm basis for scientific conclusions.
Understanding Statistical Power and Study Errors
Statistical power is the probability that a study will correctly find an effect when that effect is truly present in the population being studied. Researchers typically aim for a power level of 80% or higher, meaning there is at least an 80% chance of detecting a real phenomenon. Power is the complement of the Type II error rate: power = 1 − β, where β is the probability of a Type II error, a false negative that occurs when a study fails to find a real effect.
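To make these quantities concrete, here is a minimal sketch using the statsmodels library's TTestIndPower class to compute the power of a two-sample t-test; the effect size (Cohen's d = 0.5) and per-group sample size (64) are illustrative assumptions, not values from any particular study.

```python
# Minimal sketch: power and the Type II error rate for a two-sample
# t-test. The effect size (d = 0.5) and per-group size (64) are
# assumed purely for illustration.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
power = analysis.power(effect_size=0.5, nobs1=64, alpha=0.05)
beta = 1 - power  # probability of a Type II error (false negative)
print(f"Power: {power:.2f}, Type II error rate: {beta:.2f}")  # ~0.80, ~0.20
```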
The two main types of errors in hypothesis testing are Type I and Type II errors. A Type I error, or false positive, occurs when a study claims to have found an effect that does not actually exist; this is akin to a medical test diagnosing a healthy person as sick. Most research caps the acceptable risk of this error, known as the significance level (alpha), at 5% or less.
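A quick simulation illustrates what that 5% cap means in practice. In this sketch (using numpy and scipy; the group size and number of simulated studies are arbitrary choices), both groups are drawn from the same population, so every "significant" result is a false positive:

```python
# Minimal sketch: when no effect exists, roughly 5% of studies still
# reach p < 0.05 by chance alone. Group size and number of simulated
# studies are arbitrary illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_per_group = 10_000, 30
false_positives = 0
for _ in range(n_studies):
    a = rng.normal(0.0, 1.0, n_per_group)  # both groups drawn from the
    b = rng.normal(0.0, 1.0, n_per_group)  # same population: no real effect
    if stats.ttest_ind(a, b).pvalue < 0.05:
        false_positives += 1
print(f"False positive rate: {false_positives / n_studies:.3f}")  # ~0.05
```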
An underpowered study is primarily characterized by an increased risk of a Type II error (false negative). For example, if a study has only 30% power, there is a 70% chance it will miss a real effect, even if the treatment is effective. This is problematic in early-stage research, where genuine medical breakthroughs could be overlooked due to a lack of statistical sensitivity.
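The same simulation approach shows how often an underpowered study misses a genuine effect. In this sketch, a real effect of Cohen's d = 0.5 studied with only 18 participants per group is detected roughly 30% of the time; all parameter values are illustrative:

```python
# Minimal sketch: a real effect (Cohen's d = 0.5) studied with only
# 18 participants per group yields roughly 30% power, so about 70%
# of such studies miss the effect. Parameters are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_studies, n_per_group, true_effect = 10_000, 18, 0.5
detected = 0
for _ in range(n_studies):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(true_effect, 1.0, n_per_group)  # effect exists
    if stats.ttest_ind(treated, control).pvalue < 0.05:
        detected += 1
print(f"Detection rate (empirical power): {detected / n_studies:.2f}")  # ~0.30
```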
Factors That Lead to Underpowered Research
For a given significance level, the power of a study is determined by three primary factors: sample size, effect size, and data variability. Researchers should conduct a power analysis before starting a study to calculate the minimum sample size required to achieve the target power level. Insufficient sample size is the most frequent cause of an underpowered study, as smaller samples leave more room for random variation, making it difficult to distinguish a true signal from noise.
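As a sketch of what such a power analysis looks like, the following uses statsmodels' solve_power to find the per-group sample size needed for 80% power, assuming an anticipated effect size of d = 0.5 (a planning estimate, not a known quantity):

```python
# Minimal sketch of a pre-study power analysis: solving for the
# per-group sample size that achieves the target power. The assumed
# effect size (d = 0.5) is a planning estimate.
from statsmodels.stats.power import TTestIndPower

n_required = TTestIndPower().solve_power(
    effect_size=0.5,  # anticipated Cohen's d
    power=0.80,       # target power
    alpha=0.05,       # significance level
)
print(f"Minimum participants per group: {n_required:.0f}")  # ~64
```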
The expected effect size measures the magnitude of the difference or relationship the researchers hope to find. If the true effect is small, a much larger sample is required to reliably detect it than if the effect is large. Overestimating the effect size during planning therefore often produces a study that is statistically incapable of finding the real, more modest result.
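The relationship between effect size and required sample size can be made concrete with the same tool. This sketch uses Cohen's conventional benchmarks (0.8 large, 0.5 medium, 0.2 small) purely for illustration:

```python
# Minimal sketch: how the required sample size grows as the true
# effect shrinks (80% power, alpha = 0.05; effect sizes are Cohen's
# conventional benchmarks, used here for illustration).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.8, 0.5, 0.2):  # large, medium, small
    n = analysis.solve_power(effect_size=d, power=0.80, alpha=0.05)
    print(f"d = {d}: ~{n:.0f} participants per group")
# d = 0.8 needs ~26 per group; d = 0.2 needs ~394 per group.
```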
High data variability also significantly reduces statistical power. Variability refers to how inconsistent the measurements are within the study groups. If the data is “noisy,” the true effect is easily obscured, necessitating a larger sample size to overcome the inconsistency.
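A short sketch shows the mechanism: the same raw difference between group means becomes a smaller standardized effect as the standard deviation grows, which in turn inflates the required sample size (the 5-point difference and the standard deviations are hypothetical values):

```python
# Minimal sketch: the same raw mean difference becomes a smaller
# standardized effect (Cohen's d) as variability grows, inflating
# the required sample size. All values are hypothetical.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
raw_difference = 5.0  # e.g., a 5-point difference between group means
for sd in (5.0, 10.0):       # less noisy vs. more noisy measurements
    d = raw_difference / sd  # Cohen's d = mean difference / SD
    n = analysis.solve_power(effect_size=d, power=0.80, alpha=0.05)
    print(f"SD = {sd}: d = {d:.2f}, ~{n:.0f} per group")
# Doubling the SD halves d and roughly quadruples the required n.
```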
Why Underpowered Studies Produce Unreliable Results
The most immediate consequence of low power is the increased chance of a false negative, leading to true medical associations being missed or dismissed. When a study reports a negative finding, a beneficial treatment may have gone undetected because the study lacked the necessary sensitivity. This wastes resources by prematurely abandoning promising lines of inquiry.
Studies that find a statistically significant result despite being underpowered often suffer from the “winner’s curse.” This means the reported effect size is likely an exaggeration of the true magnitude. Since only studies that observed an unusually large effect manage to achieve significance, their published results present a distorted, inflated view of the treatment’s benefit.
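A simulation makes the winner's curse visible. In this sketch, the true effect is d = 0.3 but each study enrolls only 20 participants per group (both values assumed for illustration); among the minority of studies that reach significance, the average observed effect is far larger than the truth:

```python
# Minimal sketch of the "winner's curse": among underpowered studies
# that reach significance, the average observed effect exaggerates
# the true effect. True d = 0.3 and n = 20 per group are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
true_d, n = 0.3, 20
significant_effects = []
for _ in range(20_000):
    control = rng.normal(0.0, 1.0, n)
    treated = rng.normal(true_d, 1.0, n)
    if stats.ttest_ind(treated, control).pvalue < 0.05:
        pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
        significant_effects.append((treated.mean() - control.mean()) / pooled_sd)
print(f"True d: {true_d}, mean d among significant studies: "
      f"{np.mean(significant_effects):.2f}")  # ~0.7: more than double
```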
This overestimation of effect size severely hinders the reproducibility of the findings. Researchers attempting to replicate the study may use the exaggerated effect size to plan their power analysis, resulting in a study that is too small. This perpetuates a cycle of low power, making the original finding difficult to confirm and contributing to a lack of confidence in the scientific literature.
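This replication trap can be quantified with the same power tools. In the hypothetical scenario below, a replication is planned around an inflated published effect of d = 0.7 when the true effect is only d = 0.3:

```python
# Minimal sketch: planning a replication from an inflated published
# effect (d = 0.7) when the true effect is smaller (d = 0.3) yields
# a sample far too small, so the replication is itself underpowered.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_planned = analysis.solve_power(effect_size=0.7, power=0.80, alpha=0.05)
actual_power = analysis.power(effect_size=0.3, nobs1=n_planned, alpha=0.05)
print(f"Planned n per group: {n_planned:.0f}")              # ~33
print(f"Actual power at true d = 0.3: {actual_power:.2f}")  # ~0.23, far below 0.80
```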
Interpreting Findings from Underpowered Research
When reviewing an underpowered study, the reader should exercise caution, especially with negative results. A failure to find an effect should not be taken as definitive proof that the effect does not exist. Instead, the finding should be interpreted as “the study was unable to detect a statistically significant effect,” leaving open the possibility of a true, undetected association. The absence of evidence should not be confused with evidence of absence.
Conversely, positive results from an underpowered study should be viewed as preliminary and requiring confirmation. If a small study reports a large, significant effect, it is prudent to suspect the effect size may be exaggerated due to the “winner’s curse.” The reported finding is a signal, not a final conclusion, and should be treated as a hypothesis needing validation through larger, well-powered replication studies.
Members of the public reading scientific news should look for details about a study's sample size and any discussion of statistical power. If a report does not mention these elements, or if the sample size seems small for the complexity of the research question, the findings should be treated with skepticism. A study's reliability depends more on its statistical power than on whether it achieved a statistically significant result.

