A confidence interval is a range of values that estimates where the true answer for an entire population likely falls, based on data from a sample. Instead of giving you a single number as your best guess, it adds a buffer above and below that number to account for the uncertainty that comes with not measuring everyone. A 95% confidence interval, the most commonly reported type, means that if you repeated the same study or survey 100 times, about 95 of those intervals would contain the true value.
How a Confidence Interval Is Built
Every confidence interval has two components: a point estimate and a margin of error. The point estimate is the single best guess from your data, like the average height of people in a sample or the percentage of voters who support a candidate. The margin of error is the buffer zone added in both directions to account for the fact that your sample isn’t a perfect mirror of the whole population.
So if a poll finds that 52% of voters favor a candidate with a margin of error of plus or minus 3 percentage points, the confidence interval runs from 49% to 55%. That range is the full confidence interval. The “plus or minus 3%” you hear on the news is the margin of error, which is only half the picture.
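The arithmetic behind that poll number is simple enough to sketch in a few lines. The snippet below is an illustrative Python sketch using the standard normal approximation for a proportion; the function name `proportion_ci` and the poll figures (52% support from 1,000 respondents) are made up for the example.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    p_hat: sample proportion, n: sample size,
    z: critical value (1.96 corresponds to 95% confidence).
    """
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# A poll of 1,000 voters with 52% support yields roughly a +/-3 point margin:
low, high = proportion_ci(0.52, 1000)
print(f"{low:.3f} to {high:.3f}")  # prints 0.489 to 0.551
```

Note how a sample of 1,000 naturally produces the familiar "plus or minus 3 points" margin you see in news polls.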
What “95% Confidence” Actually Means
This is where most people, including many trained researchers, get tripped up. It’s tempting to read a 95% confidence interval of 0.1 to 0.4 and say “there’s a 95% probability the true value is in this range.” That sounds right, but it’s technically wrong. The true value is a fixed number. It’s either inside the interval or it isn’t.
What 95% confidence really describes is the reliability of the method, not any single result. The statistician Jerzy Neyman, who invented the concept in 1937, defined it this way: a confidence procedure has a 95% confidence level if, in repeated sampling, 95% of the intervals it generates would contain the true value. Think of it like a factory that builds nets. If the factory’s nets catch the fish 95% of the time across thousands of throws, any individual net is either going to catch the fish or miss it, but you can trust the factory’s track record.
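The factory's track record is easy to check with a quick simulation. This sketch (plain Python, illustrative numbers: a fixed true mean of 100, repeated samples of 50) builds thousands of 95% intervals and counts how many capture the true value; the hit rate lands close to 95%.

```python
import random
import statistics

random.seed(42)
TRUE_MEAN = 100        # the fixed "true value" each interval tries to capture
N, TRIALS = 50, 2000   # sample size and number of repeated studies

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, 15) for _ in range(N)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / N ** 0.5
    low, high = mean - 1.96 * se, mean + 1.96 * se
    if low <= TRUE_MEAN <= high:
        hits += 1

print(f"Coverage: {hits / TRIALS:.1%}")  # close to 95%
```

Any single interval in the loop either contains 100 or it doesn't; only the long-run coverage rate is (approximately) 95%.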
For everyday use, the practical difference matters less than you might think. You can reasonably treat the interval as a plausible range for the true value. Just remember that the “95%” refers to how often the procedure works over the long run, not to the probability that this particular interval hit its target.
Common Confidence Levels
While 95% is the standard, researchers sometimes use other levels depending on how much certainty they need. The three most common are:
- 90% confidence produces a narrower interval. You’re accepting a higher chance that the interval misses the true value in exchange for a more precise estimate.
- 95% confidence is the default in most scientific research, medical trials, and polling.
- 99% confidence produces a wider interval. You get more certainty that the true value is captured, but the range becomes less useful because it’s so broad.
The tradeoff is always the same: more confidence means a wider interval. You can be very sure the answer is “somewhere between 10 and 90,” but that’s not particularly helpful. Narrowing the range means accepting more risk that you’ve missed the mark.
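You can see the tradeoff numerically by computing the same interval at all three levels. This sketch reuses the illustrative poll numbers from earlier (52% support from 1,000 respondents) and looks up each critical value with Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

n, p_hat = 1000, 0.52
se = (p_hat * (1 - p_hat) / n) ** 0.5  # standard error of the proportion

for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf((1 + level) / 2)  # two-sided critical value
    print(f"{level:.0%}: {p_hat - z * se:.3f} to {p_hat + z * se:.3f}")
```

Running this, the 90% interval is the narrowest and the 99% interval the widest, with the same point estimate (0.52) sitting in the middle of each.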
What Makes an Interval Wider or Narrower
Three factors control how wide a confidence interval ends up being.
Sample size has the biggest practical impact. A larger sample gives you a better estimate of the population, which shrinks the interval. This is why national polls survey 1,000 or more people rather than 50. If you quadruple your sample size, you roughly cut the margin of error in half.
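That square-root relationship can be checked directly. The helper below is an illustrative sketch (the name `margin_of_error` is made up); it uses the worst-case proportion of 0.5, which is why quadrupling the sample cuts the margin exactly in half.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    # Half-width of a normal-approximation interval for a proportion.
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

m_small = margin_of_error(0.5, 250)
m_large = margin_of_error(0.5, 1000)   # 4x the sample size
print(f"{m_small / m_large:.2f}")       # prints 2.00: margin cut in half
```

The flip side is diminishing returns: going from 1,000 to 4,000 respondents buys the same halving as going from 250 to 1,000, at four times the cost.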
Variability in the data also matters. If the thing you’re measuring is all over the map (people’s incomes, for example, vary enormously), your interval will be wider than if you’re measuring something more uniform (like resting body temperature). More spread in the data means more uncertainty about where the true average sits.
The confidence level itself is the third factor. Bumping from 95% to 99% confidence widens the interval because you’re demanding a bigger safety net. Dropping to 90% narrows it.
How Confidence Intervals Connect to Statistical Significance
Confidence intervals and p-values are two ways of answering the same question: is the result meaningful, or could it be due to chance? A 95% confidence interval and a p-value of 0.05 are mathematically linked. If a 95% confidence interval for the difference between two groups doesn’t include zero (meaning “no difference”), the result is statistically significant at the 0.05 level. If the interval does include zero, it’s not.
Many researchers prefer reporting confidence intervals over p-values because intervals give you more information. A p-value tells you “yes, significant” or “no, not significant.” A confidence interval tells you the likely range of the effect, so you can judge whether it’s large enough to care about. A drug that lowers blood pressure by somewhere between 0.5 and 9 points tells a different story than one that lowers it by somewhere between 8 and 12 points, even if both results are statistically significant.
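The link between the two can be sketched with a simple two-sided z-test (a normal approximation; the helper name `diff_ci_and_p` and the numbers are made up for illustration). The interval excludes zero exactly when the p-value falls below 0.05.

```python
from statistics import NormalDist

def diff_ci_and_p(mean_diff, se, level=0.95):
    """95% CI for a difference plus the matching two-sided z-test p-value."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    ci = (mean_diff - z * se, mean_diff + z * se)
    p = 2 * (1 - NormalDist().cdf(abs(mean_diff) / se))
    return ci, p

ci, p = diff_ci_and_p(mean_diff=5.0, se=2.0)
excludes_zero = not (ci[0] <= 0 <= ci[1])
print(ci, p, excludes_zero)  # interval excludes zero, and p < 0.05
```

Try `mean_diff=3.0` with the same standard error: the interval then straddles zero and the p-value climbs above 0.05, so the two criteria always agree.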
Confidence Intervals in Medical Research
Clinical trials rely heavily on confidence intervals to communicate results. When a study compares a new treatment to an existing one, the results are typically reported as a difference with an interval around it. For example, a trial might report that treatment A improved response rates over treatment B by 0.6% to 9% (95% confidence interval). That interval tells clinicians not just that the treatment worked, but how much benefit to realistically expect.
When the interval is wide, as in a small trial that reported a difference ranging from negative 7% to positive 47%, the results are too uncertain to draw strong conclusions. The treatment might be substantially better, slightly worse, or anything in between. Wide intervals in medical research are a signal that more data is needed.
Confidence Intervals in Polls and Surveys
Every time you see a political poll reported as “48% plus or minus 3 points,” you’re looking at a confidence interval, typically at the 95% level. That margin of error means the polling method, if repeated many times with different random samples of the same size, would produce intervals that capture the true level of support about 95% of the time.
This is why close elections are hard to call from polls alone. If candidate A polls at 49% and candidate B at 51%, each within the other’s margin of error, the confidence intervals overlap and neither candidate can be declared the clear leader from that data alone. The poll isn’t saying the race is tied. It’s saying the sample isn’t large enough to distinguish between the two with confidence.
Small Samples Change the Math
When your sample is small (generally fewer than 30 observations) and you don’t know the true variability of the population, the standard approach to calculating intervals doesn’t quite work. Instead, statisticians use a slightly different distribution, the t-distribution, which has fatter tails: it builds in extra uncertainty to compensate for the small amount of data. As sample size grows, this adjustment becomes negligible, and the two methods produce virtually identical intervals. For most people reading research, the key takeaway is simpler: small samples produce wide, uncertain intervals, and large samples produce narrow, more trustworthy ones.
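The adjustment is just a larger critical value. In the sketch below, the sample data are made up (n = 10), and because Python's standard library doesn't provide t critical values, the value 2.262 for 9 degrees of freedom is hardcoded; in practice you would look it up in a table or use a statistics library such as SciPy (`scipy.stats.t.ppf`).

```python
import math
import statistics

# Hypothetical small sample (n = 10)
sample = [4.1, 3.8, 5.0, 4.4, 4.9, 3.7, 4.2, 4.6, 4.0, 4.3]
n = len(sample)
mean = statistics.fmean(sample)
se = statistics.stdev(sample) / math.sqrt(n)

z = 1.96    # large-sample (normal) critical value
t = 2.262   # t critical value for 9 degrees of freedom at 95% confidence

print(f"z interval: {mean - z * se:.2f} to {mean + z * se:.2f}")
print(f"t interval: {mean - t * se:.2f} to {mean + t * se:.2f}")  # wider
```

With only 10 observations the t interval is noticeably wider than the normal one; by n = 100 or so the two critical values nearly coincide.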

