What Is the Law of Small Numbers? The Bias Explained

The law of small numbers is a cognitive bias where people mistakenly believe that a small sample of data will closely mirror the larger population it came from. Psychologists Amos Tversky and Daniel Kahneman coined the term in a 1971 paper, showing that even trained scientists fall into this trap: they expect patterns in tiny datasets that only reliably appear in large ones. The name is a deliberate play on the “law of large numbers,” a genuine mathematical theorem stating that sample averages do converge on the true population value, but only as the sample size grows large.
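
You can watch that convergence happen in a simulation. Here is a minimal sketch in Python with NumPy (the language and library are illustrative choices, not anything prescribed here) that tracks the running proportion of heads across a long series of fair coin flips:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Flip a fair coin 100,000 times and track the running proportion of heads.
flips = rng.integers(0, 2, size=100_000)  # 1 = heads, 0 = tails
running_mean = np.cumsum(flips) / np.arange(1, len(flips) + 1)

for n in (10, 100, 1_000, 100_000):
    print(f"after {n:>7,} flips: proportion of heads = {running_mean[n - 1]:.4f}")
```

After ten flips the proportion can sit far from 50%; after a hundred thousand it hugs the true value. The gap between those two regimes is exactly what the bias ignores.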

How the Bias Works

Your brain uses shortcuts to judge probability, and one of the most powerful is the representativeness heuristic. You assess how likely something is by comparing it to a mental prototype of what that thing “should” look like. When applied to random data, this shortcut makes you expect small samples to be miniature copies of the whole population. A fair coin should land on heads 50% of the time, so after seeing three heads in a row, you feel like tails is “due.” That feeling is the law of small numbers at work.

Tversky and Kahneman identified several specific errors that flow from this single heuristic: insensitivity to sample size (treating 10 observations with the same confidence as 10,000), misconception of chance (expecting random sequences to “look” random even in short stretches), and the illusion of validity (feeling confident in a pattern spotted in limited data). Each of these is a different flavor of the same core mistake.
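
Insensitivity to sample size is the easiest of the three to demonstrate. The sketch below (again Python with NumPy, an illustrative choice) simulates thousands of polls of a fair coin at two sample sizes and measures how often the observed proportion lands far from the true 50%:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Simulate 10,000 polls of a fair coin at each sample size and see how
# far the observed proportion of heads strays from the true 50%.
for n in (10, 10_000):
    proportions = rng.binomial(n, 0.5, size=10_000) / n
    off_by_20_pts = (np.abs(proportions - 0.5) >= 0.2).mean()
    print(f"n = {n:>6}: std of proportions = {proportions.std():.3f}, "
          f"share off by 20+ points = {off_by_20_pts:.1%}")
```

At 10 observations, results that miss the truth by 20 points or more are routine; at 10,000, they essentially never happen. Treating the two with equal confidence is the bias in action.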

The Gambler’s Fallacy Connection

The gambler’s fallacy is one of the most familiar consequences of believing in the law of small numbers. After watching a roulette wheel land on red five times in a row, most people feel strongly that black is coming next. The reasoning is that a short run of outcomes “should” balance out to reflect the nearly even odds of red and black. But the wheel has no memory. Each spin is independent, and the probability of black on the next spin is exactly what it always is: 18/37 on a European wheel, a shade under half because of the green zero.
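
You can check the wheel’s lack of memory directly. The sketch below simulates a European wheel (18 red, 18 black, and a single green zero; the choice of wheel is an assumption, since American wheels have two zeros) and looks at the spin immediately following five reds in a row:

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# A European wheel: 18 red, 18 black, 1 green zero.
spins = rng.choice(["red", "black", "green"], size=1_000_000,
                   p=[18/37, 18/37, 1/37])
is_red = spins == "red"

# Collect the outcome of every spin that follows five reds in a row.
followers = [spins[i + 5] for i in range(len(spins) - 5)
             if is_red[i:i + 5].all()]
share_black = np.mean([s == "black" for s in followers])

print(f"streaks of five reds found: {len(followers):,}")
print(f"P(black | five reds) = {share_black:.3f}   (unconditional: {18/37:.3f})")
```

Conditioning on the streak changes nothing: black still shows up on about 18/37 of the following spins.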

Tversky and Kahneman explained this by noting that people expect short sequences to share the statistical properties of long sequences. When asked to write down a fake series of coin flips, most people switch between heads and tails far more often than real randomness would. They keep the proportion close to 50/50 in every short stretch, because a run of five heads “doesn’t look random” to them, even though real coins produce streaks like that regularly.
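
That intuition is easy to test against real randomness. The sketch below generates genuine random sequences of 100 flips (the length is arbitrary, chosen for illustration) and counts how many contain a run of five or more identical outcomes:

```python
import numpy as np

rng = np.random.default_rng(seed=3)

def longest_run(flips):
    """Length of the longest run of identical outcomes in a sequence."""
    longest = current = 1
    for prev, cur in zip(flips, flips[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest

# How many genuine 100-flip sequences contain a run of 5 or more?
trials = 10_000
with_streak = sum(longest_run(rng.integers(0, 2, size=100)) >= 5
                  for _ in range(trials))
print(f"sequences containing a run of 5+: {with_streak / trials:.1%}")
```

As the printed percentage shows, the large majority of genuinely random sequences contain exactly the kind of streak that people leave out of their fakes.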

The Hot Hand in Basketball

For decades, the “hot hand” in basketball was considered a textbook example of the law of small numbers. A 1985 study by Thomas Gilovich, Robert Vallone, and Amos Tversky found no statistical evidence that players who had just made several shots in a row were more likely to make the next one. Fans and players, the argument went, were seeing patterns in small streaks that didn’t exist.

That story got more complicated. A 2018 paper in Econometrica by Joshua Miller and Adam Sanjurjo proved that the original study’s statistical method contained a subtle but substantial bias. When you select which shots to count based on streaks (looking at what happens after a player makes three shots in a row, for example), the math itself introduces a downward bias that hides real effects. The bias gets worse with shorter sequences and longer streaks, exactly the conditions of the original study. After correcting for it, the researchers found evidence that the hot hand may be real after all. The irony is rich: the very paper used for decades to illustrate how people see patterns in small samples may itself have been fooled by small samples.
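
The selection bias is easy to reproduce in a toy model. The sketch below is a simplified stand-in for the paper’s analysis, not a reconstruction of it: it deals fair coin flips, then averages, within each sequence, the proportion of heads that immediately follow three heads in a row, mimicking how streak-conditioned shooting percentages are computed:

```python
import numpy as np

rng = np.random.default_rng(seed=4)

def heads_after_streak(flips, k=3):
    """Proportion of heads on flips immediately following k heads in a row,
    or None if the sequence contains no such flips."""
    followers = [flips[i + k] for i in range(len(flips) - k)
                 if flips[i:i + k].all()]
    return np.mean(followers) if followers else None

# Average the per-sequence proportion across many short sequences, the
# same way streak-conditioned shooting percentages are averaged per player.
props = [heads_after_streak(rng.integers(0, 2, size=100))
         for _ in range(10_000)]
props = [p for p in props if p is not None]
print(f"mean share of heads after three heads: {np.mean(props):.3f} (fair coin!)")
```

Even though every flip is an independent 50/50 event, the streak-conditioned average lands noticeably below 0.5. That shortfall is the downward bias, and a modest real hot-hand effect could easily be hidden by it.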

Why Small Numbers Mislead in Health Data

Small populations create statistical mirages that can cause real public alarm. The Washington State Department of Health has documented how disease rates in small towns fluctuate wildly from year to year, not because anything changed in the environment, but because a single case or two can double or triple the rate when the population is tiny. A rural county with 5,000 people that sees 3 cancer cases one year and 6 the next has technically experienced a 100% increase, but the numbers are so small that the jump is meaningless noise.
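
A quick simulation shows how noisy those rates are. The sketch below draws ten years of case counts for a hypothetical county of 5,000 people with a fixed underlying rate (the rate is illustrative, not real epidemiological data):

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# A county of 5,000 people with a fixed rate of 0.9 cases per 1,000
# per year (an illustrative figure), so about 4.5 expected cases/year.
expected_cases = 5_000 * 0.9 / 1_000
counts = rng.poisson(expected_cases, size=10)

print("yearly case counts:  ", counts.tolist())
changes = [f"{(b - a) / a:+.0%}" if a else "n/a"
           for a, b in zip(counts, counts[1:])]
print("year-over-year change:", changes)
```

Nothing about the county changes from year to year, yet swings of 50% or 100% show up routinely.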

This is why “cancer cluster” investigations so often turn up empty. Residents and local officials see a rate that looks alarming compared to the national average, but the apparent spike is exactly what you’d expect from random variation in a small group. The law of small numbers bias makes these clusters feel meaningful because our brains assume the small local sample should match the larger population rate. When it doesn’t, we look for a cause, even when the real explanation is just math.

The Replication Crisis in Science

The law of small numbers isn’t just a problem for gamblers and basketball fans. It sits at the heart of one of modern science’s biggest embarrassments: the replication crisis. Large-scale projects attempting to reproduce published findings in psychology and social science have found that roughly 40% to 60% of studies fail to replicate. One major replication effort covering 21 social science studies found that the average effect size in the replications was barely half the size originally reported.

Low statistical power is a central culprit. When researchers run studies with too few participants, they need to get lucky to find a statistically significant result. The ones who do get lucky publish their findings. The ones who don’t file their results away. This filtering process, combined with practices like tweaking analyses until something crosses the significance threshold, means the published literature overrepresents fluky results from small samples. Researchers, like everyone else, fall prey to the belief that their small study sample is representative enough to draw firm conclusions.
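
The filtering effect can be simulated directly. The sketch below (Python with NumPy and SciPy, assumed for illustration) runs thousands of underpowered two-group studies of a small but real effect and “publishes” only the ones that cross p < 0.05:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=6)

# Thousands of small two-group studies of a real but modest effect
# (true difference 0.2, noise sd 1.0, only 20 subjects per group).
true_effect, n_per_group, n_studies = 0.2, 20, 10_000
published = []
for _ in range(n_studies):
    control = rng.normal(0.0, 1.0, size=n_per_group)
    treated = rng.normal(true_effect, 1.0, size=n_per_group)
    _, p_value = stats.ttest_ind(treated, control)
    if p_value < 0.05:  # only "significant" results get written up
        published.append(treated.mean() - control.mean())

print(f"share of studies published: {len(published) / n_studies:.1%}")
print(f"true effect: {true_effect}, mean published effect: {np.mean(published):.2f}")
```

The published studies are the lucky ones, so their average effect size lands far above the truth, consistent with replications recovering only a fraction of the originally reported effects.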

How Data Scientists Handle Small Samples

In machine learning and predictive modeling, small datasets create a specific technical problem called overfitting. A model trained on too little data will learn the noise and quirks of that particular sample rather than the real underlying patterns. It performs beautifully on the training data and poorly on everything else.
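
Overfitting is easy to produce on demand. The sketch below fits two polynomials (degree 1 and degree 9, arbitrary choices for illustration) to ten noisy points drawn from a simple linear trend, then scores both on fresh data:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Ten noisy training points from a simple linear trend, y = 2x + noise.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.3, size=10)
x_test = np.linspace(0, 1, 200)
y_test = 2 * x_test  # noise-free truth, for scoring generalization

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The degree-9 curve passes almost exactly through every training point and fails badly on new data; the straight line fits the training set worse but generalizes far better.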

Practitioners counter this in several ways. The most straightforward is simply collecting more data. When that’s not possible, cross-validation helps: the dataset is split into multiple parts, and the model is trained and tested on different combinations of those parts to get a more honest estimate of how well it actually works. For very small datasets, nested cross-validation goes a step further, tuning the model on inner folds and scoring it on outer folds, so that no scarce data has to be permanently locked away in a separate holdout set. Another strategy is deliberately using simpler models. A model with fewer moving parts is less likely to latch onto random noise, which makes it more accurate when data is scarce, even though it captures less detail.
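
Here is what plain k-fold cross-validation looks like in code, sketched with scikit-learn (an assumed dependency chosen for illustration) on a small synthetic dataset:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(seed=8)

# A small synthetic dataset: 30 samples, 5 features, linear signal + noise.
X = rng.normal(size=(30, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + rng.normal(0, 0.5, size=30)

# 5-fold cross-validation: train on four folds, score on the held-out
# fold, rotate through all five, and average the scores.
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5, scoring="r2")
print("per-fold R^2:", np.round(scores, 2))
print(f"mean R^2: {scores.mean():.2f} (+/- {scores.std():.2f})")
```

Averaging across folds gives an estimate that no single lucky split can inflate, and the spread between folds is itself a warning about how little a small dataset pins down.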

A Different “Law of Small Numbers” in Math

There’s a second, unrelated use of the phrase in pure mathematics. The Poisson law of small numbers, sometimes called the law of rare events, describes what happens when you have a very large number of opportunities for something to happen, but the probability on each opportunity is very small. Under those conditions, the total count of events approximately follows a Poisson distribution, regardless of the fine details of the process. This is why the number of typos per page, the number of car accidents at an intersection per month, or the number of radioactive decays per second all follow the same type of statistical distribution. If you searched specifically for this mathematical concept, it’s worth knowing it shares a name with the cognitive bias but describes something entirely different.
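
The convergence is easy to verify numerically. The sketch below compares counts drawn from a binomial distribution with many trials and a tiny per-trial probability (the figures are invented for illustration) against a Poisson distribution with the same mean:

```python
import numpy as np

rng = np.random.default_rng(seed=9)

# Many opportunities, each with a tiny probability: 50,000 characters
# per page, each with a 1-in-10,000 chance of a typo (expected count: 5).
n, p = 50_000, 0.0001
binomial_counts = rng.binomial(n, p, size=100_000)
poisson_counts = rng.poisson(n * p, size=100_000)

for k in range(9):
    b = (binomial_counts == k).mean()
    q = (poisson_counts == k).mean()
    print(f"P({k} typos): binomial {b:.4f} vs Poisson {q:.4f}")
```

The two frequency tables come out nearly indistinguishable, which is the law of rare events doing its work.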