What Is a Null Hypothesis? Definition and Examples

A null hypothesis is a starting assumption that there is no effect, no difference, or no relationship between the things you’re studying. It’s the default position in any statistical test: nothing interesting is happening until the data proves otherwise. Every experiment has one, whether the researchers are testing a new drug, comparing teaching methods, or measuring whether a fertilizer helps plants grow.

The concept works like a courtroom. Just as a defendant is presumed innocent until proven guilty, the null hypothesis presumes “no effect” until the evidence is strong enough to overturn that presumption.

How the Null Hypothesis Works

The null hypothesis (written as H₀) always contains an equality statement. It says two groups are the same, a treatment has zero effect, or a variable has no relationship with an outcome. In a clinical trial comparing a new cancer drug to the standard treatment, the null hypothesis would be: “There is no difference in cure rates between the two drugs.” Mathematically, that’s expressed as the difference between the two groups equaling zero.

Paired against it is the alternative hypothesis (H₁ or Hₐ), which is what the researcher actually believes or hopes to demonstrate. Counterintuitively, what you’re trying to prove gets the label “alternative,” while the opposite gets treated as the default. The alternative hypothesis in that same drug trial would be: “There is a difference in cure rates between the two drugs.”

The entire experiment then becomes a question of evidence. You collect data and run a statistical test to see whether the results are unusual enough to reject the null hypothesis. If they are, you reject the null and treat the data as evidence of a real effect. If they aren’t, the null hypothesis stands.
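
To make the mechanics concrete, here is a minimal sketch of such a test in Python: a two-sided z-test comparing two cure rates. The patient counts are invented for illustration, and the normal approximation is an assumption of the sketch, not something stated in the trial described above.

```python
from math import sqrt, erf

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions.
    H0: the two underlying cure rates are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # best estimate assuming H0 is true
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# hypothetical trial: 60 of 100 cured on the new drug, 45 of 100 on the standard one
z, p = two_proportion_z_test(60, 100, 45, 100)
print(round(p, 3), "reject H0" if p <= 0.05 else "fail to reject H0")
```

With these made-up numbers the p-value lands around 0.03, so the null would be rejected; change the 60 to, say, 50 and the same code fails to reject.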

The Role of the P-Value

Researchers decide whether to reject the null hypothesis using something called a p-value. This number represents the probability of getting results as extreme as (or more extreme than) what was observed, assuming the null hypothesis is actually true. A small p-value means the data would be very unlikely if there truly were no effect, which counts as evidence against the null.

Before running the experiment, researchers set a threshold called the significance level, usually 0.05 (or 5%). If the p-value comes in at or below that threshold, the null hypothesis is rejected. If the p-value is above it, the null hypothesis is not rejected. The 0.05 cutoff isn’t magic. It traces back to the 1920s, when the statistician Ronald Fisher needed a practical benchmark. The value 0.05 happens to correspond roughly to the probability of a result falling more than two standard deviations from the average in a normal distribution, which made it convenient for hand calculations in an era before computers.
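
That “roughly two standard deviations” claim is easy to verify numerically. A short sketch using the standard normal CDF, which Python exposes through `math.erf`:

```python
from math import erf, sqrt

def two_sided_tail(z):
    """Probability of a result falling more than z standard deviations
    from the mean, in either direction, under a normal distribution."""
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

print(round(two_sided_tail(1.96), 4))  # ≈ 0.05, the conventional cutoff
print(round(two_sided_tail(2.0), 4))   # ≈ 0.0455, just past two standard deviations
```

The exact z-value matching 0.05 is about 1.96; Fisher’s “about two standard deviations” was a deliberate round-number convenience.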

Why You “Fail to Reject” Instead of “Accept”

One of the most important (and most misunderstood) points in statistics: when the data doesn’t meet the threshold, you say you “failed to reject” the null hypothesis. You never say you “accepted” it. The distinction matters because a statistical test can never prove that there is truly zero effect. To prove a value is exactly zero, you would need perfectly unbiased data with infinite precision. That’s impossible in practice.

A non-significant result might mean the null hypothesis is correct, or it might simply mean your study didn’t have enough participants or enough statistical power to detect a real effect. This is the origin of the phrase “absence of evidence is not evidence of absence.” Just because a study didn’t find a statistically significant difference doesn’t mean no meaningful difference exists in the real world.

Type I and Type II Errors

Because hypothesis testing relies on probability, mistakes are always possible. These fall into two categories:

  • Type I error (false positive): You reject the null hypothesis even though it’s actually true. In the courtroom analogy, this is convicting an innocent person. When the null hypothesis is true, the probability of this happening equals your significance level: at the standard 0.05 threshold, you have a 5% chance of a false positive on any single test. That probability climbs quickly when you run multiple tests. Running 20 independent tests at once pushes the chance of at least one false positive to roughly 64%.
  • Type II error (false negative): You fail to reject the null hypothesis even though it’s actually false. This is the equivalent of letting a guilty person go free. The probability of this error depends heavily on sample size and how large the true effect is. Bigger studies with larger effects are less likely to miss what’s really there.
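
The 64% figure follows from the multiplication rule: if each test independently has a 95% chance of avoiding a false positive, all 20 avoid one only 0.95²⁰ ≈ 36% of the time. A quick sketch, assuming the tests are independent and every null is actually true:

```python
def familywise_error(m, alpha=0.05):
    """Chance of at least one false positive across m independent tests,
    each run at significance level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** m

print(round(familywise_error(1), 2))   # 0.05 for a single test
print(round(familywise_error(20), 2))  # 0.64 for twenty tests at once
```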

Increasing your sample size is the most reliable way to reduce the risk of error. A larger sample shrinks the Type II error rate directly, and it gives you room to adopt a stricter significance level (cutting Type I risk) without sacrificing power. Larger samples more closely represent the true population, making it less likely that random variation will mislead the results.
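
One way to see why sample size matters: the standard error of a measured difference shrinks with the square root of the per-group sample size, so a real effect of fixed size stands out more clearly as n grows. A sketch under simplifying assumptions (two proportions near 0.5, normal approximation):

```python
from math import sqrt

def se_of_difference(n):
    """Standard error of the difference between two proportions near 0.5,
    with n participants per group (normal approximation)."""
    return sqrt(0.5 * 0.5 * (2 / n))

# quadrupling the per-group sample size halves the standard error
for n in (25, 100, 400, 1600):
    print(n, round(se_of_difference(n), 3))
```

At n = 25 per group, a true 10-point difference is smaller than one standard error and easy to miss; at n = 1600 it spans several standard errors and is hard to miss.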

A Concrete Example

Say a hospital wants to know whether giving patients fresher blood transfusions reduces mortality compared to standard-issue blood. The null hypothesis is: “There is no difference in mortality between patients receiving fresh red blood cells and those receiving standard-issue red blood cells.” The alternative hypothesis is: “There is a difference in mortality between the two groups.”

Researchers then enroll patients, randomly assign them to one group or the other, and compare outcomes. If the p-value comes back at 0.03, that’s below the 0.05 threshold, so they reject the null hypothesis and conclude there’s a statistically significant difference. If the p-value is 0.22, they fail to reject the null and cannot conclude that fresh blood makes a difference, at least not based on this data.
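
The decision step itself is mechanical. A two-line sketch, using the 0.05 threshold from the example (the function name is ours):

```python
def decide(p_value, alpha=0.05):
    """Reject H0 only when the p-value is at or below the significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.03))  # reject H0
print(decide(0.22))  # fail to reject H0
```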

Limitations Worth Knowing

The null hypothesis framework has drawn serious criticism in recent years. One major issue is that statistical significance was never meant to equal real-world importance. A drug trial with 100,000 participants might detect a statistically significant difference so tiny it has no practical clinical value. As sample sizes grow larger, even trivial differences become “significant” in the statistical sense. Conversely, small studies can miss effects that genuinely matter.
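
The large-sample effect is easy to demonstrate: hold a tiny difference fixed and watch the p-value collapse as n grows. A sketch using a normal approximation for two proportions near 0.5 (all numbers invented):

```python
from math import sqrt, erf

def p_value_for_diff(diff, n):
    """Two-sided p-value for a difference `diff` between two proportions
    near 0.5, with n participants per group (normal approximation)."""
    se = sqrt(0.5 * 0.5 * (2 / n))
    z = diff / se
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

# the same one-percentage-point difference, at two sample sizes
print(round(p_value_for_diff(0.01, 500), 3))    # well above 0.05: not significant
print(round(p_value_for_diff(0.01, 50000), 4))  # far below 0.05: "significant"
```

Whether a one-point difference in, say, cure rates matters clinically is a separate question that no p-value can answer.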

This has led to growing calls to move beyond the rigid p < 0.05 cutoff. In 2015, the journal Basic and Applied Social Psychology banned null hypothesis significance testing entirely, calling the 0.05 bar “too easy to pass” and “sometimes an excuse for lower quality research.” In 2019, The American Statistician published a special issue urging researchers to stop using the phrase “statistically significant” altogether, arguing it was never meant to imply that a finding was contextually important.

None of this means the null hypothesis is useless. It remains the backbone of how most scientific studies are designed and analyzed. But understanding its limits helps you read research results more critically, especially when a study claims to have found (or not found) an effect based on whether a single number crossed an arbitrary line.