What Is H0 and H1 in Hypothesis Testing?

H0 and H1 are the two competing statements in hypothesis testing, a core method in statistics. H0 (the null hypothesis) claims there is no effect, no difference, or no relationship. H1 (the alternative hypothesis) claims there is one. Every statistical test is essentially a structured way of deciding whether your data gives you enough reason to reject H0 in favor of H1.

Think of it like a courtroom. H0 is the assumption of innocence: nothing is happening until proven otherwise. H1 is the accusation: something real is going on. The entire process is built around trying to disprove H0, not around directly proving H1.

What H0 and H1 Actually State

The null hypothesis (H0) is the default position. It assumes the status quo: no change, no difference between groups, no relationship between variables. It’s what you’d expect if the thing you’re studying had zero effect. For example, if you’re testing whether a tutoring program improves test scores, H0 says the program makes no difference.

The alternative hypothesis (H1, sometimes written as Ha) is your actual research claim. It states what you expect the data to show based on your question. In the tutoring example, H1 would say the program does improve test scores. H1 is the reason you ran the study in the first place.

Here are a few paired examples to make the pattern clear:

  • H0: There is no difference in salary between male and female factory workers. H1: Male factory workers have a higher salary than female factory workers.
  • H0: There is no relationship between height and shoe size. H1: There is a positive relationship between height and shoe size.
  • H0: On-the-job experience has no impact on the quality of a brick mason’s work. H1: The quality of a brick mason’s work is influenced by on-the-job experience.

Notice that H0 always uses language like “no effect,” “no difference,” or “no relationship.” When written as math, H0 always contains some form of equality (=, ≥, or ≤), while H1 always contains the corresponding strict inequality (≠, >, or <).

How H0 and H1 Relate to Each Other

H0 and H1 are mutually exclusive, meaning only one can be true at a time. They’re also exhaustive, meaning together they cover every possible outcome. If H0 is wrong, H1 must be right, and vice versa. There’s no third option. This is what makes the framework work: by gathering evidence against H0, you’re automatically building a case for H1.

One important quirk of this system: data can provide evidence against H0 and in favor of H1, but never the reverse. You can reject H0 or fail to reject it. You never “accept” H0 or “prove” it true. Failing to reject H0 simply means you didn’t find strong enough evidence to rule it out.

How You Decide Between Them

The decision comes down to a number called the p-value. After running a statistical test, you get a p-value that tells you how likely results at least as extreme as yours would be if H0 were actually true. A very small p-value means your results would be extremely unlikely under H0, which is evidence that H0 is wrong.

Before running the test, you set a threshold called alpha (α). In most fields, the standard alpha is 0.05, meaning you’re willing to accept a 5% chance of rejecting H0 when it’s actually true. The decision rule is simple:

  • If the p-value is less than or equal to α, reject H0 in favor of H1.
  • If the p-value is greater than α, do not reject H0.
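The decision rule above can be sketched in a few lines of Python using SciPy’s two-sample t-test. The tutoring-study data here are invented purely for illustration:

```python
# A sketch of the alpha / p-value decision rule, using made-up scores
# from the tutoring example (tutored vs. untutored students).
from scipy import stats

control = [72, 68, 75, 70, 74, 69, 71, 73]
tutored = [78, 82, 75, 80, 77, 83, 79, 81]

alpha = 0.05  # threshold chosen BEFORE running the test
t_stat, p_value = stats.ttest_ind(tutored, control)

if p_value <= alpha:
    print(f"p = {p_value:.4f} <= {alpha}: reject H0 in favor of H1")
else:
    print(f"p = {p_value:.4f} > {alpha}: fail to reject H0")
```

Note that alpha is fixed before looking at the data; choosing it after seeing the p-value defeats the purpose of the threshold.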

Some fields are moving toward stricter thresholds like 0.01, 0.005, or even 0.001 to reduce false findings, but 0.05 remains the most common starting point.

One-Tailed vs. Two-Tailed Tests

The way you write H1 determines whether your test is one-tailed or two-tailed, and this affects how the math works behind the scenes.

A two-tailed test is used when H1 simply says there’s a difference, without specifying a direction. For instance, H1 might state that the average scores of two groups are not equal (≠). You’re open to the possibility that either group could score higher.

A one-tailed test is used when H1 predicts a specific direction. For example, H1 might say that Group A scores higher than Group B (>), or that a drug lowers blood pressure (<). One-tailed tests are more powerful for detecting an effect in the predicted direction, but they completely ignore effects in the opposite direction. You'd choose a one-tailed test only when you have strong reason to expect the effect will go one way.
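The one-tailed vs. two-tailed distinction maps directly onto the `alternative` parameter of SciPy’s t-test (available in SciPy 1.6+). The group data below are invented for illustration:

```python
# Comparing a two-tailed H1 (means differ, !=) with a one-tailed
# H1 (Group A scores higher, >) on the same made-up data.
from scipy import stats

group_a = [85, 88, 90, 86, 89, 87, 91, 84]
group_b = [82, 85, 83, 86, 84, 81, 85, 83]

# Two-tailed: open to a difference in either direction.
_, p_two = stats.ttest_ind(group_a, group_b, alternative="two-sided")

# One-tailed: only a difference in the predicted direction counts.
_, p_one = stats.ttest_ind(group_a, group_b, alternative="greater")

print(f"two-tailed p = {p_two:.4f}, one-tailed p = {p_one:.4f}")
```

When the observed effect lies in the predicted direction, the one-tailed p-value is exactly half the two-tailed one, which is the extra power mentioned above; the cost is that an effect in the opposite direction can never reach significance.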

What Can Go Wrong

Because you’re making a decision based on probability, there are two ways to get it wrong.

A Type I error (false positive) happens when you reject H0 even though it’s actually true. In the courtroom analogy, this is convicting an innocent person. When H0 is true, the probability of making this error equals your alpha level: with α = 0.05, you accept a 5% chance of a Type I error on any test where there is genuinely no effect.

A Type II error (false negative) happens when you fail to reject H0 even though H1 is actually true. This is like letting a guilty person go free. The probability of a Type II error is called beta (β), and it depends on factors like your sample size, how large the real effect is, and how much variability exists in your data.

These two errors pull in opposite directions. Making it harder to reject H0 (using a stricter alpha) reduces Type I errors but increases Type II errors. You’re less likely to cry wolf, but more likely to miss something real.
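A quick simulation makes the Type I error rate concrete: if you repeatedly draw both samples from the same distribution (so H0 is true by construction), a test at α = 0.05 should still reject about 5% of the time. The distribution parameters here are arbitrary:

```python
# Simulating the Type I error rate: both groups come from the SAME
# normal distribution, so every rejection is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
n_tests = 10_000

false_positives = 0
for _ in range(n_tests):
    a = rng.normal(100, 15, size=30)  # identical populations:
    b = rng.normal(100, 15, size=30)  # H0 is true by design
    _, p = stats.ttest_ind(a, b)
    if p <= alpha:
        false_positives += 1

rate = false_positives / n_tests
print(f"false positive rate: {rate:.3f}")  # close to alpha
```

Rerunning this with a stricter alpha (say 0.01) lowers the false positive rate accordingly, illustrating the Type I side of the trade-off described above.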

Rejecting H0 Doesn’t Always Mean the Effect Matters

One common misunderstanding is that rejecting H0 means you’ve found something important. That’s not necessarily true. Statistical significance and real-world significance are different things.

A study on a children’s IQ intervention illustrates this well. With a small sample of just 4 children per group, the intervention would need to boost IQ by roughly 26.5 points to reach statistical significance. But with 900 children per group, an increase of only 1.38 points would be statistically significant. The math becomes more sensitive with larger samples, so it can detect tiny differences that have no practical value.

A clinical example drives this home: imagine two cancer drugs both show statistically significant improvements in survival. Drug A extends survival by five years, while Drug B extends it by five months. Both allow you to reject H0, but only Drug A represents a clinically meaningful benefit. The p-value tells you whether a difference likely exists. It says nothing about whether that difference is large enough to care about.

This is why researchers increasingly report effect sizes alongside p-values, giving you a measure of how big the observed difference actually is, not just whether it clears a statistical threshold.
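One widely used effect size for comparing two group means is Cohen’s d, the difference between the means in units of the pooled standard deviation. SciPy has no built-in for it, so a hand-rolled sketch (with invented drug-trial data) looks like this:

```python
# Cohen's d: how large is the difference, independent of sample size?
# Data below are made up for illustration.
import math
from statistics import mean, stdev

def cohens_d(x, y):
    """Standardized mean difference using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)

drug = [78, 74, 80, 77, 79, 75, 81, 76]
placebo = [72, 70, 74, 71, 73, 69, 75, 70]

print(f"Cohen's d = {cohens_d(drug, placebo):.2f}")
```

A common rule of thumb reads d ≈ 0.2 as a small effect, 0.5 as medium, and 0.8 as large, which gives the p-value the practical context it lacks on its own.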