What Are the Steps of the Scientific Method?

The scientific method is a structured process for investigating questions about the natural world. It’s commonly presented as five to seven steps, starting with observation and ending with a conclusion, but in practice it’s more of a loop than a straight line. Here’s how each step works and why it matters.

Observation and Question

Every scientific investigation starts with noticing something. Maybe a doctor sees that patients taking a certain medication also report fewer headaches, or a biologist notices that plants near a factory grow slower than plants farther away. The observation itself doesn’t need to be dramatic. It just needs to spark a specific, answerable question: “Does this medication reduce headaches?” or “Does factory runoff slow plant growth?”

The key word is “specific.” A vague question like “Why do plants grow?” is too broad to investigate. A good scientific question narrows the focus to something you can actually test. This step also involves background research, where you look at what’s already known about the topic so you’re not repeating work that’s been done or asking a question that already has a clear answer.

Forming a Hypothesis

A hypothesis is a tentative, testable answer to your question, often described as an educated guess. It connects what you already know with what you’ve observed, and it makes a prediction you can test. For example: “Plants exposed to factory runoff will grow 30% slower than plants given clean water.”

The single most important requirement for a hypothesis is that it must be falsifiable. That means it has to be possible, at least in principle, for an experiment to prove it wrong. If no experiment could ever disprove your idea, it’s not a scientific hypothesis. “Factory runoff slows plant growth” is falsifiable because you could run an experiment and find that it doesn’t. “Plants have a life force that wants them to grow” is not falsifiable because there’s no experiment that could measure or disprove a “life force.”

Researchers also set up what’s called a null hypothesis, which is the default assumption that there is no effect or no difference. In this case, the null hypothesis would be: “Factory runoff has no effect on plant growth.” The experiment is then designed to see whether the data give you enough evidence to reject that null hypothesis.
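Written as a formula, the two competing statements for the plant example might look like the sketch below. The symbols are assumptions introduced here for illustration: \mu_{\text{runoff}} and \mu_{\text{clean}} stand for the average growth of plants given runoff and plants given clean water.

```latex
H_0:\ \mu_{\text{runoff}} = \mu_{\text{clean}}
\qquad
H_1:\ \mu_{\text{runoff}} < \mu_{\text{clean}}
```

Here H_0 is the null hypothesis (no difference in average growth) and H_1 is the prediction that runoff plants grow less.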

Designing the Experiment

This is where planning gets detailed. A well-designed experiment isolates the thing you want to test and controls everything else. Three types of variables are central to this step:

Independent variable: the factor you deliberately change. In the plant example, this is the type of water (clean vs. runoff).
Dependent variable: the outcome you measure. Here, that’s plant growth.
Control variables: everything you keep the same across groups, like soil type, sunlight, temperature, and watering schedule.

You also need a control group, a set of subjects that doesn’t receive the experimental treatment, so you have a baseline for comparison. Without one, you can’t tell whether your results came from the thing you tested or from something else entirely.
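To make the pieces concrete, here is a minimal Python sketch of the plant experiment written down as a plan. Everything in it is a hypothetical illustration: the class name `ExperimentDesign`, the four-week measurement, and the specific control values are assumptions, and random assignment is shown as one common way to form comparable groups, not the only one.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ExperimentDesign:
    independent_variable: str   # the factor you deliberately change
    levels: tuple               # the conditions being compared
    dependent_variable: str     # the outcome you measure
    control_variables: dict = field(default_factory=dict)  # held constant

    def assign(self, subject_ids, seed=0):
        """Shuffle the subjects, then deal them out evenly across the levels."""
        rng = random.Random(seed)  # fixed seed so the assignment is reproducible
        shuffled = list(subject_ids)
        rng.shuffle(shuffled)
        groups = {level: [] for level in self.levels}
        for i, subject in enumerate(shuffled):
            groups[self.levels[i % len(self.levels)]].append(subject)
        return groups


design = ExperimentDesign(
    independent_variable="water type",
    levels=("clean water", "factory runoff"),  # clean water is the control group
    dependent_variable="height gain in cm after 4 weeks",  # hypothetical measure
    control_variables={
        "soil": "same potting mix",
        "sunlight": "same window",
        "watering": "same schedule",
    },
)

groups = design.assign(range(1, 21))  # 20 hypothetical plants, numbered 1 to 20
for level, plants in groups.items():
    print(level, sorted(plants))
```

Shuffling before splitting means you don’t get to choose which plants receive runoff, so the healthier-looking seedlings can’t all end up in one group.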

One of the biggest threats to a good experiment is confounding variables, which are outside factors that could influence both the thing you’re testing and the outcome. If the plants near the factory also get less sunlight because the building casts a shadow, sunlight is a confounder. It could be the real reason growth is slower, not the runoff. Good experimental design either eliminates confounders or accounts for them in the analysis.
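The shadow example can be simulated in a few lines. The sketch below is a toy model under stated assumptions: the function `growth_cm`, the sunlight hours, and the noise level are all made up, and runoff is deliberately given no effect at all, so any gap that appears between the groups comes from sunlight alone.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is reproducible


def growth_cm(sunlight_hours, runoff):
    """Hypothetical growth model: sunlight matters, runoff has NO effect."""
    # `runoff` is ignored on purpose, so any apparent runoff effect is spurious.
    return 2.0 * sunlight_hours + rng.gauss(0, 0.5)


# Observational comparison: plants near the factory get runoff AND less sun.
near_factory = [growth_cm(sunlight_hours=4, runoff=True) for _ in range(50)]
far_away = [growth_cm(sunlight_hours=8, runoff=False) for _ in range(50)]
naive_gap = mean(far_away) - mean(near_factory)

# Controlled experiment: both groups get the same sunlight.
runoff_group = [growth_cm(sunlight_hours=6, runoff=True) for _ in range(50)]
clean_group = [growth_cm(sunlight_hours=6, runoff=False) for _ in range(50)]
controlled_gap = mean(clean_group) - mean(runoff_group)

print(f"gap without controlling sunlight: {naive_gap:.1f} cm")
print(f"gap with sunlight held constant:  {controlled_gap:.1f} cm")
```

In this toy model the first comparison shows a large gap even though runoff does nothing, and the gap all but disappears once sunlight is held constant. That is the pattern a confounder produces.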

This step matters beyond the lab too. When the FDA oversees clinical trials for a new drug, researchers must define their study plan before collecting any data. That plan specifies who qualifies to participate, how many people are needed, how long the study lasts, what dosage is used, and how bias will be minimized. The logic is the same as the plant experiment, just scaled up.

Collecting and Analyzing Data

Once the experiment runs, you gather your measurements and look for patterns. Raw data on its own rarely tells you much. You need statistical analysis to determine whether the differences you see are real or just due to random chance.

The standard tool for this is the p-value, which measures the probability of getting results at least as extreme as yours if the null hypothesis were actually true. Researchers typically set their threshold (called alpha) at 0.05 before starting the experiment. That means they’re willing to accept a 5% chance of a false positive, that is, of rejecting the null hypothesis when it is actually true. If the p-value comes in below 0.05, the result is considered statistically significant, and the null hypothesis is rejected. For studies where the stakes are higher, researchers sometimes use a stricter threshold like 0.01, which lowers the accepted false-positive rate from 5% to 1%.
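One way to see where a p-value comes from is a permutation test, sketched below using only Python’s standard library. This is one of several ways to get a p-value, not necessarily the test a given study would use. The growth numbers are invented for illustration, and the function name `permutation_p_value` is an assumption. The idea: if the null hypothesis is true, the group labels are meaningless, so you shuffle them many times and count how often chance alone produces a gap at least as large as the one you observed.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # relabel the plants at random
        gap = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if gap >= observed:
            at_least_as_extreme += 1
    # Adding 1 to both counts keeps the estimate from ever being exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)


# Hypothetical growth measurements in cm (made-up numbers).
clean = [12.1, 11.8, 12.5, 12.9, 11.6, 12.3, 12.7, 12.0]
runoff = [9.4, 10.1, 8.8, 9.9, 9.2, 10.4, 9.0, 9.7]

p = permutation_p_value(clean, runoff)
alpha = 0.05
print(f"p-value: {p:.4f}")
print("statistically significant" if p < alpha else "not statistically significant")
```

With these made-up numbers the two groups don’t overlap at all, so almost no random relabeling matches the observed gap and the p-value lands far below 0.05.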

A common misconception: the p-value is not the probability that your hypothesis is correct. It’s the probability that you’d see results this extreme (or more extreme) if nothing were actually going on. That distinction matters because a p-value of 0.02 doesn’t mean there’s a 98% chance the drug works. It means that if the drug truly had no effect, you’d only expect to see results like yours 2% of the time.
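In symbols, the distinction looks roughly like this. The notation is an assumption introduced here: T is whatever statistic you measured (such as the difference in group averages), t_{\text{obs}} is the value you actually observed, and H_0 is the null hypothesis.

```latex
p = P\bigl(T \ge t_{\text{obs}} \mid H_0\bigr)
\qquad\text{which is not the same as}\qquad
P\bigl(H_0 \mid \text{data}\bigr)
```

The p-value starts from the assumption that the null hypothesis is true and asks how surprising the data would be. It does not start from the data and ask how likely the null hypothesis is.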

Drawing a Conclusion

After analysis, you determine whether the data supports your hypothesis or not. If the results are statistically significant and your experiment was well-controlled, you can reject the null hypothesis and conclude that your independent variable likely had a real effect. If the data doesn’t reach significance, you fail to reject the null hypothesis, which is not the same as proving it true. It simply means you didn’t find enough evidence to support your prediction.
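The decision rule itself is simple enough to write as a small function. This is a minimal sketch, and the name `interpret` is a hypothetical one chosen for illustration. Notice that the function never returns “accept the null hypothesis,” because failing to reject is all the data can tell you.

```python
def interpret(p_value, alpha=0.05):
    """Translate a p-value into the wording used when drawing a conclusion."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(interpret(0.02))  # reject the null hypothesis
print(interpret(0.30))  # fail to reject the null hypothesis
```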

Either outcome is useful. A “negative” result that fails to support the hypothesis still adds to the body of knowledge by ruling out one possible explanation.

Peer Review and Replication

Science doesn’t end when one team reaches a conclusion. Before results are published in a journal, other experts in the field review the work. They evaluate whether the experimental design was sound, whether the analysis was appropriate, and whether the conclusions actually follow from the data. Based on their assessment, they recommend that the journal accept, revise, or reject the paper. An editor makes the final decision.

Even after publication, the findings aren’t treated as settled fact until other researchers can replicate them. If an independent lab follows the same methods and gets the same results, confidence in the finding grows. If nobody can reproduce the results, the original conclusion is called into question. This self-correcting feature is one of the things that separates science from other ways of knowing.

Why the Process Isn’t Really Linear

Textbooks present the scientific method as a neat sequence: observe, hypothesize, test, analyze, conclude. In reality, the process is iterative. A surprising result during data collection might send you back to redesign the experiment. Your conclusion might raise a new question that’s more interesting than the one you started with. Testing an idea about one topic can lead to an unexpected observation that takes you in an entirely different direction.

Successive investigations of the same topic often circle back to the same core question but at deeper levels. Each cycle refines understanding, and useful ideas get built upon over time. The “steps” are real, but thinking of them as a rigid checklist misses how discovery actually happens.

Hypothesis vs. Theory vs. Law

These three terms are often confused, but they describe very different things. A hypothesis is an untested prediction about a specific experiment or observation. A scientific law is a single, well-verified statement about how the universe behaves, often expressed as an equation. Newton’s law of universal gravitation is a law because it’s one statement, one equation, confirmed across a wide range of situations.

A scientific theory is something much bigger. It’s an entire framework of laws, principles, and facts that together explain a broad area of science. The theory of evolution, the theory of general relativity, and germ theory are all collections of many well-tested statements organized into a self-consistent system. Calling something “just a theory” misunderstands the term. In science, a theory is the highest level of explanation, not a guess waiting to be confirmed.