What Is a Meta-Analysis and How Does It Work?

A meta-analysis is a statistical method that combines the results of multiple independent studies on the same question to produce a single, stronger conclusion. Instead of relying on one study with 100 participants, for example, a meta-analysis might pool the results of eight studies totaling 860 participants, weighting each study’s contribution by its size and precision. This gives the analysis far more statistical power to detect real effects than any individual study could achieve on its own.

In evidence-based medicine, meta-analyses sit at the very top of the evidence pyramid, above randomized controlled trials, observational studies, and expert opinion. That ranking reflects their ability to minimize bias and deliver the most reliable conclusions available, which is why clinical guidelines rely heavily on them.

How It Differs From a Systematic Review

The terms “systematic review” and “meta-analysis” often appear together, but they’re not the same thing. A systematic review is the broader process: researchers define a question, search the literature comprehensively, screen studies for relevance, and assess quality. It’s a structured, transparent way of surveying everything that’s been published on a topic.

A meta-analysis is the optional statistical step that can happen within a systematic review. When included studies report data in comparable formats and use similar measurement scales, researchers can calculate a weighted pooled estimate: a single number that represents the combined effect across all the studies. When the studies are too different to combine mathematically, the systematic review presents findings in tables or descriptive summaries instead, an approach often called narrative synthesis. So every meta-analysis lives inside a systematic review, but not every systematic review includes a meta-analysis.

The Step-by-Step Process

Conducting a meta-analysis follows a defined sequence. It starts with developing a clear research question and eligibility criteria: what types of studies count, what populations they must include, and what outcomes matter. From there, researchers build a comprehensive search strategy across multiple databases to find every relevant study, published or unpublished.

Next comes screening. All search results are imported into reference management software, and researchers review titles, abstracts, and full texts to identify studies that meet the eligibility criteria. Once the final list of included studies is set, data extraction begins. Researchers pull the key numbers, study characteristics, and quality indicators from each paper.

Once study quality has been assessed, the statistical pooling happens. Researchers choose an appropriate model, feed the extracted data into statistical software, and the software generates the results, typically displayed as a forest plot. The final steps involve assessing the overall certainty of the evidence, interpreting the results, and drawing conclusions.
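
To make the pooling step concrete, here is a minimal sketch in Python of inverse-variance weighting, the core calculation behind most meta-analysis software. The study names and numbers are hypothetical; in practice, researchers use dedicated tools such as RevMan, R’s metafor package, or Stata rather than hand-rolled code.

```python
import math

# Hypothetical extracted data: each study's effect estimate
# (e.g., a mean difference) and its standard error.
studies = [
    ("Study A", -0.40, 0.20),
    ("Study B", -0.15, 0.12),
    ("Study C", -0.30, 0.25),
]

# Inverse-variance weighting: more precise studies (smaller SE) get more weight.
weights = [1 / se**2 for _, _, se in studies]
total_w = sum(weights)

# Weighted pooled estimate and its standard error.
pooled = sum(w * eff for w, (_, eff, _) in zip(weights, studies)) / total_w
pooled_se = math.sqrt(1 / total_w)

# 95% confidence interval using the normal approximation.
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Pooled effect: {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
```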

Reading a Forest Plot

The forest plot is the signature visual output of a meta-analysis, and understanding it makes the whole concept click. Each horizontal line on the plot represents one study. At the center of that line sits a square, which marks the study’s estimated effect. The size of the square matters: larger squares mean the study carried more weight in the analysis, usually because it had more participants or more precise results.

The horizontal line extending through each square shows the confidence interval, essentially the range within which the true effect likely falls. A short line means the study produced a precise estimate. A long line means more uncertainty.

At the bottom of the plot sits a diamond. This is the overall pooled result, the combined conclusion from all included studies. The center of the diamond is the best estimate of the true effect, and the width of the diamond represents the confidence interval for that combined result. A narrow diamond means the evidence points in a fairly precise direction. A wide one means there’s still meaningful uncertainty even after combining everything.
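
To make those elements tangible, here is a rough matplotlib sketch that draws each piece described above: one weight-sized square per study, a horizontal confidence-interval line through it, and a pooled diamond at the bottom. All the numbers are invented for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical studies: (name, effect, CI lower, CI upper, weight %)
studies = [
    ("Study A", -0.40, -0.79, -0.01, 18),
    ("Study B", -0.15, -0.39,  0.09, 52),
    ("Study C", -0.30, -0.79,  0.19, 30),
]
pooled = (-0.23, -0.41, -0.05)  # (estimate, CI lower, CI upper)

fig, ax = plt.subplots(figsize=(6, 3))
for i, (name, eff, lo, hi, w) in enumerate(studies):
    y = len(studies) - i                                    # stack top to bottom
    ax.plot([lo, hi], [y, y], color="black")                # confidence interval
    ax.scatter(eff, y, marker="s", s=w * 6, color="black")  # weight-sized square
    ax.text(-1.15, y, name, va="center")

# The pooled result is a diamond whose width spans its confidence interval.
est, lo, hi = pooled
ax.fill([lo, est, hi, est], [0, 0.15, 0, -0.15], color="black")
ax.text(-1.15, 0, "Pooled", va="center")

ax.axvline(0, linestyle="--", color="gray")  # line of no effect
ax.set_xlim(-1.25, 0.6)
ax.set_yticks([])
ax.set_xlabel("Effect size (mean difference)")
plt.tight_layout()
plt.show()
```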

How Results Are Measured

Meta-analyses express their findings using standardized effect size measures, and the choice depends on the type of data. When studies compare outcomes between two groups (like a treatment group and a placebo group), the results might be reported as an odds ratio or a risk ratio, both of which capture how much more or less likely an outcome is in one group compared to the other. When studies measure something on a continuous scale, like pain scores or blood pressure, the pooled result is often reported as a mean difference.
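
As a quick illustration with made-up numbers, here is how those three measures fall out of the raw study data:

```python
# Hypothetical binary outcomes: events and group sizes.
treat_events, treat_total = 12, 100
ctrl_events, ctrl_total = 24, 100

# Risk ratio: probability of the outcome in one group vs. the other.
risk_ratio = (treat_events / treat_total) / (ctrl_events / ctrl_total)  # 0.50

# Odds ratio: odds of the outcome in one group vs. the other.
odds_treat = treat_events / (treat_total - treat_events)
odds_ctrl = ctrl_events / (ctrl_total - ctrl_events)
odds_ratio = odds_treat / odds_ctrl  # (12/88) / (24/76) = 0.43

# For continuous outcomes, the analogue is a simple mean difference.
mean_treat, mean_ctrl = 3.1, 4.0          # e.g., pain scores
mean_difference = mean_treat - mean_ctrl  # -0.9 points

print(f"RR = {risk_ratio:.2f}, OR = {odds_ratio:.2f}, MD = {mean_difference:.1f}")
```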

Some of these measures need mathematical transformations before they can be properly combined. Odds ratios, for instance, are typically converted to a logarithmic scale before pooling because this makes their statistical behavior more predictable. Correlation coefficients get a similar treatment. These are technical details handled by the software, but they’re part of why meta-analyses require statistical expertise to conduct properly.
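
A brief sketch of the two transformations just mentioned, again with hypothetical numbers: odds ratios are pooled on the log scale, and correlations are converted with Fisher’s z.

```python
import math

# Log transform for an odds ratio from a 2x2 table
# (a, b = treatment events / non-events; c, d = control events / non-events).
a, b, c, d = 12, 88, 24, 76
log_or = math.log((a * d) / (b * c))
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # standard error of log(OR)
# Pooling happens on this scale; the result is exponentiated at the end.
print(f"log(OR) = {log_or:.3f}, SE = {se_log_or:.3f}")

# Fisher's z transformation for a correlation coefficient.
r, n = 0.45, 50
z = math.atanh(r)            # z = 0.5 * ln((1 + r) / (1 - r))
se_z = 1 / math.sqrt(n - 3)  # SE depends only on sample size
print(f"Fisher z = {z:.3f}, SE = {se_z:.3f}")
```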

Fixed-Effects vs. Random-Effects Models

One of the key decisions in any meta-analysis is which statistical model to use. A fixed-effects model assumes that every included study is estimating the exact same underlying effect, and any differences between results are just due to chance. A random-effects model assumes the true effect might genuinely vary from study to study, perhaps because of differences in populations, settings, or how treatments were delivered.

The random-effects model is more commonly appropriate because it accounts for this real-world variability between studies. However, when there are very few studies available, estimating that between-study variability becomes unreliable, and a fixed-effects model may be the better choice. A fixed-effects model also gives proportionally more weight to larger, more precise studies, which can be an advantage when the included studies vary significantly in size.
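
To show the mechanical difference between the two models, here is a minimal sketch using the DerSimonian-Laird estimator of between-study variance, one common choice among several; the data are hypothetical.

```python
import math

# Hypothetical study effects and standard errors.
effects = [-0.40, -0.15, -0.30]
ses = [0.20, 0.12, 0.25]

# Fixed-effects model: weights are purely inverse-variance.
w_fixed = [1 / se**2 for se in ses]
fixed = sum(w * e for w, e in zip(w_fixed, effects)) / sum(w_fixed)

# DerSimonian-Laird estimate of tau^2, the between-study variance.
q = sum(w * (e - fixed) ** 2 for w, e in zip(w_fixed, effects))  # Cochran's Q
df = len(effects) - 1
c = sum(w_fixed) - sum(w**2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (q - df) / c)

# Random-effects model: tau^2 is added to every study's variance,
# which flattens the weights across studies.
w_random = [1 / (se**2 + tau2) for se in ses]
random_pooled = sum(w * e for w, e in zip(w_random, effects)) / sum(w_random)

print(f"Fixed-effects estimate:  {fixed:.3f}")
print(f"Random-effects estimate: {random_pooled:.3f} (tau^2 = {tau2:.4f})")
```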

How Publication Bias Is Detected

One of the biggest threats to a meta-analysis is publication bias: the tendency for studies with positive or dramatic findings to get published while studies with null or negative results sit in a file drawer. If a meta-analysis only captures published studies, its pooled result can be misleadingly optimistic.

Researchers check for this using a funnel plot. Each study is plotted with its effect size on the horizontal axis and a measure of its precision (typically standard error) on the vertical axis. In the absence of bias, the plot should look like a symmetrical inverted funnel. Large, precise studies cluster near the top, and smaller studies scatter more widely but evenly on both sides. When the funnel looks lopsided, with small studies clustering on one side, that asymmetry suggests certain results may be missing from the literature. Standard error works best for the vertical axis because the expected shape without bias is cleanly symmetrical, straight lines can mark 95% confidence boundaries, and the plot naturally highlights the smaller studies most prone to bias.
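
Here is a rough matplotlib sketch of a funnel plot, with the conventional inverted vertical axis and straight pseudo 95% confidence limits fanning out from the pooled estimate; the study points are invented for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical study effects and standard errors.
effects = np.array([-0.23, -0.10, -0.35, -0.05, -0.55, -0.48])
ses = np.array([0.05, 0.10, 0.15, 0.20, 0.25, 0.30])
pooled = -0.23  # pooled estimate from the meta-analysis

fig, ax = plt.subplots(figsize=(5, 4))
ax.scatter(effects, ses, color="black")

# Pseudo 95% confidence limits: straight lines widening as SE grows,
# tracing the funnel shape expected in the absence of bias.
se_range = np.linspace(0, ses.max() * 1.1, 50)
ax.plot(pooled - 1.96 * se_range, se_range, "--", color="gray")
ax.plot(pooled + 1.96 * se_range, se_range, "--", color="gray")
ax.axvline(pooled, color="gray")

ax.invert_yaxis()  # most precise studies sit at the top
ax.set_xlabel("Effect size")
ax.set_ylabel("Standard error")
plt.tight_layout()
plt.show()
```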

Reporting Standards

To keep meta-analyses transparent and reproducible, the research community follows the PRISMA guidelines (Preferred Reporting Items for Systematic Reviews and Meta-Analyses). First published in 2009 and updated in 2020, PRISMA provides a 27-item checklist covering everything from how studies were identified and selected to the characteristics of included studies and the statistical results. It also includes standardized flow diagrams showing how many studies were found, screened, excluded, and ultimately included. The goal is straightforward: readers should be able to see exactly what was done, why, and how, so they can judge the quality of the evidence for themselves.

Strengths and Limitations

The core advantage of a meta-analysis is statistical power. By pooling participants across studies, researchers can detect effects that individual studies were too small to find on their own. This is especially valuable when studying rare outcomes or modest treatment effects. Meta-analyses also provide a single, quantified summary of the best available evidence, which is more useful for decision-making than trying to mentally weigh a dozen studies with conflicting results.

The most important limitation is often summarized as “garbage in, garbage out.” A meta-analysis is only as good as the studies it includes. If the underlying trials were poorly designed or biased, pooling them together doesn’t fix those problems. It amplifies them. Another common pitfall is combining studies that are too different from each other, sometimes described as comparing apples to oranges. If one study tested a drug in elderly patients and another tested a different dose in young adults, combining their results into a single number can be misleading rather than informative. Researchers assess this heterogeneity statistically, but judgment calls about which studies are similar enough to combine remain a source of legitimate debate.
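
As a sketch of what that statistical assessment looks like, here is the I² statistic, which estimates the share of variability due to genuine between-study differences rather than chance; the numbers are hypothetical.

```python
# Cochran's Q and I^2 from hypothetical effects and standard errors.
effects = [-0.40, -0.15, -0.30]
ses = [0.20, 0.12, 0.25]

weights = [1 / se**2 for se in ses]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1
i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Rough convention: ~25% low, ~50% moderate, ~75% high heterogeneity.
print(f"Q = {q:.2f}, I^2 = {i_squared:.0f}%")
```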