A forest plot is a graph that displays results from multiple studies side by side, then combines them into a single pooled estimate. You need just four columns of data to build one: a study label, an effect estimate, and the lower and upper bounds of the confidence interval. Once you have that, you can create the plot in Excel, R, or dedicated meta-analysis software in a matter of minutes.
What a Forest Plot Actually Shows
Each study appears as a single row on the plot. A square marks the study’s point estimate, which is the best guess of the true effect in that study’s population. A horizontal line extends through the square on both sides, representing the 95% confidence interval. The wider that line, the less precise the study’s estimate. The size of the square is proportional to the study’s weight: larger studies that contribute more information get bigger squares.
At the bottom sits a diamond. This represents the overall pooled effect from all included studies. The center of the diamond is the combined point estimate, and its width shows the confidence interval for that overall effect. Because it draws on all the studies together, the diamond is almost always narrower than any individual study’s confidence interval.
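The pooling behind the diamond can be sketched in a few lines of base R using fixed-effect inverse-variance weighting (the numbers below are hypothetical; real software also offers random-effects weighting):

```r
# Hypothetical mean differences and standard errors from three studies
est <- c(1.2, 0.8, 1.5)
se  <- c(0.40, 0.25, 0.60)

w <- 1 / se^2                       # inverse-variance weights (bigger study, bigger square)
pooled    <- sum(w * est) / sum(w)  # combined point estimate (center of the diamond)
pooled_se <- sqrt(1 / sum(w))       # smaller than any individual se, hence the narrower diamond
ci <- pooled + c(-1.96, 1.96) * pooled_se
round(c(pooled, ci), 3)
```

Note that `pooled_se` here (about 0.20) is smaller than even the most precise study’s standard error (0.25), which is why the diamond’s interval is narrower than any single study’s.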
A vertical reference line (sometimes called the “line of no effect”) runs through the plot at 0 for mean differences or at 1 for ratios like risk ratios and odds ratios. If a study’s 95% confidence interval crosses that line, the result is not statistically significant at the conventional 0.05 level. If the diamond crosses the line, the overall pooled result is not significant either.
Data You Need Before Starting
Your spreadsheet or data file needs one row per study with these columns:
- Study label: typically the first author’s name and publication year (e.g., “Smith 2019”).
- Effect estimate: the numeric result, such as a risk ratio, odds ratio, or mean difference.
- Lower confidence interval bound: the lower limit of the 95% CI.
- Upper confidence interval bound: the upper limit of the 95% CI.
- Weight (optional): a percentage or numeric value that determines how large each square appears. Most software calculates this automatically from the study’s sample size and variance.
If you’re working from published papers that only report a point estimate and a p-value, you’ll need to back-calculate the confidence interval or contact the study authors. The plot can’t be drawn without interval bounds.
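The back-calculation mentioned above uses a standard normal approximation: the two-sided p-value implies a z statistic, which recovers the standard error and hence the interval. A base-R sketch with hypothetical numbers (work on the log scale for ratio measures):

```r
# Recover a 95% CI from a point estimate and a two-sided p-value,
# assuming a normal approximation. Hypothetical risk ratio of 0.75.
est <- log(0.75)                       # ratios are handled on the log scale
p   <- 0.03
z   <- qnorm(1 - p / 2)                # z statistic implied by the p-value
se  <- abs(est) / z                    # back-calculated standard error
ci  <- exp(est + c(-1.96, 1.96) * se)  # bounds transformed back to the ratio scale
round(ci, 2)                           # → 0.58 0.97
```

As a sanity check, the upper bound stays below 1, which is consistent with a p-value under 0.05.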
Choosing the Right Scale
The type of effect measure determines your x-axis scale. When your outcome is a mean difference (continuous data), use a standard linear scale with the reference line at 0. When your outcome is a ratio measure like a risk ratio or odds ratio, use a logarithmic scale with the reference line at 1. The log scale matters because ratios are asymmetric on a linear axis: a risk ratio of 0.5 (halving risk) and 2.0 (doubling risk) represent the same magnitude of effect in opposite directions, but on a linear scale they’d look unequal. A log scale spaces them symmetrically.
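The asymmetry is easy to verify numerically: on a linear axis the distances from the null are unequal, while on a log axis they match exactly.

```r
# RR = 0.5 sits 0.5 below the null (1); RR = 2 sits 1.0 above it.
# On a log axis both are the same distance from log(1) = 0.
rr <- c(0.5, 2.0)
abs(rr - 1)    # linear distances from the null: 0.5 and 1.0
abs(log(rr))   # log distances: both 0.693
```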
Most dedicated software handles this automatically. In Excel, you’ll need to set this up yourself.
Building a Forest Plot in Excel
Excel doesn’t have a forest plot chart type, but you can build one from a scatter plot with custom error bars. Here’s the process:
Set up your spreadsheet with columns for study name, effect estimate, lower CI bound, and upper CI bound. Then create two additional columns: one for the distance from the point estimate down to the lower bound, and one for the distance up to the upper bound. These become your error bar values. Assign each study a numeric Y-axis value (1, 2, 3, etc.) so the studies stack vertically.
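The two helper columns are simple subtractions; in Excel they are formulas like estimate minus lower bound. The same preparation, sketched in R with hypothetical values:

```r
# Hypothetical extracted data; the two "distance" columns feed Excel's
# custom error bars (negative and positive values, respectively).
dat <- data.frame(
  study = c("Smith 2019", "Lee 2020", "Patel 2021"),
  est   = c(0.9, 1.4, 1.1),
  lo    = c(0.6, 1.0, 0.8),
  hi    = c(1.3, 2.0, 1.5)
)
dat$dist_lower <- dat$est - dat$lo   # length of the left arm of the error bar
dat$dist_upper <- dat$hi - dat$est   # length of the right arm
dat$y <- seq_len(nrow(dat))          # numeric row positions so studies stack vertically
dat
```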
Insert a scatter plot. Your X values are the effect estimates and your Y values are the row numbers. Then add custom horizontal error bars: in Excel’s Chart Design tab under Add Chart Element (or by right-clicking the data series), choose “Error Bars,” select “Custom,” and assign your lower-distance column to the negative values and upper-distance column to the positive values. Remove the vertical error bars entirely.
Format the plot by replacing the Y-axis numeric labels with your study names, adding a vertical reference line at 0 or 1, and removing gridlines. You won’t get automatic weight-based square sizing in Excel, so the squares will all be uniform unless you overlay additional data series with varying marker sizes. It’s workable for simple plots but tedious for anything complex.
Building a Forest Plot in R
R is the most common tool for forest plots because packages handle the formatting automatically. The two main options are the metafor package and the forestplot package.
With metafor, you first fit a meta-analysis model using the rma() function, then pass the result directly to forest(). A minimal example looks like this:
library(metafor)
res <- rma(yi, sei=sei, data=mydata)  # yi = observed effect sizes, sei = their standard errors
forest(res)
That single forest() call generates a complete plot with study labels, point estimates, confidence intervals, weights, and the summary diamond. The function accepts dozens of optional arguments for customization. Setting showweights=TRUE adds a column displaying each study’s percentage weight. The order argument lets you sort studies by effect size (order="obs"), precision (order="prec"), or fitted values (order="fit"), among other options. The header=TRUE default adds column headings automatically.
For ratio measures, use the atransf=exp argument so the axis labels display on the original ratio scale while the underlying calculations stay on the log scale. This keeps the confidence intervals visually symmetric. The refline argument controls where the vertical reference line falls; its default of 0 is already correct for log-transformed ratios, since log(1) = 0, but for other scales set it to the appropriate null value.
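Pulling these options together, a customized call for ratio data might look like the sketch below (it assumes the metafor package is installed and that mydata holds log risk ratios in yi with standard errors in sei, as in the minimal example above):

```r
library(metafor)

# Assumes mydata contains log risk ratios (yi) and their standard errors (sei)
res <- rma(yi, sei = sei, data = mydata)

forest(res,
       atransf = exp,        # label the axis on the original ratio scale
       refline = 0,          # log(1) = 0 is the line of no effect
       showweights = TRUE,   # add a column of percentage weights
       order = "obs")        # sort studies by observed effect size
```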
The standalone forestplot package is an alternative if you’re not running a full meta-analysis model. It works directly from a data frame of estimates and CIs, giving you more control over layout and text columns but requiring more manual setup.
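A minimal forestplot call, sketched with hypothetical estimates (the package must be installed from CRAN; argument names follow its classic interface):

```r
library(forestplot)

# forestplot() draws directly from vectors of point estimates and CI
# bounds, with no pooling model required.
forestplot(labeltext = c("Smith 2019", "Lee 2020", "Patel 2021"),
           mean  = c(0.9, 1.4, 1.1),
           lower = c(0.6, 1.0, 0.8),
           upper = c(1.3, 2.0, 1.5),
           zero  = 1,        # reference line for ratio measures
           xlog  = TRUE)     # log-scaled x-axis
```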
Adding Heterogeneity Statistics
A forest plot typically reports heterogeneity metrics below the diamond to tell readers how consistent the results are across studies. The most common metric is I-squared, which describes the percentage of variability in results that comes from real differences between studies rather than chance.
A rough guide (the ranges deliberately overlap): 0% to 40% may be unimportant, 30% to 60% may represent moderate heterogeneity, 50% to 90% substantial, and 75% to 100% considerable heterogeneity. When I-squared is high, the pooled estimate in the diamond is less reliable because the underlying studies are measuring meaningfully different effects. Most software places the I-squared value, along with the result of Cochran’s Q test (a chi-squared test for heterogeneity), in the bottom margin of the plot automatically.
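I-squared has a simple closed form in terms of Cochran’s Q and the number of studies, which base R can compute directly (hypothetical Q value below):

```r
# I2 = max(0, (Q - df) / Q) * 100, where df = k - 1 for k studies
Q  <- 18.4   # hypothetical Q statistic reported by the software
k  <- 8      # number of studies
df <- k - 1
I2 <- max(0, (Q - df) / Q) * 100
round(I2, 1)   # → 62
```

The max(0, …) floor matters: when Q falls below its degrees of freedom, I-squared is reported as 0% rather than a negative number.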
Formatting for Publication
The PRISMA 2020 guidelines, which govern how systematic reviews are reported, require authors to describe the graphical methods used and the data presented. In practice, this means your forest plot should clearly label which effect measure it displays, include annotations showing each study’s numeric estimate and confidence interval, and identify the model used for pooling (fixed-effect or random-effects).
Ordering studies thoughtfully makes a difference. Alphabetical ordering by author name is the most common approach but also the least informative. Ordering by effect size, study weight, or risk of bias can reveal patterns that alphabetical sorting hides. For instance, sorting from lowest to highest risk of bias puts the most trustworthy evidence at the top, where readers look first.
Label each side of the x-axis with a “favors” direction (e.g., “Favors treatment” on the left, “Favors control” on the right). This detail is surprisingly often omitted, but without it, readers have to work out the directionality themselves. Include the number of participants or events for each study in an adjacent column when possible, since this helps readers gauge why certain studies received more weight.
Common Mistakes to Avoid
Using a linear scale for ratio data is one of the most frequent errors. It makes small protective effects look trivially close to the null while making harmful effects look disproportionately large. Always use a log scale for odds ratios and risk ratios.
Omitting the reference line, or placing it at the wrong value, is another easy mistake. For mean differences, the line goes at 0. For ratios, it goes at 1. Without this line, readers cannot visually assess statistical significance.
Displaying all studies with equal-sized squares ignores the weighting that makes a meta-analysis meaningful. A tiny pilot study with 20 participants should not carry the same visual prominence as a large trial with 5,000 participants. If your software doesn’t automatically scale square sizes by weight, add the weight values manually and adjust marker sizes to reflect them.
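If you do have to size markers manually, percentage weights fall straight out of the standard errors, and since marker area (not side length) should track weight, side length scales with the square root. A base-R sketch with hypothetical values:

```r
# Percentage weights from standard errors; square AREA should be
# proportional to weight, so side length scales with sqrt(weight).
se  <- c(0.50, 0.10, 0.25)       # hypothetical standard errors
w   <- 1 / se^2                  # inverse-variance weights
pct <- 100 * w / sum(w)          # percentage weight per study
cex <- sqrt(pct / max(pct)) * 2  # relative marker sizes for plotting
round(pct, 1)                    # → 3.3 83.3 13.3
```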
Finally, presenting a forest plot without reporting heterogeneity is incomplete. A tidy-looking diamond means little if the studies behind it are wildly inconsistent. Always include I-squared and consider adding prediction intervals, which show the range of effects you’d expect in a future study. In metafor, this is as simple as setting addpred=TRUE.

