A systematic review follows a structured, transparent process to find, evaluate, and synthesize all available research on a specific question. The average systematic review takes about 18 months from start to finish, according to Cochrane, and involves distinct phases: defining your question, registering a protocol, searching the literature, screening studies, extracting data, assessing quality, and synthesizing results. Each phase has established standards designed to minimize bias and make your work reproducible.
Define a Focused Research Question
Everything in a systematic review flows from the research question, and the most widely used tool for building one is the PICO framework. PICO breaks your question into four components: Population (who you’re studying), Intervention (what treatment or exposure you’re examining), Comparator (what you’re comparing it against), and Outcome (what result you’re measuring). For example, combining “general population” as the population, “fish oil supplements” as the intervention, “isocaloric fat placebo” as the comparator, and “all-cause mortality” as the outcome produces a precise, answerable question about whether fish oil supplements reduce death rates compared to placebo.
A vague question leads to an unmanageable search. A question that’s too narrow may return almost no studies. Spend time refining each PICO element before moving forward, because changing your question later means restarting your search strategy from scratch.
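If it helps to make the structure concrete, the four PICO elements map naturally onto a small data structure. Here's a minimal Python sketch (the class and field names are illustrative, not part of any formal standard), using the fish oil question from above:

```python
# Sketch of a PICO question as a simple data structure.
# The example values mirror the fish-oil question from the text;
# the class itself is illustrative, not a standard schema.
from dataclasses import dataclass

@dataclass
class PICOQuestion:
    population: str
    intervention: str
    comparator: str
    outcome: str

    def as_question(self) -> str:
        """Render the four elements as one answerable question."""
        return (f"Among {self.population}, does taking {self.intervention} "
                f"versus {self.comparator} affect {self.outcome}?")

question = PICOQuestion(
    population="the general population",
    intervention="fish oil supplements",
    comparator="an isocaloric fat placebo",
    outcome="all-cause mortality",
)
print(question.as_question())
```

Writing the question down in a fixed structure like this makes it obvious when one element is still vague, which is exactly the refinement step described above.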
Register a Protocol Before You Start
Prospective registration means publishing the details of your review plan before you begin collecting or analyzing data. This serves two purposes: it allows others to verify that you followed your stated methods, and it prevents unintended duplication of effort by making your planned review visible to other researchers.
Several platforms accept protocol registrations. PROSPERO, the most widely used registry for health-related reviews, requires 28 mandatory fields covering your title, authors, planned methods, and contact details, plus 12 optional fields. Other options include Research Registry (28 mandatory fields, 6 optional), INPLASY (24 mandatory, 9 optional), and OSF Registries, which has a lighter registration with as few as 5 mandatory fields depending on the template you choose. Your protocol should specify your research question, eligibility criteria, search strategy, screening process, data extraction plan, and how you intend to synthesize results. Think of it as a contract with yourself and your readers about what you planned to do before the data influenced your decisions.
Build and Execute a Search Strategy
Your search strategy needs to be comprehensive enough to capture all relevant studies and specific enough to keep the volume manageable. Start by translating each PICO element into search terms, including synonyms, alternate spellings, and related concepts. Combine these using Boolean operators (AND, OR, NOT) to construct search strings tailored to each database you plan to search.
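As a concrete illustration, here's a minimal Python sketch of assembling a Boolean search string from per-concept synonym lists. The terms are illustrative, and real strategies add database-specific syntax such as field tags and MeSH headings:

```python
# Build a Boolean search string from per-concept synonym lists.
# Synonyms within a concept are OR'd together; concepts are AND'd.
# Terms are illustrative; real strategies add field tags and
# controlled vocabulary (e.g. MeSH) specific to each database.

def or_block(terms):
    """Join one concept's synonyms into a parenthesized OR block."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

concepts = {
    "intervention": ["fish oil", "omega-3", "n-3 fatty acids"],
    "outcome": ["mortality", "death", "survival"],
}

search_string = " AND ".join(or_block(terms) for terms in concepts.values())
print(search_string)
```

The same concept lists can then be re-rendered per database, which keeps the logic of the strategy identical even when the syntax differs.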
Run your search across multiple databases. For biomedical topics, this typically means PubMed/MEDLINE, Embase, and the Cochrane Central Register of Controlled Trials at minimum. Many review teams also search CINAHL, PsycINFO, or discipline-specific databases depending on the topic. Working with a research librarian at this stage is common and often dramatically improves the quality of your search.
Grey literature matters too. Limiting your review to published journal articles introduces publication bias, since studies with positive results are more likely to get published. Grey literature sources include theses and dissertations, government reports, conference papers, committee reports, clinical trial registries like ClinicalTrials.gov, and ongoing research databases. The CDC notes that literature searches alone can take anywhere from a few weeks to several months, depending on the topic’s breadth.
Screen Studies Systematically
Screening happens in two rounds. First, you review titles and abstracts to remove obviously irrelevant records. Second, you retrieve the full text of remaining studies and assess them against your predefined eligibility criteria. At least two reviewers should screen independently at each stage, then compare decisions and resolve disagreements through discussion or a third reviewer.
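One common (though not required) way to quantify how well two independent screeners agree before resolving disagreements is Cohen's kappa. Here's a minimal Python sketch with hypothetical counts:

```python
# Cohen's kappa for agreement between two independent screeners
# on include/exclude decisions. All counts are hypothetical.

def cohens_kappa(both_include, both_exclude, only_a, only_b):
    """Kappa from a 2x2 table of two reviewers' include/exclude calls."""
    n = both_include + both_exclude + only_a + only_b
    observed = (both_include + both_exclude) / n
    # Chance agreement from each reviewer's marginal include rate.
    a_inc = (both_include + only_a) / n
    b_inc = (both_include + only_b) / n
    expected = a_inc * b_inc + (1 - a_inc) * (1 - b_inc)
    return (observed - expected) / (1 - expected)

# Hypothetical round: 40 joint includes, 520 joint excludes,
# 25 records included only by reviewer A, 15 only by reviewer B.
print(round(cohens_kappa(40, 520, 25, 15), 2))
```

A low kappa early in title/abstract screening is usually a signal to revisit and tighten the eligibility criteria before continuing.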
Several software tools can streamline this process. Covidence handles independent title/abstract screening, full-text screening, data extraction, and risk of bias assessment in one platform. Rayyan is a free tool that uses tagging and filtering to organize references during screening. DistillerSR manages literature collection, screening, and assessment together. SUMARI supports ten different review types and covers the entire process from protocol development through report writing. These tools track reviewer decisions automatically, which makes the process auditable.
You’ll document your screening results in a PRISMA flow diagram. The 2020 version requires you to report the number of records identified, records excluded before screening (duplicates or those flagged by automation tools), records excluded after title/abstract screening, reports retrieved for full-text evaluation, reports that couldn’t be retrieved, reports excluded at full-text stage with reasons (such as ineligible study design or population), and the final number of included studies. If you used automation tools at any stage, the diagram should indicate how many records were excluded by a human versus by software.
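Because every count in the flow diagram has to reconcile with the others, it's worth sanity-checking the arithmetic before drawing it. Here's a minimal Python sketch with hypothetical counts, using field names that mirror the items above:

```python
# Sanity-check the arithmetic of a PRISMA 2020 flow diagram.
# All counts are hypothetical; field names mirror the items
# listed in the text (identified, duplicates, excluded, included).

flow = {
    "records_identified": 1450,
    "duplicates_removed": 320,
    "excluded_title_abstract": 1010,
    "fulltext_sought": 120,
    "fulltext_not_retrieved": 5,
    "excluded_fulltext": 85,  # with reasons, e.g. wrong design/population
    "studies_included": 30,
}

screened = flow["records_identified"] - flow["duplicates_removed"]
assert screened - flow["excluded_title_abstract"] == flow["fulltext_sought"]

assessed = flow["fulltext_sought"] - flow["fulltext_not_retrieved"]
assert assessed - flow["excluded_fulltext"] == flow["studies_included"]

print(f"{flow['studies_included']} studies included from {screened} screened")
```

If an assertion fails, a record was lost or double-counted somewhere in the screening log, which is far easier to fix now than at peer review.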
Extract Data From Included Studies
Data extraction means pulling specific, predetermined information from each study into a standardized form. At minimum, you’ll collect study characteristics: authors, year, country, study design, sample size, population details, intervention and comparator details, outcome measures, and key results. Two reviewers should extract data independently to catch errors.
For each included study, you’ll need to record summary statistics for each group and an effect estimate with its precision, typically a confidence interval. If you’re reviewing studies that compare an intervention against a control using a binary outcome (like recovery versus no recovery), that usually means an effect measure such as a risk ratio or odds ratio from each study individually before combining them.
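To make "an effect estimate with its precision" concrete for a binary outcome, here's a minimal Python sketch computing a risk ratio with a standard log-scale Wald confidence interval. The trial counts are hypothetical:

```python
# Risk ratio and 95% CI from a single study's 2x2 table,
# using the standard log-RR normal approximation.
import math

def risk_ratio_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Risk ratio (treatment vs control) with a Wald 95% CI on the log scale."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1/events_t - 1/n_t + 1/events_c - 1/n_c)
    lo = math.exp(math.log(rr) - z * se_log)
    hi = math.exp(math.log(rr) + z * se_log)
    return rr, lo, hi

# Hypothetical trial: 30/200 events on treatment, 45/200 on control.
rr, lo, hi = risk_ratio_ci(30, 200, 45, 200)
print(f"RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

The log-scale estimate and its standard error are exactly the inputs a later meta-analysis will need, so it pays to extract them consistently for every study.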
Design your extraction form before you start, and pilot it on two or three studies to identify missing fields or ambiguous categories. Most review management software includes built-in extraction templates, or you can build your own in a spreadsheet.
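A simple way to pilot an extraction form is to define the field list explicitly and check each record for completeness. Here's a minimal Python sketch; the field names mirror the study characteristics listed above:

```python
# A minimal extraction form as a field list plus a completeness check.
# Field names mirror the study characteristics listed in the text.

EXTRACTION_FIELDS = [
    "authors", "year", "country", "study_design", "sample_size",
    "population", "intervention", "comparator", "outcome_measures",
    "key_results",
]

def missing_fields(record: dict) -> list:
    """Return any predefined fields that are absent or left blank."""
    return [f for f in EXTRACTION_FIELDS if not record.get(f)]

# A hypothetical, half-completed pilot record.
record = {
    "authors": "Smith et al.",
    "year": 2021,
    "country": "UK",
    "study_design": "RCT",
    "sample_size": 400,
}
print(missing_fields(record))
```

Running a check like this on your first two or three pilot extractions surfaces missing fields and ambiguous categories before they propagate through the whole set of included studies.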
Assess the Quality of Each Study
Not all studies are created equal, and your review needs to account for that. For randomized trials, the most widely used tool is the Cochrane Risk of Bias 2 (RoB 2) framework, which evaluates each study across five domains:
- Randomization process: whether participants were properly randomized and whether allocation was concealed
- Deviations from intended interventions: whether participants or researchers deviated from the assigned treatment in ways that could affect results
- Missing outcome data: whether dropout rates or lost data could have skewed results
- Measurement of the outcome: whether the outcome was measured consistently and without knowledge of treatment assignment
- Selection of the reported result: whether the authors reported all planned analyses or selectively reported favorable ones
For observational studies, different tools apply, such as the Newcastle-Ottawa Scale. The key principle is the same: rate each study’s risk of bias before synthesizing results, so you can weight strong evidence more heavily or conduct sensitivity analyses excluding high-risk studies.
Synthesize the Evidence
Synthesis is where you combine results across studies, and you have two main approaches: meta-analysis (statistical pooling) or narrative synthesis (qualitative summary). The choice depends on whether your included studies are similar enough to combine meaningfully.
A meta-analysis is possible when you have association estimates from at least two studies, along with their standard errors or 95% confidence intervals. But statistical feasibility alone doesn’t make a meta-analysis appropriate. You should avoid pooling results when the number of studies is too small, when essential information for combining results is missing, or when the studies are too different from each other in terms of populations, interventions, or outcomes.
If you do proceed with a meta-analysis, you’ll need to assess how much the results vary across studies. Two common measures are the Cochran Q test, which checks whether differences between studies are statistically significant, and the I² statistic, which quantifies the proportion of variation due to real differences rather than chance. The Cochrane Collaboration classifies I² values of 0 to 40% as likely unimportant heterogeneity, 30 to 60% as moderate, 50 to 90% as substantial, and 75 to 100% as considerable. These ranges overlap intentionally, because determining whether heterogeneity is “too high” involves subjective judgment about clinical and methodological differences between studies, not just a statistical cutoff.
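To see how these measures come out of the study-level data, here's a minimal Python sketch of inverse-variance fixed-effect pooling with Cochran's Q and I², using hypothetical log risk ratios and standard errors. A real analysis would typically use a dedicated meta-analysis package and also consider a random-effects model:

```python
# Inverse-variance fixed-effect pooling with Cochran's Q and I²,
# on hypothetical log risk ratios and standard errors from four studies.
import math

def pool_fixed_effect(estimates, std_errors):
    """Return (pooled log estimate, Q statistic, I² in percent)."""
    weights = [1 / se**2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Q: weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I²: share of variation beyond chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2

log_rr = [-0.60, -0.10, -0.90, 0.20]   # hypothetical study effects
se = [0.21, 0.18, 0.30, 0.25]

pooled, q, i2 = pool_fixed_effect(log_rr, se)
print(f"pooled RR = {math.exp(pooled):.2f}, Q = {q:.2f}, I² = {i2:.0f}%")
```

In this made-up example I² lands around 73%, which falls in both the "substantial" and "considerable" Cochrane ranges, illustrating why the interpretation depends on clinical judgment rather than the number alone.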
When meta-analysis isn’t feasible or meaningful, narrative synthesis involves summarizing findings in structured text and tables, organizing results by subgroup, outcome, or study design, and describing the direction and consistency of effects across studies. A narrative synthesis is not a lesser product. It’s the appropriate choice when the evidence base is too diverse to pool statistically.
Report Your Review Using PRISMA
The PRISMA 2020 statement is the standard reporting guideline for systematic reviews. It provides a checklist of items your final manuscript should include, covering everything from how you framed your question and searched for studies to how you assessed bias and synthesized results. Following PRISMA doesn’t change how you conduct the review, but it ensures you report every step transparently enough for someone else to evaluate or replicate your work.
Your final report should include a complete bibliography of included studies, the PRISMA flow diagram documenting your screening process, summary tables of study characteristics, individual study results with effect estimates, and (if applicable) pooled results from meta-analysis with forest plots. Most journals that publish systematic reviews expect PRISMA compliance and will ask you to submit the completed checklist alongside your manuscript.
Practical Tips for Managing the Process
With an 18-month average timeline, keeping a systematic review on track requires planning. Assemble your team early: you need at least two reviewers for screening and extraction, and a librarian or information specialist for search development is strongly recommended. Set internal deadlines for each phase and build in buffer time for the screening stage, which consistently takes longer than teams expect.
Use reference management software from the start to handle deduplication and organize records across databases. Choose your screening platform before you run your searches, so you can import results directly rather than reformatting later. And keep a detailed log of every decision you make that deviates from your protocol, because reviewers and editors will ask about discrepancies between your registered protocol and your final report.
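Deduplication itself is usually handled by your reference manager, but the underlying logic is straightforward: match on DOI where available, and fall back to a normalized title. Here's a minimal Python sketch with hypothetical records; real exports need more normalization than this:

```python
# Sketch of record deduplication across database exports:
# prefer DOI matching, fall back to a normalized title key.
# Records are hypothetical; real exports need more normalization.
import re

def dedup_key(record: dict) -> str:
    """A matching key: lowercased DOI if present, else a normalized title."""
    doi = (record.get("doi") or "").lower().strip()
    if doi:
        return "doi:" + doi
    title = re.sub(r"[^a-z0-9]", "", (record.get("title") or "").lower())
    return "title:" + title

def deduplicate(records):
    """Keep the first record seen for each key, in original order."""
    seen, unique = set(), []
    for rec in records:
        key = dedup_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

records = [
    {"title": "Fish Oil and Mortality", "doi": "10.1000/xyz123"},
    {"title": "Fish oil and mortality.", "doi": "10.1000/XYZ123"},  # same DOI
    {"title": "Fish Oil & Mortality", "doi": ""},  # no DOI available
]
print(len(deduplicate(records)))
```

Whatever tool does the matching, spot-check a sample of removed duplicates by hand: the counts feed directly into your PRISMA flow diagram.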

