How to Make a Phylogenetic Tree: Step-by-Step

Building a phylogenetic tree follows a consistent pipeline: collect sequences, align them, choose an evolutionary model, construct the tree, then evaluate and visualize it. Whether you’re working with a handful of genes or hundreds of genomes, these core steps stay the same. The differences lie in which software and methods you pick at each stage.

Step 1: Collect Your Sequence Data

Every phylogenetic tree starts with molecular sequences, typically DNA, RNA, or protein. You’ll usually download these from public databases like NCBI’s GenBank or Ensembl. The key is selecting orthologous sequences, meaning genes in different species that trace back to a single gene in their common ancestor and diverged through speciation, rather than paralogs (copies that arose through gene duplication). Mixing up orthologs and paralogs is one of the fastest ways to produce a misleading tree, because the tree then reflects the history of the duplication instead of the history of the species.

Choose sequences that are long enough to contain useful variation but conserved enough that they can be aligned. For broad evolutionary comparisons spanning different kingdoms of life, ribosomal RNA genes (like 16S for bacteria or 18S for eukaryotes) are a standard choice because they evolve slowly. For closely related species, faster-evolving genes or even whole mitochondrial genomes give better resolution.

Step 2: Align the Sequences

Multiple sequence alignment arranges your sequences so that equivalent positions line up in columns. This step is critical because the tree-building algorithm interprets each column as a set of characters that evolved from the same ancestral position. A bad alignment feeds the algorithm false signals.

Three tools dominate this step: MAFFT, MUSCLE, and Clustal Omega. In benchmark tests across multiple reference datasets, MAFFT consistently ranks among the most accurate alignment programs while remaining fast enough to handle large datasets. It completed alignments of all benchmark sets in under 2.5 hours, making it the fastest of the top-accuracy tools. MUSCLE and ClustalW (the older predecessor of Clustal Omega) are faster and use less memory, but ClustalW in particular produces the lowest accuracy on full-length sequences in nearly all test cases. MUSCLE falls somewhere in between, scoring above average on some datasets and below on others. For most projects, MAFFT is the safest default.

After alignment, trim poorly aligned regions. Columns full of gaps or ambiguous matches add noise. Tools like trimAl or Gblocks can automate this, removing positions where the alignment is unreliable.
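To make the idea of trimming concrete, here is a minimal sketch that drops any column where more than a chosen fraction of sequences has a gap. It is a plain gap-threshold filter, not the algorithm trimAl or Gblocks actually uses, and the function name, the 0.5 threshold, and the toy alignment are all illustrative assumptions.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    Illustrative only; real trimmers use more refined criteria.
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")

    # Keep the indices of columns that are not dominated by gaps.
    keep = [
        i for i in range(length)
        if sum(s[i] in "-?" for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment: the fourth column is a gap in three of four sequences.
toy = {
    "sp1": "ATG-CA",
    "sp2": "ATG-CA",
    "sp3": "A-GTCA",
    "sp4": "ATG-C-",
}
trimmed = trim_gappy_columns(toy)
for name, seq in trimmed.items():
    print(name, seq)
```

Only the gap-heavy fourth column is removed here; columns with a single gap survive because they fall under the threshold.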

Step 3: Select a Substitution Model

A substitution model describes the rules of how nucleotides or amino acids change over time. Choosing the right one matters because it shapes how the algorithm calculates evolutionary distances between sequences.

The simplest DNA model, Jukes-Cantor, assumes all four nucleotides occur at equal frequencies and that every type of substitution happens at the same rate. That’s rarely realistic. The Kimura 2-parameter model improves on this by recognizing that transitions (changes between chemically similar bases, like A to G) happen more frequently than transversions (changes between dissimilar bases, like A to T). The most flexible standard model, GTR (General Time Reversible), allows a different rate for every possible substitution type and lets each nucleotide have its own frequency. Most real datasets fit somewhere between Kimura and GTR.
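The difference between these models is easiest to see in the distances they produce. The sketch below computes the raw proportion of differing sites, then the standard Jukes-Cantor and Kimura 2-parameter corrections, for one pair of aligned sequences. The function names and the toy sequences are illustrative, and skipping gapped or ambiguous sites is an assumption of this sketch rather than a rule.

```python
import math

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def site_differences(seq1, seq2):
    """Return (P, Q): proportions of transition and transversion sites.

    Sites where either sequence is not A, C, G, or T are skipped.
    """
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            if (a, b) in TRANSITIONS:
                transitions += 1
            else:
                transversions += 1
    return transitions / compared, transversions / compared


def p_distance(seq1, seq2):
    P, Q = site_differences(seq1, seq2)
    return P + Q


def jukes_cantor(seq1, seq2):
    # d = -3/4 * ln(1 - 4p/3), one rate for every substitution type
    p = p_distance(seq1, seq2)
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(seq1, seq2):
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), separate transition rate
    P, Q = site_differences(seq1, seq2)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Toy pair: 20 sites, 2 transitions (A>G, C>T) and 1 transversion (G>T).
s1 = "ACGTACGTACGTACGTACGT"
s2 = "GTTTACGTACGTACGTACGT"
print(p_distance(s1, s2), jukes_cantor(s1, s2), kimura_2p(s1, s2))
```

Both corrections come out larger than the raw proportion because they account for sites that may have changed more than once.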

You don’t need to guess which model fits your data. Automated tools test dozens of models against your alignment and rank them statistically. ModelTest-NG correctly identifies the true model for 81% of simulated DNA datasets and runs up to 510 times faster than its predecessor, jModelTest. ModelFinder, built into the popular tree-building program IQ-TREE, is another strong option, though it identified the correct DNA model in 70% of simulations compared to ModelTest-NG’s 81%. For protein data, the two tools perform similarly. If you’re already using IQ-TREE, ModelFinder’s built-in integration makes the workflow seamless.

Step 4: Build the Tree

This is where you choose your tree-building method. The four main approaches differ in speed, accuracy, and the type of information they use.

Neighbor-Joining

Neighbor-joining (NJ) is a distance-based method. It converts your alignment into a matrix of pairwise distances, then clusters sequences by finding the closest neighbors iteratively. It’s fast, intuitive, and works well when you need a quick look at relationships. Interestingly, simulation studies have shown that NJ using simple p-distances (the raw proportion of differing sites) often recovers the true tree as accurately as, or better than, more complex methods. It’s a strong choice for exploratory analysis or when computational resources are limited.
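One round of the clustering can be sketched as follows. Neighbor-joining does not simply join the pair with the smallest distance; it scores each pair with a criterion, usually written Q, that also subtracts how far each member is from everything else. The function name and the five-taxon distance matrix are illustrative, and a full implementation would repeat this step on a reduced matrix until the tree is resolved.

```python
def nj_pick_pair(names, dist):
    """Pick the pair to join in one neighbor-joining step.

    names: list of taxon names; dist: dict of dicts of pairwise distances.
    Returns ((a, b), (branch_a, branch_b), q_value).
    """
    n = len(names)
    # Total distance from each taxon to all others.
    totals = {a: sum(dist[a][b] for b in names if b != a) for a in names}

    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Q(a, b) = (n - 2) * d(a, b) - total(a) - total(b)
            q = (n - 2) * dist[a][b] - totals[a] - totals[b]
            if best is None or q < best[0]:
                best = (q, a, b)

    q, a, b = best
    # Branch lengths from a and b to the new internal node.
    branch_a = dist[a][b] / 2 + (totals[a] - totals[b]) / (2 * (n - 2))
    branch_b = dist[a][b] - branch_a
    return (a, b), (branch_a, branch_b), q


names = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
dist = {r: dict(zip(names, row)) for r, row in zip(names, matrix)}
print(nj_pick_pair(names, dist))  # (('a', 'b'), (2.0, 3.0), -50)
```

In this toy matrix the closest pair by raw distance is d and e (distance 3), yet a and b are joined first because their Q score is lower.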

Maximum Parsimony

Maximum parsimony finds the tree that requires the fewest total evolutionary changes to explain the data. It doesn’t use a substitution model, which makes it conceptually simple. The downside is that it can be misled when some lineages evolve much faster than others, a problem called long-branch attraction. It’s less commonly used today for final analyses but still has a role in certain contexts.
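Counting changes on a given tree is commonly done with Fitch's algorithm, sketched below for a single alignment column. This is one standard way to score a tree under parsimony, not necessarily what any particular program does internally; the nested-tuple tree representation and the toy column are assumptions made for illustration.

```python
def fitch(tree, states):
    """Return (possible_states, changes) for one column on a rooted binary tree.

    tree: a leaf name (str) or a (left, right) tuple of subtrees.
    states: dict mapping leaf name -> observed character at this column.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0

    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)

    shared = left_set & right_set
    if shared:
        # Children can agree on a state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: one substitution must have happened.
    return left_set | right_set, left_cost + right_cost + 1


column = {"A": "G", "B": "G", "C": "A", "D": "A"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(fitch(tree_1, column)[1])  # 1
print(fitch(tree_2, column)[1])  # 2
```

Summing this score over every column and comparing candidate trees gives the parsimony ranking; here the first tree explains the column with fewer changes.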

Maximum Likelihood

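Maximum likelihood (ML) asks: given my substitution model, which tree makes the observed data most probable? This approach is statistically rigorous and widely considered the standard for published phylogenies. Because the number of possible topologies grows explosively with the number of sequences, ML programs cannot score every tree. Instead they use heuristic searches that start from one or more candidate trees, try rearrangements, and keep the ones that improve the likelihood. The tradeoff is computation time, which grows substantially with more sequences. IQ-TREE and RAxML are the two most widely used ML programs, both capable of handling hundreds or thousands of sequences.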
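To see why computation time climbs so steeply, it helps to count the candidate trees. The sketch below uses the standard result that n sequences can be connected in (2n - 5)!! distinct unrooted binary trees, where !! means the product of every odd number up to that value. The function name is illustrative.

```python
def count_unrooted_trees(n_taxa):
    """Number of distinct unrooted binary tree topologies for n_taxa tips."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., (2n - 5).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Four sequences allow only 3 topologies, ten already allow about two million, and twenty exceed 10^20, which is why practical ML software searches tree space with heuristics rather than enumerating it.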

Bayesian Inference

Bayesian inference works similarly to ML but frames the question differently: given the data and a prior expectation, what is the probability of each tree? It produces posterior probabilities for every branch, which many researchers find more intuitive than bootstrap values. The main tool here is MrBayes, though BEAST is preferred for datasets where you want to estimate divergence times. Bayesian analyses typically take longer than ML because they sample thousands of trees to approximate the probability distribution.
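The link between sampled trees and branch probabilities can be sketched in a few lines: the posterior probability of a clade is approximated by the fraction of sampled trees that contain it. The representation of each tree as a set of clades, the function name, and the 25% burn-in (discarding early samples before the sampler settles, a common convention) are assumptions of this sketch, not the internals of MrBayes or BEAST.

```python
from collections import Counter


def clade_posteriors(sampled_trees, burnin_fraction=0.25):
    """Estimate clade posterior probabilities from sampled trees.

    sampled_trees: list of trees, each given as a set of clades,
    where a clade is a frozenset of tip names.
    """
    start = int(len(sampled_trees) * burnin_fraction)
    kept = sampled_trees[start:]  # discard the burn-in samples
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


AB, CD = frozenset("AB"), frozenset("CD")
AC, BD = frozenset("AC"), frozenset("BD")
tree_1 = {AB, CD}  # ((A,B),(C,D))
tree_2 = {AC, BD}  # ((A,C),(B,D))

# Eight toy samples; the first two are dropped as burn-in.
samples = [tree_2, tree_2, tree_1, tree_1, tree_1, tree_2, tree_1, tree_1]
posteriors = clade_posteriors(samples)
print(round(posteriors[AB], 3))  # 0.833
```

A real run would use thousands of samples, but the arithmetic is the same: clade A+B appears in five of the six retained trees, so its estimated posterior is about 0.83.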

Step 5: Root the Tree

An unrooted tree shows relationships but doesn’t indicate direction of evolution. Rooting establishes which point on the tree represents the most recent common ancestor of all the sequences, giving the tree a time direction from past to present.

The most reliable approach is outgroup rooting. You include a species you already know diverged before the group you’re studying. For example, if you’re building a tree of mammals, you might include a reptile as an outgroup. The root falls on the branch connecting the outgroup to everything else. Good outgroups should have a relatively low substitution rate and similar base composition to your ingroup. Using multiple outgroup species from the same sister group, rather than a single distantly related species, reduces the risk of long-branch attraction pulling the outgroup to the wrong position.

When no suitable outgroup exists, midpoint rooting places the root halfway between the two most distant tips on the tree. This works well when all lineages evolve at roughly equal rates, but it can misplace the root if some branches evolve much faster than others. Midpoint rooting is commonly used for viral datasets, where outgroups are often unknown or too divergent to align reliably.
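Midpoint rooting can be sketched as: find the two most distant tips, then walk along the path between them until half the distance is covered. The adjacency-dictionary representation, the function names, and the toy tree with internal nodes X and Y are illustrative assumptions.

```python
def distances_from(adj, start):
    """Path length from start to every node, plus each node's predecessor."""
    dist, prev, stack = {start: 0.0}, {start: None}, [start]
    while stack:
        node = stack.pop()
        for neighbor, length in adj[node].items():
            if neighbor not in dist:
                dist[neighbor] = dist[node] + length
                prev[neighbor] = node
                stack.append(neighbor)
    return dist, prev


def midpoint_root(adj):
    """Return (node_u, node_v, offset): the root lies on edge u-v, offset from u."""
    tips = [n for n in adj if len(adj[n]) == 1]

    # Find the most distant pair of tips.
    best = None
    for tip in tips:
        dist, prev = distances_from(adj, tip)
        far = max(tips, key=lambda t: dist[t])
        if best is None or dist[far] > best[0]:
            best = (dist[far], far, dist, prev)
    total, far, dist, prev = best

    # Walk back from the far tip until the midpoint lies on the current edge.
    half = total / 2
    node = far
    while dist[prev[node]] > half:
        node = prev[node]
    parent = prev[node]
    return parent, node, half - dist[parent]


# Unrooted toy tree: tips A-D, internal nodes X and Y, with branch lengths.
adj = {
    "A": {"X": 1.0},
    "B": {"X": 2.0},
    "C": {"Y": 6.0},
    "D": {"Y": 2.0},
    "X": {"A": 1.0, "B": 2.0, "Y": 1.0},
    "Y": {"C": 6.0, "D": 2.0, "X": 1.0},
}
print(midpoint_root(adj))  # ('Y', 'C', 1.5)
```

Here B and C are the most distant tips (9.0 apart), so the root lands on the long branch leading to C. If C were simply a fast-evolving lineage rather than an early-diverging one, this placement would be wrong, which is the failure mode described above.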

Step 6: Evaluate Branch Support

A tree without support values is just a hypothesis with no indication of confidence. Bootstrapping is the most common evaluation method. It resamples columns from your alignment with replacement, rebuilds the tree hundreds or thousands of times, and reports how often each branch appears. A bootstrap value of 70% or higher is generally considered moderate support; above 90% is strong.
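The resampling step itself is simple, as the following sketch shows: each bootstrap replicate is a new alignment of the same length, built by drawing columns at random with replacement. The function name and toy alignment are illustrative; in practice the tree-building program performs this internally and then rebuilds a tree from each replicate.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement.

    alignment: dict mapping sequence name -> aligned sequence.
    rng: a random.Random instance (seed it for reproducibility).
    """
    length = len(next(iter(alignment.values())))
    # The same sampled column indices are applied to every sequence,
    # so each replicate column is an intact column of the original.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


aln = {
    "sp1": "ACGTAC",
    "sp2": "ACGTTC",
    "sp3": "ATGTTC",
}
rng = random.Random(42)
replicates = [bootstrap_alignment(aln, rng) for _ in range(100)]
print(replicates[0])
```

The support for a branch is then the percentage of replicate trees in which that branch appears.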

Bayesian analyses produce posterior probabilities instead, which range from 0 to 1. Values above 0.95 are typically considered well-supported. Posterior probabilities and bootstrap values are not interchangeable: posterior probabilities tend to run higher, so a branch with 0.95 posterior probability does not necessarily correspond to 95% bootstrap support.

Step 7: Visualize the Tree

The raw output of most tree-building programs is a text file in Newick or Nexus format. Newick stores the tree as nested parentheses with branch lengths, terminated by a semicolon, something like (A:0.1,B:0.2,(C:0.3,D:0.4):0.5); where each number after a colon is the length of the branch leading to that tip or group. Nexus format wraps this same structure inside a block of additional metadata and commands, making it compatible with programs like PAUP and MrBayes. Both formats are plain text and can be opened in most tree viewers.
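To show how the nested parentheses map onto a tree, here is a minimal Newick reader. It is a sketch that handles only plain names and branch lengths (no quoted labels, comments, or whitespace), and the function names and dictionary layout are illustrative. Newick strings end with a semicolon, so the example string below includes one.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts.

    Each node is {"name": str, "length": float or None, "children": list}.
    """
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick tree must end with a semicolon")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"

        # Optional node name.
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]

        # Optional branch length after a colon.
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected character at position {pos}")
    return root


def leaf_names(node):
    """List tip names in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


tree = parse_newick("(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);")
print(leaf_names(tree))               # ['A', 'B', 'C', 'D']
print(tree["children"][2]["length"])  # 0.5
```

For real files, an established library is a safer choice than a hand-written parser, since Newick has several dialects.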

For quick, local visualization, FigTree is a free desktop application that handles most basic tasks: displaying branch lengths, coloring clades, and exporting figures. MEGA is another desktop option that combines tree building and visualization in one interface, useful for beginners who want an all-in-one workflow.

For publication-quality figures, Interactive Tree of Life (iTOL) is a web-based tool that handles complex annotations. You can drag and drop alignment files onto trees, add colored metadata rings for traits like habitat or host species, display timescales alongside branches, and manually annotate trees with shapes and labels directly in the browser. iTOL supports text labels, MEME motifs, and dataset legends, and exports in vector formats suitable for journals. Its spreadsheet-like editing interface updates the tree visualization in real time as you modify datasets.

Recommended Software Combinations

If you’re new to phylogenetics and want a graphical interface, MEGA handles sequence alignment, model selection, tree building (NJ, parsimony, and ML), and visualization in a single program. It’s the lowest barrier to entry.

For a command-line workflow with more flexibility, a common pipeline looks like this:

Alignment: MAFFT for accuracy across diverse datasets
Trimming: trimAl to remove unreliable alignment columns
Model selection and tree building: IQ-TREE, which integrates ModelFinder and ML tree inference in one run
Visualization: iTOL for annotated, publication-ready figures, or FigTree for quick local viewing
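The first three stages of that pipeline could be scripted roughly as follows. This is a sketch under several assumptions: the three programs are installed and on your PATH, the IQ-TREE 2 executable is named iqtree2 (it may differ on your system), and the file names are placeholders. The -B 1000 option requests IQ-TREE's ultrafast bootstrap, which is an approximation and not the same as the standard bootstrap.

```python
import subprocess


def build_pipeline_commands(fasta, prefix):
    """Return a list of (argv, stdout_path) steps; stdout_path may be None."""
    aligned = f"{prefix}.aln.fasta"
    trimmed = f"{prefix}.trimmed.fasta"
    return [
        # MAFFT writes the alignment to standard output.
        (["mafft", "--auto", fasta], aligned),
        # trimAl with its automated trimming heuristic.
        (["trimal", "-in", aligned, "-out", trimmed, "-automated1"], None),
        # IQ-TREE: ModelFinder (-m MFP) plus ML tree and 1000 ultrafast bootstraps.
        (["iqtree2", "-s", trimmed, "-m", "MFP", "-B", "1000", "--prefix", prefix], None),
    ]


def run_pipeline(fasta, prefix):
    """Run each step in order, stopping at the first failure."""
    for argv, stdout_path in build_pipeline_commands(fasta, prefix):
        if stdout_path is not None:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, stdout=out, check=True)
        else:
            subprocess.run(argv, check=True)


# Show the commands for a hypothetical input file without running them.
for argv, _ in build_pipeline_commands("sequences.fasta", "mytree"):
    print(" ".join(argv))
```

Calling run_pipeline("sequences.fasta", "mytree") would execute the steps; IQ-TREE then writes its best tree in Newick format to a file named after the prefix with a .treefile extension, ready to load into FigTree or iTOL.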

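For Bayesian analysis, MrBayes is the standard for topology estimation, while BEAST is better suited when your goal includes estimating when lineages diverged. MrBayes reads Nexus-formatted alignments directly. BEAST runs from an XML configuration file, which you normally generate with its companion program BEAUti after importing a Nexus alignment. Both produce trees that can be visualized in FigTree or iTOL.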

In R, the packages ape, phangorn, and ggtree cover the full pipeline from distance matrices through ML and parsimony trees to detailed visualizations. This route gives you the most control over every parameter and integrates naturally into reproducible analysis scripts.