What Is a Phylogenetic Tree and How Do You Read One?

A phylogenetic tree is a diagram that maps out how different species, organisms, or genes are related through evolution. Think of it like a family tree, but instead of tracking grandparents and cousins, it tracks how living things descended from shared ancestors over millions of years. Scientists build these trees using DNA sequences, protein data, or physical traits to reconstruct the branching history of life.

How a Phylogenetic Tree Is Structured

Every phylogenetic tree is built from two basic elements: nodes and branches. Branches are the lines connecting everything together, and they represent the evolutionary path between ancestor and descendant. Nodes are the points where branches split apart.

There are two kinds of nodes. The tips of the tree (called external nodes or “leaves”) represent the actual species or organisms being compared. These are things we can observe and collect data from, whether living species or fossils. Internal nodes, where branches fork, represent hypothetical common ancestors. You can’t go out and find these ancestors in the wild. They’re inferred from the data. The internal node at the very base of the tree is the root, representing the most recent common ancestor shared by every organism on the diagram.
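
As a rough illustration, the node-and-branch structure described above can be modeled in a few lines of Python. This is only a sketch: the class name Node, the helper tip_names, and the three-species example are assumptions chosen for illustration, not part of any standard library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One point on the tree: a tip (leaf) or an inferred ancestor."""
    name: Optional[str] = None  # tips carry a name; inferred ancestors usually do not
    children: List["Node"] = field(default_factory=list)

    def is_tip(self) -> bool:
        # External nodes (leaves) have no descendants on the diagram.
        return not self.children


def tip_names(node: Node) -> List[str]:
    """Collect the names of every tip that descends from this node."""
    if node.is_tip():
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(tip_names(child))
    return names


# Illustrative three-species tree: ((human, chimp), gorilla)
human, chimp, gorilla = Node("human"), Node("chimp"), Node("gorilla")
ancestor = Node(children=[human, chimp])   # internal node: an inferred ancestor
root = Node(children=[ancestor, gorilla])  # root: common ancestor of all tips

print(tip_names(root))   # ['human', 'chimp', 'gorilla']
print(root.is_tip())     # False
```

Each parent-to-child link in the children lists plays the role of a branch; only the tips carry names because only they correspond to sampled organisms.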

Branches can carry different kinds of information depending on the type of tree. In a cladogram, branch lengths don’t mean anything specific. The diagram only shows which species are more closely related to each other. In a phylogram, branch length reflects the amount of evolutionary change, so a longer branch means more genetic or physical difference accumulated along that lineage. In a chronogram, branch lengths represent actual time, letting you see not just who’s related to whom but roughly when lineages split apart.
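
To make the role of branch lengths concrete, here is a small Python sketch that adds up the lengths along the path from the root to each tip. The tree, its labels, and its numbers are invented for illustration. In a phylogram the totals would be read as amounts of change; in a chronogram they would be read as elapsed time.

```python
from typing import Dict, Tuple

# Each node is (name, branch_length_to_parent, children).
Tree = Tuple[str, float, list]

# Hypothetical branch lengths, invented for illustration only.
tree: Tree = ("root", 0.0, [
    ("ancestor", 0.5, [
        ("A", 1.0, []),
        ("B", 3.0, []),
    ]),
    ("C", 2.0, []),
])


def root_to_tip(node: Tree, so_far: float = 0.0) -> Dict[str, float]:
    """Sum branch lengths along the path from the root to every tip."""
    name, length, children = node
    total = so_far + length
    if not children:
        return {name: total}
    distances: Dict[str, float] = {}
    for child in children:
        distances.update(root_to_tip(child, total))
    return distances


print(root_to_tip(tree))  # {'A': 1.5, 'B': 3.5, 'C': 2.0}
```

Here tip B sits at the end of the longest path, which in a phylogram would mean its lineage accumulated the most change. A cladogram would simply ignore these numbers.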

Rooted vs. Unrooted Trees

A rooted tree has a clearly defined starting point: the common ancestor at the base. From there, you can trace the path of evolution forward through time, following branches as they split into new lineages. This directionality is what makes rooted trees useful for understanding ancestry and the order in which groups diverged.

An unrooted tree, by contrast, shows how organisms are related to each other without specifying where the common ancestor sits. It tells you which species share more similarities and which are more distant, but it doesn’t indicate the direction of evolution. Scientists can convert an unrooted tree into a rooted one by adding an “outgroup,” a species known to fall outside the group being studied. The root is then placed on the branch that connects the outgroup to everything else, which anchors the tree’s base.
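
The outgroup idea can be sketched in Python by storing an unrooted tree as a plain undirected graph and then placing the root on the branch that leads to the chosen outgroup. The four-tip topology, the junction labels x and y, and the function names are all hypothetical.

```python
from typing import Dict, List, Optional, Union

Nested = Union[str, list]

# Unrooted tree as an undirected graph: tips A-D, unnamed junctions x and y.
# Hypothetical topology: A and B join at x, C and D join at y, x connects to y.
unrooted: Dict[str, List[str]] = {
    "A": ["x"], "B": ["x"], "C": ["y"], "D": ["y"],
    "x": ["A", "B", "y"],
    "y": ["C", "D", "x"],
}


def subtree(graph: Dict[str, List[str]], node: str, parent: Optional[str]) -> Nested:
    """Walk away from `parent`, turning the graph into nested lists."""
    neighbors = [n for n in graph[node] if n != parent]
    if not neighbors:
        return node  # a tip: nothing further away from the root
    return [subtree(graph, n, node) for n in neighbors]


def root_with_outgroup(graph: Dict[str, List[str]], outgroup: str) -> Nested:
    """Place the root on the branch leading to the outgroup tip."""
    (attachment,) = graph[outgroup]  # a tip touches exactly one junction
    return [outgroup, subtree(graph, attachment, outgroup)]


print(root_with_outgroup(unrooted, "D"))  # ['D', ['C', ['A', 'B']]]
```

Choosing a different outgroup from the same graph gives a different rooted tree, which is why the choice of outgroup matters.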

How Scientists Build These Trees

Early phylogenetic trees were built by comparing physical traits: bone structure, body shape, tooth patterns. Modern trees overwhelmingly rely on molecular data, especially DNA and protein sequences. By comparing the genetic sequences of different organisms and measuring how much they differ, researchers can estimate how closely related two lineages are. If the rate of sequence change can be calibrated (for example, with dated fossils), they can also estimate how long ago the lineages split from a common ancestor.

The computational methods behind tree-building fall into a few major categories. Distance-based methods calculate how genetically different each pair of organisms is and group the most similar ones together. These are fast but relatively simple. Maximum likelihood methods take a more sophisticated approach, testing millions of possible tree arrangements and selecting the one that best explains the observed genetic data given a model of how DNA changes over time. These are powerful but computationally expensive.
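
As a minimal sketch of the distance-based idea, the Python below measures the fraction of differing positions between aligned sequences and then repeatedly merges the two closest clusters. This particular procedure is UPGMA (average-linkage clustering), chosen here only because it is short. The text above does not name a specific algorithm, and real analyses often use other distance methods. The sequences are toy data invented for the example.

```python
from itertools import combinations
from typing import Dict


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    assert len(a) == len(b), "sequences must be aligned to equal length"
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs: Dict[str, str]):
    """Repeatedly merge the two closest clusters (average linkage)."""
    clusters = {name: 1 for name in seqs}  # cluster -> number of tips in it
    dist = {frozenset(p): p_distance(seqs[p[0]], seqs[p[1]])
            for p in combinations(seqs, 2)}
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda p: dist[frozenset(p)])
        merged = (a, b)
        na, nb = clusters.pop(a), clusters.pop(b)
        for other in clusters:
            # Size-weighted average of the distances to the two merged clusters.
            dist[frozenset((merged, other))] = (
                na * dist[frozenset((a, other))]
                + nb * dist[frozenset((b, other))]
            ) / (na + nb)
        clusters[merged] = na + nb
    return next(iter(clusters))


# Toy aligned sequences, invented for illustration only.
seqs = {
    "A": "AAAAAAAAAA",
    "B": "AAAAAAAAAT",
    "C": "AAAAAATTTT",
    "D": "TTTTTTTTTT",
}

print(upgma(seqs))  # ('D', ('C', ('A', 'B')))
```

The nested tuples show the grouping order: A and B are merged first because they differ at only one position, then C joins them, and D joins last.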

Bayesian inference is another widely used approach. It works by assigning a posterior probability to each possible tree, then using a sampling technique (typically Markov chain Monte Carlo) to explore the enormous space of potential trees and zero in on the most probable ones. Bayesian methods produce a probability score for each branching pattern, giving researchers a measure of confidence in the result. Studies have found Bayesian inference to be relatively robust even when the underlying assumptions about how sequences evolve aren’t perfectly met, which makes it a practical choice for real data, although the sampling itself can demand substantial computing time.
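
The probability score for each branching pattern is, in practice, usually estimated as the share of sampled trees that contain a given grouping. The Python sketch below shows only that final counting step, not the sampling itself. The four "sampled" trees are made up, and the function name clades_in is an assumption.

```python
from collections import Counter
from typing import FrozenSet, List


def clades_in(tree) -> List[FrozenSet[str]]:
    """List the set of tips under every internal node of a nested-tuple tree."""
    found: List[FrozenSet[str]] = []

    def visit(node) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(visit(child) for child in node))
        found.append(tips)
        return tips

    visit(tree)
    return found


# Hypothetical sample of trees, standing in for the output of a Bayesian run.
sampled_trees = [
    (("A", "B"), "C"),
    (("A", "B"), "C"),
    (("A", "C"), "B"),
    (("A", "B"), "C"),
]

counts = Counter(clade for tree in sampled_trees for clade in clades_in(tree))
support = {clade: n / len(sampled_trees) for clade, n in counts.items()}

print(support[frozenset({"A", "B"})])  # 0.75
print(support[frozenset({"A", "C"})])  # 0.25
```

In this toy sample, the grouping of A with B appears in three of four trees, so it would be reported with a support of 0.75.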

How to Read One

The single most useful skill when looking at a phylogenetic tree is finding the most recent common ancestor of any two species. To do this, pick two species at the tips of the tree and trace their lineages backward (toward the root) until the lines meet. The node where they converge is their most recent common ancestor. The closer that node is to the tips, the more recently those species diverged and the more closely related they are.
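
The tracing procedure above translates directly into code. In this Python sketch the tree is stored as a child-to-parent map. The internal node labels (n1, n2, root) are hypothetical placeholders, since inferred ancestors have no names of their own.

```python
from typing import Dict, List, Optional

# child -> parent; the root has no parent.
# Illustrative tree: (((human, chimp), gorilla), orangutan)
parent: Dict[str, Optional[str]] = {
    "root": None,
    "n2": "root", "orangutan": "root",
    "n1": "n2", "gorilla": "n2",
    "human": "n1", "chimp": "n1",
}


def path_to_root(node: str) -> List[str]:
    """Trace a lineage backward, from the given node to the root."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def mrca(a: str, b: str) -> str:
    """Return the first node at which the two backward paths meet."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes are not in the same tree")


print(mrca("human", "chimp"))      # n1
print(mrca("human", "gorilla"))    # n2
print(mrca("chimp", "orangutan"))  # root
```

The ancestor of human and chimp (n1) lies closer to the tips than the ancestor of human and gorilla (n2), which matches the rule that a shallower meeting point means a closer relationship.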

A key concept for interpreting trees is the clade, also called a monophyletic group. A clade includes a single ancestor and every one of its descendants, no exceptions. You can spot a clade by imagining you snip a single branch off the tree. Everything on that clipped branch forms a clade. If you’d need more than one cut to separate a group of organisms from the rest of the tree, that group is not a true clade. Groupings that fail this test are called paraphyletic (if they include the common ancestor but leave out some of its descendants) or polyphyletic (if the members come from separate branches and the group leaves out their common ancestor). Only clades reflect genuine evolutionary units, which is why biologists prefer to classify organisms into clades whenever possible.
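
The "one snip" test can be expressed in code: a group is a clade only if some single node has exactly that group as its set of tips. The Python sketch below uses a deliberately simplified four-tip tree in which crocodiles sit closer to birds than to lizards. The function names are assumptions for illustration.

```python
from typing import FrozenSet, Iterable, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]

# Simplified illustrative tree: ((lizard, (crocodile, bird)), mouse)
tree: Tree = (("lizard", ("crocodile", "bird")), "mouse")


def tips(node: Tree) -> FrozenSet[str]:
    """Every tip that descends from this node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def all_clades(node: Tree) -> Set[FrozenSet[str]]:
    """Each node defines one clade: the set of tips below it."""
    found = {tips(node)}
    if not isinstance(node, str):
        for child in node:
            found |= all_clades(child)
    return found


def is_clade(group: Iterable[str], tree: Tree) -> bool:
    """True if one snip of a single branch would remove exactly this group."""
    return frozenset(group) in all_clades(tree)


print(is_clade({"crocodile", "bird"}, tree))            # True
print(is_clade({"lizard", "crocodile"}, tree))          # False
print(is_clade({"lizard", "crocodile", "bird"}, tree))  # True
```

On this tree, grouping lizards with crocodiles while leaving out birds fails the test, because no single branch carries those two tips and nothing else.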

Tracking Disease Outbreaks

Phylogenetic trees have become a frontline tool in public health. When a pathogen spreads through a population, whole-genome sequencing of samples from different patients lets researchers build a tree of the virus or bacterium as it evolves in real time. The shape of that tree reveals how the disease is spreading.

In tuberculosis outbreaks, for example, researchers have used phylogenetic tree shapes to distinguish between two very different transmission patterns. In one outbreak driven by a “super-spreader” (a single individual infecting many others), the tree had a distinctive shape that classifiers correctly identified 75% of the time. A second outbreak with more even, wave-like transmission produced a different tree shape, correctly classified as homogeneous about 76% of the time. Both results matched what traditional epidemiological investigation had found, but the tree-shape analysis reached the same conclusions using genetic data alone. This approach became widely visible during the COVID-19 pandemic, when phylogenetic trees helped track how new variants emerged and spread across borders.
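
One simple way to turn "tree shape" into a number is an imbalance score such as the Colless index, which adds up, over every branching point, the difference in tip counts between the two sides. The Python sketch below is offered only as an example of a shape summary. The text above does not say which features the classifiers in the tuberculosis work actually used, and the two small trees are invented.

```python
from typing import Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def colless(node: Tree) -> Tuple[int, int]:
    """Return (number of tips, Colless imbalance) for a binary tree."""
    if isinstance(node, str):
        return 1, 0
    left, right = node
    n_left, imb_left = colless(left)
    n_right, imb_right = colless(right)
    # Imbalance at this node is the difference in tip counts between its sides.
    return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)


balanced = (("a", "b"), ("c", "d"))  # splits evenly at every node
ladder = ((("a", "b"), "c"), "d")    # one lineage keeps branching off singles

print(colless(balanced)[1])  # 0
print(colless(ladder)[1])    # 3
```

A perfectly even tree scores zero and a ladder-like tree scores high, so a single number like this can serve as one input among several when comparing outbreak trees.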

Discovering New Drugs

One of the more surprising uses of phylogenetic trees is in the search for new medicines. The logic is straightforward: if a species produces a useful compound, its close relatives on the tree may have inherited the same ability from a shared ancestor.

This strategy proved its value in the search for paclitaxel, a cancer-fighting compound originally found in Pacific yew bark. Harvesting the bark was unsustainable, so researchers looked at species on neighboring branches of the yew’s phylogenetic tree. That search eventually led to microbial plant symbionts that could produce the compound without destroying the trees. The same principle applies to venom. Venom production is a heritable trait often shared among closely related species, so by mapping known venomous species onto a phylogenetic tree, biologists can predict which unstudied species likely produce venom too. When this approach was applied to fish, researchers discovered that more than 1,200 species not previously known to be venomous probably are. Similar phylogenetic strategies are guiding the search for medically useful compounds in snake, lizard, and snail venom.

Why the Tree Metaphor Has Limits

Phylogenetic trees work well for organisms that pass genes from parent to offspring in a straight line. But evolution doesn’t always cooperate. Bacteria frequently swap genes sideways between unrelated species through a process called horizontal gene transfer. When that happens, no single tree can capture the full picture, and scientists sometimes use network diagrams instead. Hybridization between species (common in plants) creates similar complications. The tree of life, in other words, is more like a tangled bush in some places. Phylogenetic trees remain the best tool we have for visualizing evolutionary relationships, but they’re a simplification of a messier reality.