A phylogeny is the evolutionary history of a group of organisms, showing how they are related through common ancestors. You’ve almost certainly seen one before, even if you didn’t know the name: it looks like a branching tree diagram, with lines splitting apart to show where species diverged from one another over time. The term itself was coined in 1866 by the German biologist Ernst Haeckel, and phylogenies have since become one of the most fundamental tools in biology.
How a Phylogenetic Tree Works
A phylogenetic tree is read much like a family tree. The base of the tree, called the root, represents the oldest common ancestor of every organism shown. The tips of the branches represent living (or sometimes extinct) species. As you move from root to tips, you move forward in time.
Where a branch splits into two, that point is called a node. Each node represents a speciation event, a moment in the past when one lineage diverged into two. If two species share a recent node, they are closely related. If you have to trace much farther back toward the root to find their shared node, they are more distantly related. For example, in a tree showing species A, B, and C, if A and B share a more recent common ancestor than either shares with C, then A and B are considered more closely related to each other.
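The idea of tracing back to a shared node can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the toy tree, the node names "n1" and "root", and the helper functions are all hypothetical, chosen to match the A, B, C example above.

```python
# Hypothetical toy tree: each node maps to its parent; the root has no parent.
PARENT = {
    "A": "n1",
    "B": "n1",
    "n1": "root",
    "C": "root",
    "root": None,
}

def ancestors(node):
    """Return the path from a node up to the root, starting with the node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def mrca(x, y):
    """Return the most recent common ancestor (shared node) of two tips."""
    seen = set(ancestors(x))
    for node in ancestors(y):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")

def depth(node):
    """Number of branching steps between the root and a node."""
    return len(ancestors(node)) - 1

print(mrca("A", "B"), depth(mrca("A", "B")))  # n1 1
print(mrca("A", "C"), depth(mrca("A", "C")))  # root 0
```

The shared node of A and B sits one step above the root, while the shared node of A and C is the root itself, so A and B are the more closely related pair.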
A complete branch of the tree that includes a common ancestor and every one of its descendants is called a clade. Clades are the building blocks of modern evolutionary classification: mammals form a clade, birds form a clade, and flowering plants form a clade. Each represents a natural grouping defined by shared ancestry rather than by superficial similarity.
Phylogeny vs. Taxonomy
Taxonomy is the practice of naming and categorizing organisms into ranks like kingdom, phylum, and species. Phylogeny is the map of how those organisms are actually related through evolution. The two often overlap, but not always. Traditional taxonomy sometimes grouped organisms by shared physical features that turned out to be coincidental rather than inherited from a common ancestor. Bats and birds both fly, for instance, but phylogeny shows their wings evolved independently.
Modern classification increasingly tries to reflect phylogeny, so that every named group corresponds to a true clade. When a named group includes a common ancestor but leaves out some of its descendants, biologists call it paraphyletic. “Reptiles” in the traditional sense are a classic example: the group excludes birds, even though birds descend from the same ancestor as crocodiles and lizards. A group that lumps together organisms from entirely separate lineages, with no single shared ancestor, is called polyphyletic. These distinctions matter because only a clade (a monophyletic group) tells you something reliable about shared biology.
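One way to make the clade test concrete is to check whether a named group contains every tip of the smallest clade that holds all of its members. The sketch below assumes a deliberately simplified, hypothetical topology (turtles and many other lineages are left out), and the function names are invented for illustration. It only distinguishes clades from non-clades; telling paraphyletic from polyphyletic groups needs more information than this sketch uses.

```python
# Hypothetical, heavily simplified tree: each internal node maps to its children.
CHILDREN = {
    "root": ["mammals", "sauropsids"],
    "sauropsids": ["lizards", "archosaurs"],
    "archosaurs": ["crocodiles", "birds"],
}

def all_nodes():
    """Every node in the tree, internal nodes and tips alike."""
    nodes = set(CHILDREN)
    for kids in CHILDREN.values():
        nodes.update(kids)
    return nodes

def tips_under(node):
    """Return the set of tips descending from a node (a tip returns itself)."""
    kids = CHILDREN.get(node, [])
    if not kids:
        return {node}
    result = set()
    for kid in kids:
        result |= tips_under(kid)
    return result

def smallest_clade_containing(group):
    """Tips of the smallest clade that contains every member of the group."""
    group = set(group)
    containing = [tips_under(n) for n in all_nodes() if group <= tips_under(n)]
    return min(containing, key=len)

def missing_descendants(group):
    """Tips that a true clade would include but the named group leaves out."""
    return smallest_clade_containing(group) - set(group)

def is_monophyletic(group):
    return not missing_descendants(group)

print(is_monophyletic({"crocodiles", "birds"}))        # True
print(missing_descendants({"lizards", "crocodiles"}))  # {'birds'}
```

In this toy tree, a "reptiles" group made of lizards and crocodiles fails the test because birds are missing, which mirrors the traditional grouping described above.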
How Scientists Build Phylogenies
Biologists originally inferred evolutionary relationships by comparing physical features: bone structure, tooth shape, leaf arrangement. Morphological data still play a role, especially for fossils, but they have a significant limitation: some traits are hard to define precisely, and different researchers can interpret the same structure differently.
DNA sequencing has largely taken over. Every position in a DNA sequence is a character with four possible states (A, C, G, and T), so even a short stretch of DNA supplies hundreds of data points. Those states are discrete and far less open to interpretation than the shape of a bone or the color of a feather, and the data translate directly into numbers for mathematical analysis. Protein sequences are still used in some contexts, but DNA is often preferred because it captures mutations that don’t change the protein, giving a more detailed picture of evolutionary change.
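To show how sequence positions turn into numbers, here is a small sketch that treats each aligned position as a character and computes the proportion of positions at which two sequences differ. The sequences and species names are invented for illustration, and the sketch assumes the sequences are already aligned to the same length with no gaps.

```python
from itertools import combinations

# Hypothetical aligned sequences (same length, no gaps), invented for illustration.
ALIGNMENT = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTTC",
    "species_C": "ACCTACGATC",
}

def variable_sites(alignment):
    """Positions (0-based) where at least two sequences have different states."""
    columns = zip(*alignment.values())
    return [i for i, col in enumerate(columns) if len(set(col)) > 1]

def p_distance(seq1, seq2):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)

def pairwise_distances(alignment):
    """Distance for every pair of sequences in the alignment."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(alignment, 2)
    }

print(variable_sites(ALIGNMENT))  # [2, 7, 8]
for pair, distance in pairwise_distances(ALIGNMENT).items():
    print(pair, distance)
```

Real analyses usually correct these raw proportions with a substitution model, since the same position can mutate more than once; the uncorrected value is used here only to keep the example short.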
Once researchers have their sequences, they align them and apply tree-building algorithms. One common family of approaches, known as distance-based methods, calculates how genetically distant each pair of species is and groups the closest pairs together first. Another approach, called maximum parsimony, looks for the tree that requires the fewest total mutations to explain the observed differences. Both methods have trade-offs, and biologists often compare results from multiple approaches before settling on a tree.
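The parsimony idea can be sketched with Fitch's counting procedure, which finds the minimum number of state changes a given rooted binary tree needs at each site. The four sequences and the names sp1 to sp4 below are hypothetical, and trees are written as nested tuples purely for convenience; real software searches far more trees than the three compared here.

```python
# Hypothetical aligned sequences for four species, invented for illustration.
SEQS = {"sp1": "ACGT", "sp2": "ACGA", "sp3": "TCGA", "sp4": "TCCA"}

def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    A tree is either a tip name (str) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state: no extra change needed here.
        return common, left_cost + right_cost
    # No shared state: at least one change must occur at this node.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, seqs):
    """Total minimum number of changes across all sites for one tree."""
    length = len(next(iter(seqs.values())))
    total = 0
    for i in range(length):
        states = {name: seq[i] for name, seq in seqs.items()}
        total += fitch_site(tree, states)[1]
    return total

# The three possible groupings of four tips into two pairs.
CANDIDATES = {
    "(sp1,sp2),(sp3,sp4)": (("sp1", "sp2"), ("sp3", "sp4")),
    "(sp1,sp3),(sp2,sp4)": (("sp1", "sp3"), ("sp2", "sp4")),
    "(sp1,sp4),(sp2,sp3)": (("sp1", "sp4"), ("sp2", "sp3")),
}

for label, tree in CANDIDATES.items():
    print(label, parsimony_score(tree, SEQS))
# (sp1,sp2),(sp3,sp4) 3
# (sp1,sp3),(sp2,sp4) 4
# (sp1,sp4),(sp2,sp3) 4
```

Under these assumptions the first tree needs the fewest changes, so maximum parsimony would prefer it over the other two.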
Why Phylogenies Matter Outside the Lab
Phylogenies are far more than academic diagrams. One of their most powerful applications is tracking infectious disease. During outbreaks of foodborne illness, public health officials sequence the pathogen’s DNA and use phylogenetic analysis to link cases to a common source. A network called GenomeTrakr, for example, connects sequencing laboratories that share genomic data on foodborne pathogens, helping investigators quickly match clinical samples to contaminated products. In a 2012 outbreak of Salmonella in the United States, standard lab typing couldn’t distinguish outbreak strains from background strains. Whole-genome sequencing and phylogenetic analysis traced the infection back to a fish processing plant in India.
The same logic applies to viruses. Phylogenetic analysis of HIV has revealed how specific viral lineages are selected during mother-to-child transmission, and studies of Zika virus identified mutations in certain lineages that increased the virus’s ability to interact with human proteins involved in brain development. Tracking how influenza and other viruses evolve under pressure from antiviral drugs also depends on building and reading phylogenies.
More broadly, the shift from relying solely on patient symptoms to using molecular epidemiology has improved the ability to trace pathogens to their origins, interrupt transmission chains, and select appropriate treatments.
The Big Picture: The Tree of Life
The most ambitious phylogeny is the tree of life itself. In 1977, Carl Woese used sequences of ribosomal RNA, a molecule found in all living cells, to reveal an entirely new domain of life, the Archaea, which had been hidden among bacteria. His work established three domains: Bacteria, Archaea, and Eukarya (the group that includes animals, plants, fungi, and protists).
That framework still holds, though its fine structure is actively debated. Some recent phylogenetic analyses suggest that eukaryotes actually branch from within the Archaea, which would technically make the tree a two-domain system rather than three. Others maintain that Archaea and Eukarya are sister groups, each other’s closest relatives, which preserves the three-domain picture Woese originally proposed. The consensus, as one recent review put it, is that the living world is still best represented by the trinity of Bacteria, Archaea, and Eukarya, even as researchers continue refining where exactly the branches connect.
Common Mistakes When Reading a Tree
The most frequent error is treating the tips of a tree as a ladder of progress, reading left to right as if the species on the right are “more evolved.” Every living species at the tips of a phylogeny has been evolving for exactly the same amount of time since their shared ancestor. Humans are not more evolved than mushrooms; they simply took a different path.
Another common mistake is assuming that two species placed next to each other on the tips must be closely related. What actually determines relatedness is how recently they share a common ancestor (a node), not their position along the edge of the diagram. Branches can be rotated around any node without changing the relationships the tree represents, the same way you could swap the left and right sides of a family tree and all the parent-child connections would still be identical.
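A short sketch can confirm that rotating branches leaves the relationships untouched. It compares trees by the set of clades they contain rather than by the order of their tips. The nested-tuple representation and the four-species example are assumptions made for illustration.

```python
def clades(tree):
    """Return every clade in a nested-tuple tree as a frozenset of tip names."""
    found = set()

    def collect(node):
        if isinstance(node, str):
            tips = frozenset([node])
        else:
            tips = frozenset().union(*(collect(child) for child in node))
        found.add(tips)
        return tips

    collect(tree)
    return found

# Tip order reads: human, chimp, mouse, rat (chimp sits next to mouse).
original = (("human", "chimp"), ("mouse", "rat"))
# Rotated around one node: chimp, human, mouse, rat (now human sits next to mouse).
rotated = (("chimp", "human"), ("mouse", "rat"))
# A genuinely different branching pattern.
different = (("human", "mouse"), ("chimp", "rat"))

print(clades(original) == clades(rotated))    # True
print(clades(original) == clades(different))  # False
```

The rotated tree changes which tips are neighbors on the page, yet it contains exactly the same clades as the original, while the third tree groups the species differently.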
Branch length can also carry information, but only in certain types of trees. In some phylogenies, longer branches mean more genetic change or more elapsed time. In others, branch length is arbitrary and only the branching pattern matters. Checking whether the tree includes a scale bar is the quickest way to tell.

