Why Do Mitochondrial and Nuclear Gene Trees Disagree?

Different genes within the same genome often tell different evolutionary stories. When researchers build family trees (phylogenies) from individual genes, those trees frequently disagree with each other and with the “true” species tree representing how populations actually split apart. Much of this disagreement is not an error or artifact. It reflects real biological processes that cause different stretches of DNA to follow genuinely different genealogical paths through evolutionary history.

Incomplete Lineage Sorting: The Primary Culprit

The most common reason nuclear gene trees disagree is a process called incomplete lineage sorting, or ILS. In plain terms: any ancestral population carries genetic variation, with multiple versions (alleles) of many genes circulating at once. When that population splits into two species, both descendants inherit a sample of that variation, and it takes many generations of genetic drift before each species retains only its own variants. If a second split happens soon after the first, the old variants from the original ancestor can end up distributed across descendant species in patterns that don’t match the order in which those species actually formed.

The result is that some genes group Species A with Species B while others group Species A with Species C, even though the species themselves split in only one order. This isn’t sampling error. It represents a genuine difference in genealogical history between loci. ILS is especially pronounced during rapid speciation events, where populations split in quick succession without enough time for ancestral genetic variation to settle into species-specific patterns.

The great ape lineage offers a striking example. When the gorilla genome was sequenced, researchers found that across 30% of the genome, gorilla is genetically closer to either humans or chimpanzees than humans and chimpanzees are to each other. The true species tree groups humans and chimps as closest relatives, but nearly a third of our DNA tells a different story because of ILS from the ancestral population that gave rise to all three lineages. That discordance is rarer around protein-coding genes, suggesting that natural selection has reduced ancestral variation in those regions and left less opportunity for ILS.
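
For three species, a standard result of the multispecies coalescent makes the link between split timing and discordance concrete. If the internal branch of the species tree lasts T coalescent units (generations divided by twice the effective population size, for a diploid nuclear locus), the two sister lineages fail to coalesce on that branch with probability e^(-T), and two of the three equally likely outcomes in the deeper ancestor are discordant. The sketch below assumes ILS is the only source of discordance, and its function names are illustrative. Reading the 30% great ape figure through this model is a back-of-the-envelope exercise, not a published estimate.

```python
import math


def discordance_probability(t_coalescent_units):
    """P(gene tree topology differs from a three-species tree) under ILS alone."""
    return (2.0 / 3.0) * math.exp(-t_coalescent_units)


def internal_branch_from_discordance(p_discordant):
    """Invert the formula: internal branch length implied by a discordance rate."""
    if not 0.0 < p_discordant <= 2.0 / 3.0:
        raise ValueError("discordance under ILS alone must lie in (0, 2/3]")
    return -math.log(1.5 * p_discordant)


# Discordance falls off quickly as the gap between the two splits grows.
for t in (0.1, 0.5, 1.0, 2.0, 4.0):
    print(f"T = {t:>3}: P(discordant) = {discordance_probability(t):.3f}")

# Hypothetical reading of the 30% figure as pure ILS.
print(f"T implied by 30% discordance: {internal_branch_from_discordance(0.30):.2f}")
```

Under these assumptions, 30% discordance corresponds to an internal branch of about 0.8 coalescent units, which is short enough that a large share of loci never sort before the deeper ancestor.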

Hybridization and Gene Flow Between Species

Species don’t always stay reproductively isolated. When individuals from different species interbreed, genes can cross species boundaries, a process called introgression. This means some genes in your genome may have arrived not through your species’ direct line of descent but through ancient hybridization with a neighboring species. Those genes carry a phylogenetic signal that reflects the donor species’ history, not your own.

Advances in genomic sequencing have made biologists increasingly aware that life’s history looks less like a branching tree and more like a network, with lineages occasionally reconnecting. Researchers have coined the term “xenoplasy” to describe situations where present-day traits are shared between species not because of common ancestry through normal speciation but because genes were transferred through hybridization. Unlike ILS, xenoplasy doesn’t require deep ancestral polymorphism. It can happen whenever two species exchange genetic material, even recently.

The practical consequence is that analyzing traits on a species tree alone, without accounting for gene flow, can produce misleading conclusions about how those traits evolved.
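
One commonly used way to tell gene flow apart from ILS relies on symmetry. For three species, ILS alone is expected to produce the two discordant topologies in roughly equal numbers, so a strong excess of one of them hints at introgression between the pair it groups together. The sketch below is a minimal version of that idea using topology counts. The function names are hypothetical, and the test assumes the gene trees come from independent (unlinked) loci and were estimated without error, which real data rarely guarantee.

```python
from math import comb


def discordant_asymmetry(n_first, n_second):
    """Normalized difference between the two discordant topology counts."""
    total = n_first + n_second
    if total == 0:
        raise ValueError("need at least one discordant gene tree")
    return (n_first - n_second) / total


def symmetry_p_value(n_first, n_second):
    """Two-sided exact binomial test of a 50:50 split between the two counts."""
    total = n_first + n_second
    larger = max(n_first, n_second)
    # Probability of a count at least this extreme in one tail of Binomial(total, 0.5).
    upper_tail = sum(comb(total, k) for k in range(larger, total + 1)) / 2 ** total
    return min(1.0, 2.0 * upper_tail)


# Made-up counts: 80 loci group A with C, 20 loci group B with C.
print(discordant_asymmetry(80, 20), symmetry_p_value(80, 20))
```

A value near zero is what ILS alone predicts. A large, statistically supported asymmetry is a reason to consider a network rather than a strictly branching tree.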

Why Mitochondrial and Nuclear Trees Often Disagree

A particularly common form of gene tree disparity shows up when comparing mitochondrial DNA to nuclear DNA. Mitochondrial genomes are inherited only from the mother, don’t recombine, and have a much smaller effective population size than nuclear genes (roughly one quarter that of autosomal loci when equal numbers of males and females breed). These properties make mitochondrial DNA especially susceptible to a process called mitochondrial capture, where an entire mitochondrial genome from one species replaces another’s through hybridization.

Because mitochondrial DNA has a smaller effective population size, genetic drift acts on it more strongly, and introgressed mitochondrial genes can reach fixation (becoming the only version in a population) much faster than nuclear genes do. This means mitochondrial introgression can occur with minimal or no detectable nuclear introgression, making it look like two species are closely related based on their mitochondria even when their nuclear genomes tell a completely different story.
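
The size difference can be worked out with textbook population genetics. The sketch below assumes an idealized Wright-Fisher population, strict maternal inheritance of mitochondria, and no selection. The function names are illustrative. Under neutrality, the expected time for two gene copies to share an ancestor scales with the effective number of copies, so a smaller number means faster sorting and faster fixation by drift.

```python
def autosomal_gene_copies(n_females, n_males):
    """Effective number of autosomal gene copies: 2 * Ne, with Ne = 4*Nf*Nm/(Nf+Nm)."""
    ne = 4.0 * n_females * n_males / (n_females + n_males)
    return 2.0 * ne


def mitochondrial_gene_copies(n_females):
    """Effective number of transmitted mtDNA copies: one haploid genome per breeding female."""
    return float(n_females)


def mito_to_nuclear_ratio(n_females, n_males):
    """How much smaller the mitochondrial gene pool is than the autosomal one."""
    return mitochondrial_gene_copies(n_females) / autosomal_gene_copies(n_females, n_males)


# Equal numbers of breeding females and males.
print(mito_to_nuclear_ratio(500, 500))
```

With equal numbers of breeding males and females the ratio is one quarter. The ratio shifts when the breeding sex ratio is skewed, which is one reason the "one quarter" figure is a rule of thumb and not a constant.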

Other factors contributing to mitonuclear discordance include population size fluctuations (especially at the edges of a species’ range), differences in how males and females disperse, and natural selection. If a beneficial mitochondrial mutation arises in a population, selection may favor nuclear gene variants that restore compatibility with the new mitochondrial type, further decoupling the two genomic signals.

Rapid Radiations Make It Worse

The severity of gene tree discordance depends heavily on how much time elapsed between successive speciation events. When lineages split in rapid succession, as happens during adaptive radiations where species diversify quickly into new ecological niches, the intervals between splits are so short that ancestral polymorphism has almost no time to sort. The greatest deviations from expected gene tree agreement occur when both the time since speciation and the time between speciation events are short.
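
The effect of a short interval between splits can be reproduced with a small simulation. The sketch below follows single loci through a three-species tree ((A,B),C) under the standard coalescent, measuring the internal branch in coalescent units. It is a toy model with illustrative names: it assumes neutrality, no gene flow, and one sampled lineage per species, and it compares the simulated discordance with the closed-form value (2/3)e^(-T) as a sanity check.

```python
import math
import random


def simulate_discordance(internal_branch, n_loci, rng):
    """Fraction of simulated loci whose gene tree disagrees with ((A,B),C)."""
    discordant = 0
    for _ in range(n_loci):
        # Waiting time for the A and B lineages to coalesce, in coalescent units.
        if rng.expovariate(1.0) < internal_branch:
            continue  # coalesced on the internal branch: concordant
        # Otherwise three lineages enter the root population and the first
        # pair to coalesce is chosen uniformly from AB, AC, BC.
        if rng.choice(("AB", "AC", "BC")) != "AB":
            discordant += 1
    return discordant / n_loci


rng = random.Random(42)  # fixed seed so the sketch is reproducible
for t in (0.05, 0.25, 1.0, 3.0):
    simulated = simulate_discordance(t, 20_000, rng)
    expected = (2.0 / 3.0) * math.exp(-t)
    print(f"T = {t}: simulated {simulated:.3f}, closed form {expected:.3f}")
```

As the internal branch shrinks toward zero, the discordant fraction approaches two thirds, at which point the three possible topologies are almost equally common and individual gene trees carry very little information about the species tree.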

This makes some of the most biologically interesting groups (cichlid fishes, Darwin’s finches, Hawaiian silverswords) among the hardest to resolve phylogenetically. Every gene tells a slightly different story, and no single gene reliably recovers the species tree. Interestingly, genes involved in reproductive isolation between species are slightly more likely to have discordant trees than background genes, because discordant genealogies create more opportunities for incompatible genetic interactions to arise between diverging populations.

Methodological Sources of Apparent Conflict

Not all gene tree disparity comes from biology. Some of it is introduced by how we reconstruct those trees. Two major analytical approaches handle conflicting gene signals very differently. Concatenation methods combine all genes into one large dataset and assume every locus shares the same underlying tree. This can produce a strongly supported but incorrect species tree when the biological reality involves widespread discordance. Summary methods take the opposite approach: they estimate individual gene trees first, then combine those trees to infer the species tree. These are computationally fast and can handle thousands of loci, but they treat estimated gene trees as if they were known perfectly, ignoring uncertainty in each individual tree.
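
The logic of a summary method can be illustrated in its simplest setting, three species, where the most frequent gene tree topology is expected to match the species tree under ILS alone. The sketch below is a toy under stated assumptions and not the algorithm of any particular software package: gene trees are written as hypothetical strings such as 'AB|C' (sister pair, then outgroup), species labels are single letters, and every estimated gene tree is taken at face value, which is exactly the limitation described above.

```python
from collections import Counter


def sister_pair(gene_tree):
    """Return the sister pair of a rooted triple written like 'AB|C'."""
    pair, _outgroup = gene_tree.split("|")
    return frozenset(pair)


def summarize_triples(gene_trees):
    """Pick the most frequent sister pair and report the share of loci backing it."""
    votes = Counter(sister_pair(tree) for tree in gene_trees)
    best_pair, count = votes.most_common(1)[0]
    return "".join(sorted(best_pair)), count / len(gene_trees)


# Made-up input: 100 estimated gene trees from 100 loci.
estimated = ["AB|C"] * 70 + ["AC|B"] * 16 + ["BC|A"] * 14
print(summarize_triples(estimated))
```

Because each locus contributes one vote regardless of how well its tree was estimated, poorly resolved loci count as much as well-resolved ones. Concatenation sidesteps that problem only by assuming there is a single tree to find.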

Full likelihood methods under what’s called the multispecies coalescent model offer a more rigorous approach. They average over unknown gene trees and properly account for uncertainty, but the computation involved is intensive. For large genomic datasets, it can be prohibitive. Additional artifacts like insufficient data at individual loci, substitution model misspecification, and homoplasy (where separate lineages independently arrive at the same sequence state through convergence, parallel change, or reversal) can all introduce false signals of discordance or mask real biological conflict.

Gene Tree Disparity Is the Rule, Not the Exception

If this all sounds like a special problem affecting a few tricky groups, it’s not. Analyses across nine multilocus datasets spanning a broad range of organisms found extensive variation in the evolutionary signal recovered from different genes. Different gene trees produced very different rankings of how species relate to each other, and the correlation between individual gene tree results and the species tree result was often weak. This pattern held regardless of the type of organism or the number of genes included.

The takeaway is that any single gene provides only one imperfect window into evolutionary history. The disparity between nuclear gene trees reflects the genuinely complex, messy process of speciation, where populations carry enormous genetic variation, species boundaries are sometimes porous, and evolutionary timescales don’t always allow for clean genetic sorting. Recognizing and modeling that complexity, rather than treating it as noise, has become one of the central challenges in modern evolutionary biology.