What Makes Your DNA Different From Someone Else’s?

Any two people share about 99.6% of their DNA. The remaining 0.4% translates to roughly 24 million points of difference scattered across your 6 billion nucleotide base pairs. That sliver of variation is what makes you genetically unique, and it arises from several distinct types of changes in your genetic code.

Most Differences Are Single-Letter Swaps

Your DNA is written in a four-letter alphabet: A, T, C, and G. The most common type of variation between two people is a single-nucleotide polymorphism, or SNP (pronounced “snip”), where just one letter differs at a specific spot. An early genome-wide map charted over 1.4 million of these single-letter swaps, appearing on average once every 1,900 base pairs. About 60,000 of them fall within regions that directly code for proteins.
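To make the idea of a single-letter swap concrete, here is a minimal Python sketch that compares two already-aligned toy sequences letter by letter and reports where they differ. The sequences and the function name find_single_letter_differences are invented for illustration. Real SNP detection works from sequencing reads and far longer alignments, which this sketch does not attempt.

```python
def find_single_letter_differences(seq_a, seq_b):
    """Return (position, letter_a, letter_b) for each mismatch between two
    equal-length, already-aligned DNA strings. Positions are 0-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Hypothetical toy sequences, not real genomic data.
person_1 = "ATCGGCTAAGCT"
person_2 = "ATCGACTAAGCT"

differences = find_single_letter_differences(person_1, person_2)
print(differences)  # [(4, 'G', 'A')]
```

The two toy sequences match everywhere except position 4, where one person has a G and the other an A. That single mismatch is the kind of difference a SNP describes.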

Most SNPs have no obvious effect on your body. But some change a protein’s structure or how efficiently it’s produced, influencing everything from eye color to how you metabolize caffeine. And when thousands of SNPs each nudge a trait by a tiny amount, they add up. Height, for instance, is shaped by the combined effects of many thousands of these small variations. Researchers can now calculate a “polygenic risk score” that adds up the small effects of all of a person’s relevant SNPs, each weighted by how strongly it is associated with the trait, to estimate their genetic predisposition for that trait. For height, common genetic variants explain roughly 49% of the variation between people.
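In its simplest form, a polygenic score is a weighted sum: for each SNP, count how many copies of the trait-associated letter a person carries (0, 1, or 2) and multiply by that SNP's estimated effect. The sketch below shows only that arithmetic. The SNP names, effect sizes, and the function polygenic_score are hypothetical, and real scores involve many more variants plus statistical adjustments that are not modeled here.

```python
def polygenic_score(genotype, effect_sizes):
    """Weighted sum: for each SNP, (copies of the effect allele) x (effect size).

    genotype     -- dict mapping SNP id -> 0, 1, or 2 copies of the effect allele
    effect_sizes -- dict mapping SNP id -> estimated per-allele effect
    SNPs missing from the genotype are skipped (an assumption of this sketch).
    """
    return sum(
        genotype[snp] * weight
        for snp, weight in effect_sizes.items()
        if snp in genotype
    )


# Hypothetical SNP ids and effect sizes, invented for illustration only.
effect_sizes = {"snp_1": 0.25, "snp_2": -0.5, "snp_3": 0.125}
person = {"snp_1": 1, "snp_2": 0, "snp_3": 2}

score = polygenic_score(person, effect_sizes)
print(score)  # 0.5
```

Here the person carries one copy at snp_1 (0.25), none at snp_2 (0), and two at snp_3 (2 x 0.125), for a total of 0.5. The number only means something relative to other people's scores computed the same way.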

Larger Chunks of DNA Can Be Copied or Deleted

Single-letter changes get the most attention, but bigger structural differences also set people apart. Copy number variations, or CNVs, are stretches of DNA that are duplicated, deleted, or rearranged. These aren’t subtle: CNVs occupy an estimated 5 to 12% of the human genome, and a typical population carries 3,000 to 4,000 distinct CNVs covering 4 to 6% of its total genomic sequence.

Because CNVs can span thousands or even millions of base pairs, they affect far more raw DNA per event than a single SNP does. They influence traits and disease susceptibility in meaningful ways, with known links to conditions like autoimmune disorders, autism, and differences in immune response to infections like HIV. This is a big reason the older claim that humans are “99.9% identical” has been revised downward to about 99.6%. That original figure only counted single-letter changes and missed the larger rearrangements.

98% of Your DNA Doesn’t Code for Proteins

Only about 2% of your genome contains instructions for building proteins. The other 98% was once dismissed as “junk DNA,” but it turns out to be enormously important. This non-coding DNA contains regulatory sequences: promoters, enhancers, silencers, and insulators that control when, where, and how much of each gene gets activated. A variation in one of these regulatory regions won’t change the protein itself, but it can change how much of that protein your cells produce, or which tissues produce it, or at what stage of development it appears.

This matters because over 90% of the genetic variants linked to diseases and measurable traits in large population studies fall within non-coding regions. A variation in a regulatory stretch might, for example, reduce how tightly a specific protein binds to a gene’s “on switch,” dialing down that gene’s activity in a particular tissue. These regulatory differences help explain why two people can have the same version of a gene but express it very differently.

Same DNA, Different Settings: Epigenetics

Your DNA sequence isn’t the whole story. Epigenetic marks are chemical tags that sit on top of your DNA and influence which genes are active or silent, without changing the underlying letters. The best-studied type is DNA methylation, where a small molecule attaches to certain spots on the DNA and typically dials down the gene beneath it. Histone modifications are another layer: chemical changes to the proteins that DNA wraps around, which can loosen or tighten access to specific genes.

Identical twins offer the clearest proof that epigenetics creates individuality. Despite starting life with virtually the same DNA sequence, identical twins accumulate epigenetic differences over time. Twin pairs who are older, have spent less of their lives together, or have had more different health histories show the greatest epigenetic divergence. These differences appear across multiple tissue types, including blood cells, the lining of the mouth, and gut tissue. Epigenetic variation helps explain why one identical twin might develop a disease while the other doesn’t, even though their genetic code is nearly the same.

Even “Identical” Twins Aren’t Truly Identical

Identical twins form from a single fertilized egg, so they’re often described as genetic copies of each other. But from the moment that egg splits, each twin’s cells begin accumulating their own random mutations. These are called postzygotic somatic mutations, and research on 30 pairs of identical twins found an average of about 86 such mutations per pair (in twins with very high genetic concordance). Some pairs had as few as 49 and others as many as 164.

These mutations aren’t tied to age or sex. They arise from errors during cell division that happen very early in development, meaning each twin carries slightly different DNA from the start of life. While most of these mutations have no noticeable effect, they occasionally land in important genes and can contribute to differences in health or physical traits between co-twins.

Mitochondrial DNA Tells a Separate Story

Inside nearly every cell, tiny structures called mitochondria carry their own small loop of DNA, separate from the 6 billion base pairs in the cell’s nucleus. Mitochondrial DNA is inherited exclusively from your mother, so it traces a direct maternal line. Different ethnic groups carry distinct mitochondrial profiles that reflect the accumulated variations passed down from a common maternal ancestor often called “mitochondrial Eve.”

Mitochondrial DNA exists in hundreds to thousands of copies per cell (the exact number depends on the tissue’s energy demands), and mutations can appear in some copies but not others within the same person. These mutations are linked to a range of energy-metabolism disorders and can also contribute to complex traits. Because mitochondrial DNA follows a completely different inheritance pattern than nuclear DNA, it adds another independent layer of variation between individuals.

How Forensics Exploits Your Uniqueness

The science of DNA profiling takes advantage of the most variable parts of the genome: short tandem repeats (STRs). These are stretches where a short sequence of letters (like TATT) repeats back to back, and the number of repeats varies widely from person to person. STRs sit in non-coding regions, so there’s little evolutionary pressure to keep them uniform, which makes them extremely diverse.
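As a rough illustration of what "number of repeats" means, the Python sketch below counts the longest back-to-back run of a motif in a toy sequence. The sequences, the flanking letters, and the function name longest_repeat_run are assumptions made for this example. Forensic labs measure repeat counts with laboratory methods rather than string matching like this.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    # Find every stretch made of one or more consecutive copies of the motif.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical toy sequences: same flanking letters, different repeat counts.
person_1 = "GGC" + "TATT" * 5 + "CAG"
person_2 = "GGC" + "TATT" * 8 + "CAG"

print(longest_repeat_run(person_1, "TATT"))  # 5
print(longest_repeat_run(person_2, "TATT"))  # 8
```

Both toy sequences carry the same motif at the same spot, but one has 5 copies and the other 8. That difference in count, not a difference in letters, is what distinguishes two people at an STR.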

In the United States, forensic labs long analyzed 13 core STR locations across the genome (the FBI expanded its core set to 20 locations in 2017). The chance that an unrelated person would match someone else at all 13 locations is typically around 1 in 1 billion. By targeting only the most variable spots in the genome, forensic scientists can distinguish between virtually any two people on Earth, with identical twins being the notable exception that requires more advanced techniques.
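A match probability that small comes from multiplying many modest per-location probabilities together, on the assumption that the locations are inherited independently of one another. The Python sketch below shows only that multiplication. The per-location frequency of 0.2 is a made-up, uniform placeholder rather than a real population statistic, and random_match_probability is a name chosen for this example.

```python
from math import prod


def random_match_probability(genotype_frequencies):
    """Multiply per-location genotype frequencies (the "product rule").

    Assumes the locations are statistically independent of one another.
    """
    return prod(genotype_frequencies)


# Hypothetical: suppose a person's genotype at each of 13 locations is
# shared by about 1 in 5 unrelated people. These are made-up numbers.
frequencies = [0.2] * 13

rmp = random_match_probability(frequencies)
print(f"about 1 in {1 / rmp:,.0f}")
```

With these placeholder inputs the result is a little over 1 in 1.2 billion, the same order of magnitude as the figure quoted above. No single location is especially rare here; the rarity comes from requiring a match at all 13 at once.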

Why a Single Reference Genome Isn’t Enough

For two decades, genetic research relied on a single reference genome assembled largely from a small number of donors. The Human Pangenome Reference Consortium is changing that by building a collection of highly accurate, complete genomes from people of diverse ancestries. Each genome in the project covers more than 99% of the expected sequence with over 99% accuracy.

The goal is to capture the full range of human variation rather than comparing everyone to one template. As consortium member David Haussler of UC Santa Cruz put it, the field is moving from “genomics of the one standard human genome” to “genomics for everybody.” This broader reference is already improving researchers’ ability to detect the structural variants and population-specific differences that a single reference genome simply misses.