What Is the Sanger Method of DNA Sequencing?

The Sanger method is a technique for reading the exact sequence of letters in a strand of DNA. Developed by Frederick Sanger and colleagues in 1977, it works by using modified DNA building blocks that stop the copying process at random points, generating fragments of every possible length. Those fragments are then sorted by size to reveal the order of bases, one by one. It remains the gold standard for accuracy in DNA sequencing and is still widely used in labs today.

How Chain Termination Works

DNA is built from four chemical bases: adenine (A), cytosine (C), guanine (G), and thymine (T). Normally, when a cell copies DNA, an enzyme called DNA polymerase grabs free-floating building blocks (called nucleotides) and snaps them onto a growing strand. Each incoming nucleotide bonds to a specific spot on the sugar of the nucleotide before it, the 3′ hydroxyl group. The new nucleotide’s own 3′ hydroxyl group then becomes the attachment point for the next one, extending the chain.

The Sanger method exploits this by introducing a small quantity of altered building blocks, called dideoxynucleotides (ddNTPs), into the mix alongside the normal ones. These modified nucleotides are missing that critical 3′ hydroxyl group. When DNA polymerase happens to grab a ddNTP instead of a normal nucleotide, it slots into place just fine, but the chain can go no further. No hydroxyl group means no bond site for the next nucleotide. The strand is terminated at that exact position.

Because the ddNTPs are present in much lower concentrations than the normal nucleotides, termination is infrequent and falls at a random position in each copy. In any given copy of the DNA, the chain might stop at the fifth base, or the fiftieth, or the five-hundredth. A single reaction contains millions of template copies, so you end up with a collection of fragments representing every possible stopping point in the sequence. The shortest fragment tells you the identity of the first base after the primer; the longest tells you the last base read.
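To make the statistics concrete, here is a minimal simulation sketch of random chain termination. It is a toy model, not a description of real reaction chemistry: the per-base termination probability (`ddntp_fraction`), the number of copies, and the example template are all assumed values chosen for illustration.

```python
import random

def simulate_fragment_lengths(template, ddntp_fraction=0.05, copies=20000, seed=0):
    """Return the set of fragment lengths produced by random chain termination.

    ddntp_fraction is a hypothetical per-base chance that the polymerase
    incorporates a terminator (ddNTP) instead of a normal nucleotide.
    """
    rng = random.Random(seed)
    lengths = set()
    for _ in range(copies):
        # Extend the copy one base at a time until a ddNTP is incorporated.
        for position in range(1, len(template) + 1):
            if rng.random() < ddntp_fraction:
                lengths.add(position)  # the chain stops at this position
                break
    return lengths

template = "ATGCGTACGTTAGC"  # made-up example sequence
lengths = simulate_fragment_lengths(template)
print(sorted(lengths))
```

With enough copies, every position from 1 to the length of the template should show up as a stopping point, which is what makes the full sequence readable. Raising `ddntp_fraction` in this model shifts the pool toward short fragments, which is why the terminators are kept scarce.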

The Process Step by Step

A Sanger sequencing run begins with the DNA you want to read, a short starter piece called a primer that tells the polymerase where to begin, and the mixture of normal nucleotides plus a small proportion of chain-terminating ddNTPs. In modern versions of the method, each of the four ddNTPs (one for A, C, G, and T) is tagged with a different fluorescent color.

The DNA polymerase copies the target strand, incorporating normal nucleotides until it randomly picks up a fluorescent ddNTP and stops. This happens millions of times in parallel, producing a pool of fragments whose lengths increase in single-base steps. Each fragment carries a fluorescent tag on its final nucleotide, identifying which base caused the termination.

Next, those fragments are separated by size using capillary electrophoresis. An electric field pulls the mixture through a very thin tube filled with a gel-like polymer. Shorter fragments move faster through the gel; longer ones lag behind. As each fragment reaches the end of the capillary, a laser hits it and excites the fluorescent tag. A detector records the color, and software translates the sequence of colors into a sequence of DNA bases.
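The logic of that last step can be sketched in a few lines. This is a simplified illustration, not how instrument software actually works: it assumes each fragment is already reduced to a (length, dye color) pair, and the color-to-base table is an assumed dye assignment, since real dye sets vary between kits and instruments.

```python
# Assumed dye assignment for illustration; real chemistries differ.
DYE_TO_BASE = {"green": "A", "blue": "C", "black": "G", "red": "T"}

def read_sequence(fragments):
    """Translate (length, dye_color) pairs into a base sequence.

    Fragments may arrive in any order; sorting by length mimics the
    capillary, where the shortest fragment reaches the detector first.
    """
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(DYE_TO_BASE[color] for _, color in ordered)

fragments = [(3, "black"), (1, "green"), (4, "blue"), (2, "red")]
sequence = read_sequence(fragments)
print(sequence)  # ATGC
```

The key point is that the sequence is never read from any single molecule. It emerges from the order in which fragments of increasing length pass the detector.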

Reading a Chromatogram

The output of a Sanger sequencing run is a chromatogram (sometimes called an electropherogram), a graph showing a series of colored peaks. Each peak represents one base position in the DNA sequence. The color identifies the base (for instance, blue for C, green for A), and the height of the peak reflects signal strength. Clean, well-separated, tall peaks indicate high-quality data.

Chromatograms are not always pristine. Sometimes two colored peaks appear at the same position, which can mean the sample contains two different versions of a gene at that spot (a heterozygous site). In a person who carries one copy of a mutation and one normal copy, you would see a double peak: two colors overlapping at the position where the sequences differ. Double peaks can also signal contamination or technical problems, so interpreting them requires context. Peak shape, height relative to background noise, and consistency with surrounding peaks all factor into whether the call is reliable.
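As a rough illustration of how a double peak might be flagged, the sketch below compares the two strongest signals at one position. The thresholds (`secondary_ratio`, `min_signal`) are hypothetical values, not settings from any real base-calling package, and production software weighs far more evidence, such as peak shape and neighboring peaks. The two-base codes are the standard IUPAC ambiguity symbols.

```python
# Standard IUPAC codes for a position where two bases are present.
IUPAC_PAIRS = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def call_position(heights, secondary_ratio=0.5, min_signal=100):
    """Call one chromatogram position from its four peak heights.

    heights maps each base (A, C, G, T) to its peak height.
    Both thresholds are illustrative assumptions.
    """
    ranked = sorted(heights.items(), key=lambda item: item[1], reverse=True)
    (top_base, top), (second_base, second) = ranked[0], ranked[1]
    if top < min_signal:
        return "N"  # signal too weak to call any base
    if second >= secondary_ratio * top:
        # Two comparable peaks: report an ambiguity code for the pair.
        return IUPAC_PAIRS[frozenset((top_base, second_base))]
    return top_base

clean = call_position({"A": 900, "C": 20, "G": 30, "T": 10})
double = call_position({"A": 500, "C": 10, "G": 450, "T": 15})
weak = call_position({"A": 40, "C": 30, "G": 20, "T": 10})
print(clean, double, weak)  # A R N
```

A call of R here only says that A and G signals overlap at that position. Whether that reflects a true heterozygous site or contamination still has to be judged from context, as described above.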

Accuracy and Read Length

Sanger sequencing is considered the gold standard for DNA sequencing accuracy. It is so reliable that when newer, faster technologies detect a genetic variant with potential medical implications, labs typically confirm it with a Sanger run before reporting results. This validation role reflects the method’s reputation: it produces clean, highly trustworthy reads for targeted stretches of DNA.

A single Sanger reaction typically reads between 400 and 900 bases of DNA, with most runs producing usable data in the range of 700 to 800 bases. Quality tends to be highest in the middle of the read and drops off at the very beginning and end. For sequences longer than about 900 bases, you need to design multiple overlapping primers and stitch the reads together.
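Planning those overlapping reads is simple arithmetic, sketched below. The usable read length of 700 bases and the 100-base overlap are assumed planning values, not fixed rules, and the sketch ignores the practical detail that each primer must sit somewhat upstream of the region it is meant to read.

```python
def plan_reads(target_length, usable_read=700, overlap=100):
    """Return (start, end) coordinates for overlapping reads over a target.

    Coordinates are 0-based and half-open. usable_read and overlap are
    illustrative defaults; adjust them to the quality of your own data.
    """
    if overlap >= usable_read:
        raise ValueError("overlap must be smaller than the usable read length")
    step = usable_read - overlap  # new sequence gained by each extra read
    starts = range(0, max(target_length - overlap, 1), step)
    return [(start, min(start + usable_read, target_length)) for start in starts]

reads = plan_reads(2000)
print(reads)  # [(0, 700), (600, 1300), (1200, 1900), (1800, 2000)]
```

Under these assumptions a 2,000-base target needs four reads, each sharing 100 bases with its neighbor so the reads can be stitched together with confidence.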

What Sanger Sequencing Costs

Commercial Sanger sequencing is remarkably affordable for small-scale work. A single reaction at a core sequencing facility typically costs between $2.50 and $6.00 per sample, depending on the institution, the difficulty of the template, and whether you need a standard or extended-length read. For checking a single gene variant or confirming a short stretch of sequence, it is fast and cheap. The economics shift dramatically, however, when you scale up. Sequencing an entire human genome with Sanger, one read at a time, would require millions of individual reactions, which is why the method is no longer practical for large-scale projects.
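A back-of-the-envelope calculation shows the scale of the problem. The genome size of roughly 3.1 billion bases and the 800-base read length are approximations, and the estimate assumes every base is read exactly once with no overlap, which no real project could get away with.

```python
import math

GENOME_BASES = 3_100_000_000      # approximate human genome size (assumption)
READ_BASES = 800                  # optimistic usable bases per reaction
COST_LOW, COST_HIGH = 2.50, 6.00  # per-reaction price range quoted above

# Minimum number of reactions to cover every base once, with no overlap.
reactions = math.ceil(GENOME_BASES / READ_BASES)
cost_low = reactions * COST_LOW
cost_high = reactions * COST_HIGH

print(f"{reactions:,} reactions")                # 3,875,000 reactions
print(f"${cost_low:,.0f} to ${cost_high:,.0f}")  # $9,687,500 to $23,250,000
```

Even this best-case figure is in the millions of reactions and the millions of dollars. Real projects also need each base covered several times over, which multiplies both numbers.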

Where Sanger Fits Today

Next-generation sequencing (NGS) platforms can read millions of DNA fragments simultaneously, making them the tool of choice for whole-genome projects, cancer panels, and large-scale research. But Sanger sequencing has not been replaced. It fills a different niche: targeted, high-confidence reads of specific regions.

Common modern uses include confirming variants flagged by NGS before they are reported in a clinical setting, sequencing individual genes in diagnostic tests, verifying that a cloned piece of DNA matches expectations in a research lab, and identifying bacterial or viral species by sequencing a short marker gene. In clinical genetics, the stakes of a wrong call can be enormous for patients and families, which is why Sanger confirmation remains standard practice for medically significant variants.

The method’s main limitation is throughput. Each capillary handles one reaction, covering one stretch of DNA, per run, so scaling to thousands or millions of targets is slow and expensive compared to NGS. For single-gene questions, Sanger is still often the fastest path to a reliable answer. For genome-wide questions, it simply cannot compete.

The Discovery That Started Modern Genomics

Frederick Sanger published the dideoxy method in December 1977, the same year Allan Maxam and Walter Gilbert independently published a chemical approach to sequencing. Sanger’s method proved easier to automate and quickly became dominant. Its first demonstration used DNA from bacteriophage φX174, a virus that infects bacteria, whose complete sequence Sanger’s group had published earlier that year using a precursor technique. That sequence was a landmark not just technically but scientifically: it revealed overlapping genes, an unexpected feature of genetic organization, and demonstrated that reading raw DNA could tell a story far richer than anyone had anticipated. Sanger was awarded his second Nobel Prize in Chemistry in 1980, in part for this work. The method went on to underpin the Human Genome Project, which published the first draft of the human genome in 2001.