DNA sequencing is the process of reading the exact order of chemical building blocks, called bases, that make up a DNA molecule. Every strand of DNA is built from just four bases (adenine, cytosine, guanine, and thymine, abbreviated A, C, G, and T), and their precise arrangement influences traits ranging from eye color to disease risk. By decoding that arrangement, scientists can identify which stretches of DNA contain genes, which stretches regulate those genes, and which changes might cause illness.
How DNA Sequencing Works
At their core, most widely used sequencing methods rely on the same principle: an enzyme called DNA polymerase copies a strand of DNA one base at a time. Each base carries a fluorescent label, so as it’s added to the growing strand, a detector reads the color and records which base was just incorporated. Run that process across an entire stretch of DNA and you get a readout of the full sequence, letter by letter.
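As a rough illustration of the detection step described above, the sketch below turns a series of detected colors into a base sequence. The color-to-base mapping and the function name `call_bases` are assumptions made for the example; real instruments use their own dye chemistry and far more elaborate signal processing.

```python
# Hypothetical dye colors, chosen only for illustration.
# Actual instruments assign dyes to bases differently.
COLOR_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}


def call_bases(colors):
    """Translate a list of detected colors into a string of bases.

    Any signal that is not recognized is recorded as 'N', the
    conventional placeholder for a base that could not be determined.
    """
    return "".join(COLOR_TO_BASE.get(color, "N") for color in colors)


signals = ["green", "red", "yellow", "blue", "purple"]
print(call_bases(signals))  # ATGCN
```

The last signal has no entry in the table, so it is reported as N rather than guessed.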
The original method, developed by Frederick Sanger in the 1970s, reads one fragment of DNA at a time. It’s highly accurate and still used when researchers need to examine a single gene. But it’s slow. Modern sequencing, often called next-generation sequencing (NGS), runs millions of these reactions simultaneously. Where Sanger sequencing might produce 50 to 100 reads per sample, NGS generates anywhere from tens of thousands to many millions, depending on the platform and the application. That leap in volume is what made it practical to sequence entire human genomes rather than just individual genes.
Short-Read vs. Long-Read Sequencing
Most next-generation sequencers break DNA into short fragments, typically a few hundred bases long, and read each one. These short reads are extremely accurate at the individual base level, but they produce a fragmented picture that has to be stitched back together using software. Highly repetitive regions of DNA, where the same pattern appears over and over, are difficult to reconstruct from short pieces alone.
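To give a feel for what "stitching back together" means, here is a minimal sketch of a greedy overlap merge: it repeatedly joins the two reads that share the longest suffix-to-prefix overlap. The function names, the toy reads, and the minimum overlap of 3 bases are assumptions for the example. Production assemblers use more sophisticated graph-based methods and must also cope with sequencing errors.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_len:
                        best_len, best_i, best_j = size, i, j
        if best_len == 0:
            break  # nothing overlaps any more; leave separate pieces
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three toy reads taken from the made-up sequence ATGGCGTACGTTAGC.
toy_reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(toy_reads))  # ['ATGGCGTACGTTAGC']
```

When a repeated pattern is longer than the reads themselves, several merges look equally good, which is one way to see why repetitive regions are hard to rebuild from short pieces.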
Newer platforms from companies like Oxford Nanopore and Pacific Biosciences take a different approach. They read much longer stretches of DNA in a single pass, sometimes tens of thousands of bases or more. Long reads can span repetitive regions entirely, which makes it possible to produce genome assemblies with far fewer gaps, and in some cases none at all. This is especially valuable for tracking antibiotic resistance genes in bacteria, since those genes often sit on small circular DNA structures called plasmids that short reads often can’t fully resolve. The tradeoff is that long-read sequencing historically had higher error rates per base, though accuracy has improved rapidly.
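The idea of a read "spanning" a repeat can be made concrete with a small check: a read resolves a repeat only if it covers the whole repeat plus some unique sequence on both sides. The sketch below assumes positions are counted in bases along a reference; the function name, the 20-base anchor, and all the coordinates are made up for illustration.

```python
def spans_repeat(read_start, read_len, repeat_start, repeat_len, anchor=20):
    """Return True if the read covers the entire repeat plus at least
    `anchor` bases of unique flanking sequence on each side."""
    read_end = read_start + read_len
    repeat_end = repeat_start + repeat_len
    return (read_start <= repeat_start - anchor
            and read_end >= repeat_end + anchor)


# Hypothetical 5,000-base repeat starting at position 10,000.
print(spans_repeat(9_900, 300, 10_000, 5_000))     # short read
print(spans_repeat(5_000, 20_000, 10_000, 5_000))  # long read
```

With these numbers the 300-base read ends inside the repeat, while the 20,000-base read covers it with room to spare on both sides.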
Beyond DNA: RNA Sequencing
Your DNA is essentially the same in every cell of your body, but not every gene is active in every cell. RNA sequencing (RNA-Seq) measures which genes are actually turned on and how active they are at a given moment. Researchers convert RNA molecules back into DNA, then sequence those copies using the same high-throughput machines.
The result is a snapshot of a cell’s activity, called its transcriptome. RNA-Seq can identify every type of RNA present, quantify how much of each exists, and detect alternative forms of genes that get assembled differently in different tissues. This makes it a powerful tool for understanding why a liver cell behaves differently from a brain cell despite sharing identical DNA, or how a tumor cell’s gene activity diverges from normal tissue.
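Quantifying "how much of each exists" usually involves normalizing raw read counts, because a longer gene collects more reads than a shorter one at the same activity level. One common normalization is transcripts per million (TPM), sketched below. The gene names, read counts, and lengths are invented for the example, and real pipelines handle many details this leaves out.

```python
def tpm(read_counts, gene_lengths_kb):
    """Transcripts per million.

    Step 1: divide each gene's read count by its length in kilobases.
    Step 2: rescale so the values across all genes sum to one million.
    """
    rates = {gene: read_counts[gene] / gene_lengths_kb[gene]
             for gene in read_counts}
    total = sum(rates.values())
    return {gene: rate * 1_000_000 / total for gene, rate in rates.items()}


# Hypothetical genes: counts are reads mapped, lengths are in kilobases.
counts = {"gene_a": 100, "gene_b": 300, "gene_c": 600}
lengths = {"gene_a": 1.0, "gene_b": 3.0, "gene_c": 2.0}

for gene, value in tpm(counts, lengths).items():
    print(gene, round(value))
```

Here gene_b has three times as many reads as gene_a but is also three times as long, so the two end up with the same TPM value.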
Sequencing in Medical Diagnosis
One of the most direct applications of sequencing is diagnosing genetic diseases. Whole exome sequencing reads the roughly 1 to 2 percent of the genome that codes for proteins, which is where most known disease-causing mutations occur. It has proven especially effective for neurological conditions, seizure disorders, skeletal abnormalities, and developmental delays in children. In some cases, identifying the exact mutation changes treatment entirely. Children with a specific form of epilepsy caused by a mutation in one gene, for example, don’t respond to standard seizure medications but improve dramatically with high-dose vitamin B6.
Sequencing also plays a growing role in cancer care. Rather than requiring a surgical biopsy, doctors can now draw blood and sequence fragments of tumor DNA circulating in the bloodstream. These “liquid biopsies” are being used to monitor treatment response in lung, breast, and colorectal cancers. The sequencing identifies specific mutations in the tumor, which can guide decisions about targeted therapies. Multiple clinical trials are evaluating whether tracking these circulating DNA fragments can detect cancer recurrence earlier than traditional imaging.
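One simple quantity that can be followed over time in this kind of monitoring is the share of reads at a given position that carry the tumor mutation, often called the variant allele fraction. The sketch below is only an arithmetic illustration: the function name, time points, and read counts are all hypothetical and are not drawn from any study.

```python
def variant_allele_fraction(variant_reads, total_reads):
    """Fraction of reads covering a position that carry the mutation."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


# Hypothetical blood draws: (label, reads with mutation, total reads).
timepoints = [
    ("baseline", 120, 4000),
    ("week 6", 40, 4000),
    ("week 12", 8, 4000),
]

for label, variant, total in timepoints:
    fraction = variant_allele_fraction(variant, total)
    print(f"{label}: {fraction:.2%}")
```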
Pharmacogenomics
Your genetic makeup also affects how you metabolize medications. The FDA maintains a table of gene-drug associations where a patient’s genetic variant is likely to alter how a drug works in their body, affecting both effectiveness and the risk of side effects. Sequencing these specific genes before prescribing certain medications allows doctors to choose the right drug at the right dose from the start, rather than relying on trial and error.
Sequencing the Microbiome
Not all sequencing targets human DNA. The trillions of bacteria living in your gut, on your skin, and throughout your body collectively form your microbiome, and sequencing is the primary tool for studying it. Two approaches dominate this field.
The faster, cheaper method sequences a single bacterial gene (the 16S ribosomal RNA gene) that varies between species. It works well with relatively few reads, around 18,000 to 20,000 per sample, and gives a good overview of which bacterial groups are present. The more comprehensive approach, called shotgun metagenomics, breaks up all the DNA in a sample and sequences everything. This captures not just which organisms are present but what functional genes they carry, revealing what the microbial community is actually capable of doing. Shotgun sequencing detects significantly more species than 16S, particularly the rarer ones. Research has shown that these less abundant organisms, visible only through shotgun sequencing, are biologically meaningful and can distinguish between healthy and diseased states just as well as the more common species.
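Whichever approach is used, the raw output is a count of reads assigned to each organism, which is then usually expressed as a share of the total. The sketch below converts counts to relative abundances and lists the rarer organisms. The taxon labels, the counts, and the 1 percent cutoff are assumptions made for the example.

```python
def relative_abundance(read_counts):
    """Convert per-taxon read counts into fractions of the total."""
    total = sum(read_counts.values())
    return {taxon: count / total for taxon, count in read_counts.items()}


def rare_taxa(read_counts, threshold=0.01):
    """Names of taxa whose share of reads is below `threshold`."""
    shares = relative_abundance(read_counts)
    return sorted(taxon for taxon, share in shares.items() if share < threshold)


# Hypothetical sample with 20,000 assigned reads.
sample = {"genus_A": 9000, "genus_B": 6000, "genus_C": 4850, "genus_D": 150}

print(relative_abundance(sample)["genus_A"])  # 0.45
print(rare_taxa(sample))                      # ['genus_D']
```

With only about 20,000 reads, an organism at 0.75 percent is represented by 150 reads here; much rarer organisms may receive only a handful or none, which is consistent with deeper shotgun sequencing detecting more of them.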
The Falling Cost of Sequencing
The Human Genome Project, completed in 2003, cost roughly $2.7 billion. By mid-2015, a high-quality draft of a human genome cost just over $4,000, and by late that year it had dropped below $1,500. Today, several companies offer whole genome sequencing for well under $1,000. That dramatic cost reduction, outpacing even Moore’s Law in computing, is what moved sequencing from elite research labs into hospitals, clinics, and even direct-to-consumer services.
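The comparison with Moore’s Law can be checked with some quick arithmetic. The sketch below uses the figures quoted above ($2.7 billion in 2003, $1,500 in late 2015) and assumes the common reading of Moore’s Law as a doubling every two years. It is a rough comparison only, since the Human Genome Project figure covers a whole research program rather than a single sequencing run.

```python
import math


def fold_reduction(start_cost, end_cost):
    """How many times cheaper the end cost is than the start cost."""
    return start_cost / end_cost


def halving_time_years(start_cost, end_cost, years):
    """Average time for the cost to halve, assuming a steady decline."""
    halvings = math.log2(fold_reduction(start_cost, end_cost))
    return years / halvings


def moore_fold(years, doubling_period=2.0):
    """Improvement expected from one doubling every `doubling_period` years."""
    return 2 ** (years / doubling_period)


years = 2015 - 2003
print(f"Sequencing: {fold_reduction(2.7e9, 1500):,.0f}-fold cheaper")
print(f"Average halving time: {halving_time_years(2.7e9, 1500, years) * 12:.1f} months")
print(f"Moore's Law over {years} years: {moore_fold(years):,.0f}-fold")
```

Under these assumptions, sequencing cost fell about 1.8 million-fold in twelve years, against a 64-fold improvement from a two-year doubling over the same period.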
Privacy and Genetic Data
As sequencing becomes routine, the question of who has access to your genetic data grows more urgent. Unlike a password or credit card number, your genome is permanent. It cannot be changed if it’s exposed. It’s also unique enough to re-identify you even if your name and other details have been stripped away, especially when combined with publicly available ancestry databases or genetic information shared by relatives.
Exposed genetic data can reveal disease risk, physical traits, mental health predispositions, and even hidden family relationships. The Genetic Information Nondiscrimination Act (GINA) prohibits health insurers and employers from using genetic information against you, but the law doesn’t cover life insurance, disability insurance, or long-term care insurance. The National Institute of Standards and Technology has developed frameworks specifically for genomic data security, recommending encryption of genetic data both when it’s stored and when it’s transmitted, along with continuous monitoring of who accesses it. If you’re considering any form of genetic testing, it’s worth understanding the privacy policies of the company or institution handling your data before you provide a sample.

