How Scientists Made Molecular Biology’s Biggest Discoveries

Scientists have driven every major leap in molecular biology, from proving that DNA carries genetic information to editing individual genes with precision. These contributions span more than 80 years and have reshaped medicine, agriculture, and our basic understanding of life. Here’s how specific discoveries built on each other to create the molecular toolkit we have today.

Proving DNA Is the Blueprint of Life

Before the 1940s, most biochemists believed proteins carried genetic information. DNA seemed too chemically simple to do anything important. That assumption was overturned by Oswald Avery, Colin MacLeod, and Maclyn McCarty, who in 1944 identified DNA as what they called the “transforming principle.”

Their work built on a puzzling observation by Frederick Griffith in the 1920s. Griffith had shown that a harmless strain of the bacterium that causes pneumonia could become deadly when mixed with heat-killed cells from a virulent strain. Something in the dead cells was transforming the living ones, but nobody knew what. Avery’s team systematically destroyed each candidate molecule to see which one mattered. Enzymes that broke down proteins had no effect. Enzymes that digested fats did nothing either. Enzymes targeting RNA left the transforming substance intact. Only when they targeted DNA did the transformation stop. The substance responsible was rich in nucleic acids and had a high molecular weight. They had identified DNA as the molecule of heredity, a finding that set the stage for Watson and Crick’s structural model less than a decade later.

Copying DNA Billions of Times Over

Knowing that DNA carries genetic instructions was one thing. Working with it in the lab was another, because researchers could only study the tiny amounts of DNA they could physically extract from cells. The polymerase chain reaction, or PCR, solved that problem by making it possible to copy a specific stretch of DNA millions or billions of times from a single starting sample.

The underlying concept appeared as early as 1969, when Kjell Kleppe, working in H. Gobind Khorana’s laboratory, described a theoretical cycle: separate the two strands of a DNA molecule using heat, let short primer sequences attach to each strand, then use a DNA-building enzyme to complete new copies. Repeat the cycle, and the number of copies doubles each time. It took until 1985 for the technique to be refined and published in the journal Science, after which it transformed virtually every branch of biology. PCR made forensic DNA analysis practical, allowed doctors to detect infections from tiny samples, and became the backbone of the sequencing technologies that would eventually decode the human genome.
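The doubling arithmetic behind PCR can be made concrete with a short sketch. It assumes an idealized reaction in which every cycle exactly doubles the number of copies, which real reactions only approximate. The function names are illustrative and do not come from any PCR software.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Copies present after a number of ideal PCR cycles (perfect doubling)."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return starting_copies * 2 ** cycles


def cycles_to_reach(target_copies: int, starting_copies: int = 1) -> int:
    """Smallest number of ideal cycles that yields at least target_copies."""
    if starting_copies < 1:
        raise ValueError("need at least one starting copy")
    cycles = 0
    copies = starting_copies
    # Assumption: each cycle doubles every existing copy.
    while copies < target_copies:
        copies *= 2
        cycles += 1
    return cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(f"{n} cycles: {copies_after(n):,} copies")
```

Under these idealized assumptions, a single starting molecule passes one million copies after 20 cycles and one billion after 30, which is why a few dozen cycles are enough to turn a trace sample into a workable amount of DNA.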

Reading the Entire Human Genome

The Human Genome Project, launched in 1990, set out to read all three billion base pairs of human DNA. It was initially projected to cost $3 billion over 15 years. Scientists published a draft in 2001 and declared the project complete in 2003, but roughly 8% of the genome remained unreadable because the technology at the time couldn’t handle highly repetitive stretches of DNA.

That gap persisted for two decades until a consortium called the Telomere-to-Telomere (T2T) project published the first truly complete, gapless human genome sequence in 2022. The new reference genome, called T2T-CHM13, added nearly 200 million base pairs of previously unknown DNA sequences, including 99 genes likely to code for proteins and nearly 2,000 candidate genes that still need further study. Filling in those missing regions gave scientists their first clear look at parts of chromosomes involved in cell division and other fundamental processes.

Sequencing costs have fallen as dramatically as the technology has improved. What once required a $3 billion international effort can now be done for a few hundred dollars per genome, putting sequencing within reach of routine clinical use.

Seeing Molecules at Atomic Detail

Understanding what a molecule does often depends on seeing its three-dimensional shape. For decades, X-ray crystallography was the primary tool, but it required scientists to coax molecules into rigid crystals, something that many large, flexible biological machines simply refused to do.

Cryo-electron microscopy, or cryo-EM, changed that. The technique flash-freezes molecules in a thin layer of ice and images them with an electron beam. Advances in detectors and image-processing software triggered what researchers call a “resolution revolution,” allowing scientists to determine atomic or near-atomic resolution structures of very large, flexible, and often short-lived molecular complexes that had resisted crystallization for decades. This was especially transformative for studying the machinery that reads and copies genes, because those complexes constantly shift shape as they work.

More recently, computational approaches have matched and even extended what microscopy can do. DeepMind’s AlphaFold system, an artificial intelligence program, demonstrated in 2020 that it could predict the three-dimensional structure of a protein from its amino acid sequence alone with accuracy rivaling experimental methods. The program was tested against 87 protein structures in a blind competition and outperformed all 146 competing entries. DeepMind then released predicted structures for over 200 million proteins, effectively giving researchers a structural catalog for nearly every known protein on Earth.

Editing Genes With Precision

Perhaps no molecular tool has generated more excitement than CRISPR-Cas9, a gene-editing system that scientists adapted from the natural immune defenses of bacteria. In nature, bacteria use CRISPR to remember and destroy the DNA of viruses that have attacked them before. A short RNA molecule, made from the bacterium’s own CRISPR sequences, guides a cutting enzyme called Cas9 to matching viral DNA and slices it apart.

The key insight that made CRISPR practical for laboratory use came when scientists demonstrated that the system targets DNA directly, not the intermediate messenger molecules cells use to read genes. Researchers then simplified the system by combining two naturally separate RNA molecules into a single “guide RNA,” making it far easier to program the system to cut any desired DNA sequence. To make the tool work in human cells, scientists had to optimize the genetic code of the Cas9 gene, add signals that direct the enzyme into the cell nucleus, and lengthen the guide RNA to improve cutting efficiency. A DNA template can also be supplied so the cell patches the cut with a specific new sequence rather than just scrambling the original.
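The idea of programming a guide to find a chosen DNA sequence can be illustrated with a toy search. This is a minimal sketch under simplifying assumptions, not how any real guide-design tool works: it looks only for exact matches of the guide's target sequence on either strand, and it ignores other requirements of real Cas9 targeting, such as the short adjacent motif the enzyme also needs and its tolerance for mismatches. The function names and the example sequences are invented for illustration.

```python
# Translation table for complementing DNA bases (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read in its own 5'-to-3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(dna: str, guide_target: str) -> list[tuple[int, str]]:
    """Return (position, strand) for every exact match of guide_target.

    Positions are 0-based offsets on the given (top) strand.
    Strand "+" means the match is on the given strand; "-" means it is
    on the opposite strand.
    """
    dna = dna.upper()
    guide_target = guide_target.upper()
    sites = []
    patterns = (("+", guide_target), ("-", reverse_complement(guide_target)))
    for strand, pattern in patterns:
        start = dna.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = dna.find(pattern, start + 1)  # allow overlapping matches
    return sorted(sites)


if __name__ == "__main__":
    guide = "ACGTTGCAAGCTTAGGCATC"  # hypothetical 20-base target
    dna = "TTTT" + guide + "CCCC" + reverse_complement(guide) + "AAAA"
    print(find_target_sites(dna, guide))
```

Changing the 20-base string is all it takes to point the search at a different site, which mirrors why a single programmable guide RNA made the system so much easier to retarget.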

This technology now underpins approved therapies for sickle cell disease and is being tested against dozens of other genetic conditions.

Building Life From Scratch

While most molecular biology involves studying or modifying existing organisms, one team pushed further by asking how few genes a living cell actually needs. In 2016, researchers at the J. Craig Venter Institute created JCVI-syn3.0, a synthetic minimal cell containing just 473 genes packed into a 531,000-base-pair genome. They built it through four rounds of systematic gene removal, stripping 428 genes from an earlier synthetic genome and testing whether the cell could still grow and divide after each round.

The result was a working cell with the smallest known genome of any self-replicating organism, yet roughly a third of its 473 genes had no known function. Scientists knew the cell needed those genes to survive but couldn’t explain why. That humbling finding highlighted how much remains unknown even at the most basic level of molecular biology.
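The figures quoted for the minimal cell can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text. The derived values are back-of-the-envelope estimates, and the average spacing per gene is a rough upper bound on gene length because it counts any DNA between genes as well.

```python
FINAL_GENES = 473       # genes in JCVI-syn3.0, from the text
GENES_REMOVED = 428     # genes stripped over four rounds, from the text
GENOME_BP = 531_000     # genome size in base pairs, from the text

# Gene count of the earlier synthetic genome implied by the two figures above.
starting_genes = FINAL_GENES + GENES_REMOVED

# Share of the starting gene set that turned out to be dispensable.
fraction_removed = GENES_REMOVED / starting_genes

# Average stretch of genome per remaining gene (includes DNA between genes).
avg_bp_per_gene = GENOME_BP / FINAL_GENES

# "Roughly a third" of the remaining genes had no known function.
approx_unknown_genes = round(FINAL_GENES / 3)

if __name__ == "__main__":
    print(f"starting genes: {starting_genes}")
    print(f"fraction removed: {fraction_removed:.1%}")
    print(f"average bp per gene: {avg_bp_per_gene:.0f}")
    print(f"genes of unknown function (approx.): {approx_unknown_genes}")
```

By this reckoning, nearly half of the earlier genome's roughly 900 genes could be removed, while on the order of 150 of the genes that had to stay remained unexplained.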

Turning Molecular Findings Into Medicine

Molecular discoveries have directly changed how diseases are treated, especially cancer. Traditional chemotherapy attacks all rapidly dividing cells. Molecularly targeted therapies, by contrast, zero in on the specific proteins that drive a particular tumor’s growth.

One landmark example arrived in 2001, when the drug imatinib was approved for chronic myeloid leukemia, a cancer driven by an abnormal fusion of two genes. That fusion creates a hyperactive signaling protein that forces white blood cells to multiply out of control. The drug blocks that protein’s activity, and it transformed the disease from a near-certain death sentence into a manageable chronic condition for most patients. Because the same drug also blocks a few structurally similar proteins, it proved effective against certain gastrointestinal tumors as well.

Since then, the approach has expanded rapidly. Drugs now target specific mutations in lung cancer, breast cancer, leukemia, and bile duct cancer, among others. Some target tumors carrying a particular mutation in a growth-signaling gene. Others exploit weaknesses in tumors that have lost the ability to repair their own DNA. In each case, the treatment only works because scientists first identified the exact molecular defect driving that cancer, then designed a molecule to block it. The pattern repeats across medicine: a molecular finding in the lab eventually becomes a diagnostic test or a therapy in the clinic, often decades later, but with transformative results.