A codon is a sequence of three nucleotides (the individual “letters” of DNA or RNA) that represents one specific instruction during protein building. Each codon either tells the cell to add a particular amino acid to a growing protein or signals that the protein is complete. Your entire genetic code works through these three-letter combinations, and there are exactly 64 of them.
How Three Letters Spell Out a Protein
DNA and RNA use four chemical bases as their alphabet: adenine (A), cytosine (C), guanine (G), and either thymine (T) in DNA or uracil (U) in RNA. A codon is any combination of three of these bases read in order. Since each of the three positions can be filled by any of four bases, the math is straightforward: 4 × 4 × 4 = 64 possible codons.
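That counting argument is easy to verify directly. Here is a quick sketch in Python, using nothing beyond the RNA alphabet and the standard library:

```python
from itertools import product

# The four RNA bases: adenine, cytosine, guanine, uracil
BASES = "ACGU"

# A codon is any ordered triple of bases, so there are 4 * 4 * 4 = 64
codons = ["".join(triple) for triple in product(BASES, repeat=3)]

print(len(codons))  # 64
print(codons[:4])   # ['AAA', 'AAC', 'AAG', 'AAU']
```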
Of those 64 codons, 61 code for amino acids, the building blocks of proteins. The remaining three (UAA, UAG, and UGA) are stop codons. They don’t code for any amino acid. Instead, they act like a period at the end of a sentence, telling the cell’s protein-making machinery that the job is done. One specific codon, AUG, doubles as both the code for the amino acid methionine and the universal start signal that kicks off protein assembly.
Why 64 Codons Make Only 20 Amino Acids
With 61 codons coding for just 20 amino acids, simple arithmetic tells you that most amino acids are represented by more than one codon. This overlap is called redundancy (or degeneracy, in genetics terminology). For example, the amino acids leucine and serine each have six different codons. Only tryptophan and methionine are encoded by a single codon each.
This built-in redundancy acts as a buffer against mutations. If a single base in a codon changes, the new codon may still code for the same amino acid, producing no effect on the final protein. These are called silent mutations. When a base change does swap in a different amino acid, that’s a missense mutation, which may or may not affect how the protein works. The most damaging single-letter changes are nonsense mutations, where the new codon becomes one of the three stop signals, cutting the protein short. Even here the exposure is limited: only 21 of the 64 possible codons are a single base change away from a stop signal, and three of those are the stop codons themselves, which can mutate into one another.
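That count can be checked by brute force. The sketch below enumerates all 64 codons and asks, for each one, whether some single-base substitution produces a stop codon:

```python
from itertools import product

BASES = "ACGU"
STOP_CODONS = {"UAA", "UAG", "UGA"}

def one_change_from_stop(codon):
    """True if any single-base substitution turns this codon into a stop."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if mutant in STOP_CODONS:
                    return True
    return False

codons = ["".join(t) for t in product(BASES, repeat=3)]
vulnerable = [c for c in codons if one_change_from_stop(c)]

# 21 in total: 18 sense codons plus the three stops themselves,
# which can mutate into one another.
print(len(vulnerable))  # 21
```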
From Gene to Protein: Codons in Action
Protein synthesis happens in two major stages. First, a gene’s DNA sequence is copied into a messenger RNA (mRNA) strand, a process called transcription. Then, the cell’s ribosomes read that mRNA three letters at a time, each codon triggering the addition of one amino acid to the growing protein chain. This second stage is translation.
The key player that matches each codon to its amino acid is transfer RNA (tRNA). Every tRNA molecule carries a specific amino acid on one end and a three-letter anticodon on the other. When a tRNA’s anticodon pairs up with a matching codon on the mRNA, it delivers its amino acid to the chain. The ribosome checks this match carefully: it physically senses the geometry of the base pairing between codon and anticodon at the first two positions, accepting only correct pairings. At the third position, the rules are looser. This “wobble” position tolerates some mismatches, which is part of why multiple codons can code for the same amino acid.
The ribosome doesn’t just passively watch this happen. It actively verifies correct pairing through a two-step proofreading system that uses the energy difference between correct and incorrect matches twice, dramatically reducing errors.
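The two stages can be sketched as a toy script. Everything here is illustrative: the codon table is a hypothetical excerpt (the real table maps all 64 codons), and real translation involves the tRNA delivery and proofreading machinery just described:

```python
# Hypothetical excerpt of the standard codon table, for illustration
# only; the full table has 64 entries.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def transcribe(dna):
    """Transcription: copy the DNA coding sequence into mRNA (T becomes U)."""
    return dna.replace("T", "U")

def translate(mrna):
    """Translation: read the mRNA three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return "-".join(protein)

mrna = transcribe("ATGTTTAAATAA")  # starts with AUG, ends with a stop
print(translate(mrna))             # Met-Phe-Lys
```

Note how translation begins at AUG (methionine) and halts at the stop codon without adding anything for it, exactly as the period-at-the-end-of-a-sentence analogy suggests.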
Not All Codons Are Used Equally
Even though multiple codons can specify the same amino acid, cells don’t use all of them at the same rate. This phenomenon, called codon usage bias, appears in bacteria, plants, and animals. Cells tend to favor certain “optimal” codons that correspond to whichever tRNA molecules are most abundant in the cell.
The practical effect is speed. When a ribosome encounters an optimal codon, there’s a readily available tRNA nearby, so translation moves quickly. When it hits a rare codon, it has to wait for the right tRNA to show up, slowing down. Stretches of rare codons can create a traffic jam of ribosomes, which may even block new proteins from being started on that same mRNA. Highly expressed genes, the ones a cell relies on heavily, tend to use optimal codons almost exclusively. This bias influences not just how much protein gets made but also how the protein folds into its final three-dimensional shape, since the pace of translation affects how the chain coils up as it emerges from the ribosome.
This has real-world applications in biotechnology. When scientists want to produce large amounts of a human protein in bacteria, they often redesign the gene’s codon sequence to match the host organism’s preferred codons, boosting production dramatically.
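Measuring a sequence’s codon usage, the first step in that kind of redesign, is a simple counting exercise. A minimal sketch, using a made-up toy sequence:

```python
from collections import Counter

def codon_usage(mrna):
    """Tally how often each codon appears in one reading frame."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]
    return Counter(codons)

# UUU and UUC are synonymous codons for phenylalanine; this toy
# sequence is biased three-to-one toward UUU.
usage = codon_usage("UUUUUUUUUUUC")
print(usage)  # Counter({'UUU': 3, 'UUC': 1})
```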
The Genetic Code Isn’t Quite Universal
The standard codon table you’ll find in any biology textbook applies to the vast majority of life on Earth. But there are exceptions, particularly inside mitochondria, the energy-producing structures within cells that carry their own small genome.
The first deviations from the standard code were discovered in human mitochondria in 1979, and many more have been found since. In the mitochondria of sea stars and sea urchins, for instance, the codons AGA and AGG code for the amino acid serine rather than arginine, as they would under the standard code. Invertebrate mitochondria use up to five alternative start codons beyond the standard AUG. Even the bacterium E. coli has been found to use as many as 47 possible start codons. These variations are relatively minor, though. The core logic of codons, three-letter sequences read in order, is the same everywhere.
Frameshift Mutations: When the Reading Shifts
Because codons are read in consecutive groups of three with no gaps or punctuation between them, the starting point matters enormously. If a single base gets inserted or deleted from the sequence, every codon downstream shifts by one position. This is a frameshift mutation, and it’s typically catastrophic for the protein. The entire amino acid sequence after the insertion or deletion changes, and the ribosome usually hits a premature stop codon in the new, garbled reading frame, producing a truncated, nonfunctional protein.
This is different from a simple substitution, where only one codon (and at most one amino acid) is affected. Frameshifts illustrate why the triplet nature of codons is so fundamental: the entire system depends on reading the right three letters at a time, in the right order, from the right starting point.
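The reading-frame logic can be demonstrated by re-chunking the same sequence after a one-base insertion (the sequences here are arbitrary examples):

```python
def codons_of(mrna):
    """Chunk an mRNA string into consecutive three-base codons."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

original = "AUGGCUUUAGAU"                     # arbitrary example sequence
mutated = original[:3] + "G" + original[3:]   # insert one G after codon 1

print(codons_of(original))  # ['AUG', 'GCU', 'UUA', 'GAU']
print(codons_of(mutated))   # ['AUG', 'GGC', 'UUU', 'AGA']
```

Every codon downstream of the insertion changes, and the trailing lone base simply falls off because it no longer forms a complete codon.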
How the First Codon Was Cracked
The relationship between codons and amino acids was a mystery until 1961, when Marshall Nirenberg and Heinrich Matthaei at the National Institutes of Health ran an elegant experiment. They created a synthetic RNA strand made entirely of uracil (poly-U), then added it to a test tube containing the protein-building machinery extracted from E. coli bacteria. The result: the system churned out a chain made entirely of the amino acid phenylalanine. UUU was the first codon ever decoded. Over the following years, researchers used similar techniques to assign amino acids to all 64 codons, completing the genetic code by the mid-1960s.