What Does UAG Code For? The Amber Stop Codon

UAG does not code for an amino acid. It is one of three stop codons in the genetic code, signaling the ribosome to halt protein synthesis and release the finished protein. The other two stop codons are UAA and UGA. Together, these three RNA triplets mark the end of protein-coding sequences in the standard genetic code.

UAG is also known as the “amber” stop codon, a nickname that has stuck in molecular biology for decades. While its primary job is termination, UAG has a few fascinating exceptions and has become a key tool in modern genetic engineering.

How UAG Stops Protein Synthesis

During translation, the ribosome reads messenger RNA three letters at a time. Most three-letter combinations (codons) correspond to a specific amino acid, which is added to the growing protein chain. UAG is different. When the ribosome encounters UAG, no transfer RNA (tRNA) carrying an amino acid pairs with it. Instead, a protein called a release factor binds to the ribosome and triggers the release of the newly built protein.
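The codon-by-codon reading described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of the ribosome: the function name translate and the example sequence are made up for this sketch, and it assumes the input is already in the correct reading frame and written in RNA letters.

```python
from itertools import product

# Standard genetic code, built from the conventional UCAG ordering.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an in-frame mRNA string, halting at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon: no amino acid is added, the chain is released
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy message: AUG (Met), GCU (Ala), UAG (stop), GCU (never read).
peptide = translate("AUGGCUUAGGCU")
print(peptide)
```

The final GCU codon is never translated, because the loop ends as soon as UAG is reached.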

In bacteria, the release factor that recognizes UAG is called RF1, which also recognizes the UAA stop codon. The other bacterial release factor, RF2, handles UAA and UGA. In human cells and other eukaryotes, a single release factor (eRF1) recognizes all three stop codons. Crystal structures of RF1 bound to the ribosome at a UAG codon have shown in detail how the release factor reads the three-letter sequence and distinguishes it from codons that encode amino acids.
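The division of labor among release factors is easy to lose track of, so here it is as a small lookup table. The dictionary layout and the helper name factors_for are choices made for this sketch; the codon assignments are the ones given above.

```python
# Stop codons read by each release factor, as described in the text.
BACTERIAL = {"RF1": {"UAG", "UAA"}, "RF2": {"UAA", "UGA"}}
EUKARYOTIC = {"eRF1": {"UAA", "UAG", "UGA"}}


def factors_for(codon, system):
    """Return the names of the release factors in `system` that read `codon`."""
    return sorted(name for name, codons in system.items() if codon in codons)


print(factors_for("UAG", BACTERIAL))   # only RF1 reads UAG
print(factors_for("UAA", BACTERIAL))   # both bacterial factors read UAA
print(factors_for("UAG", EUKARYOTIC))  # eRF1 reads every stop codon
```

Note that UAG is the only stop codon in bacteria that depends on RF1 alone.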

Why It’s Called the Amber Codon

Each stop codon has a color-themed nickname: UAG is “amber,” UAA is “ochre,” and UGA is “opal.” These names trace back to early genetics research in the 1960s, when scientists studying bacteriophages (viruses that infect bacteria) identified mutations that created premature stop codons. The “amber” name reportedly comes from one of the researchers involved, Harris Bernstein, whose surname is the German word for “amber.” The other two stop codons were then named to match: ochre is an earth pigment and opal a gemstone.

How Common Is UAG in Human Genes?

The three stop codons are not used equally across the genome. UAA tends to be the most common stop codon in highly expressed genes and in genes with AT-rich coding sequences. UGA is heavily favored in GC-rich regions of the genome. UAG falls in between: its frequency also grows as GC content rises, but less steeply than that of UGA. The reasons for these preferences involve a mix of mutational patterns and natural selection acting on how efficiently translation terminates, though the full picture is still being worked out.
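Stop codon usage of this kind is typically measured by tallying the final codon of each coding sequence. A minimal sketch follows; the function name stop_codon_usage and the four toy sequences are invented for illustration and are not real human genes.

```python
from collections import Counter

STOP_CODONS = ("UAA", "UAG", "UGA")


def stop_codon_usage(coding_sequences):
    """Return the fraction of sequences that end in each stop codon."""
    counts = Counter(cds[-3:] for cds in coding_sequences)
    total = sum(counts[stop] for stop in STOP_CODONS)
    if total == 0:
        return {}
    return {stop: counts[stop] / total for stop in STOP_CODONS}


# Made-up coding sequences, each ending in a stop codon.
toy_genes = ["AUGGCUUAA", "AUGCCCUGA", "AUGAAAUGA", "AUGGGGUAG"]
usage = stop_codon_usage(toy_genes)
print(usage)
```

Running the same tally separately on AT-rich and GC-rich genes is one way the preferences described above would show up in real data.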

What Happens When UAG Appears Too Early

If a mutation converts a normal codon into UAG somewhere in the middle of a gene, the result is a premature stop signal. The ribosome halts early, producing a shortened, usually nonfunctional protein. Mutations of this kind, which can create any of the three stop codons, are called nonsense mutations, and they are responsible for severe forms of many genetic diseases.

Cystic fibrosis, hemophilia A and B, Fabry disease, Usher syndrome, and certain cancers involving the p53 tumor suppressor gene can all be caused by nonsense mutations. The core problem is the same in each case: either the truncated protein doesn’t work at all, or the faulty messenger RNA is flagged for destruction by a cellular quality-control system (nonsense-mediated decay) before much protein can be made. The loss of a functional protein is what drives the disease.
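The premature-stop scenario described in this section can be made concrete with a short check that scans a coding sequence for an in-frame stop codon before its final codon. The function name first_premature_stop and both sequences are hypothetical, and the sketch assumes the sequence length is a multiple of three and starts in frame.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_premature_stop(cds):
    """Return the 0-based index of the first in-frame stop codon that
    occurs before the final codon, or None if there is none."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    for index, codon in enumerate(codons[:-1]):  # the last codon is the normal stop
        if codon in STOP_CODONS:
            return index
    return None


normal = "AUGCAGGCUUAA"  # Met, Gln, Ala, stop
mutant = "AUGUAGGCUUAA"  # a single C-to-U change turns CAG (Gln) into UAG

print(first_premature_stop(normal))
print(first_premature_stop(mutant))
```

In the mutant, translation would end after the first amino acid, which is the truncation the text describes.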

The Exception: When UAG Codes for Pyrrolysine

In a small number of organisms, UAG does encode an amino acid. Certain archaea in the family Methanosarcinaceae use UAG to insert pyrrolysine, sometimes called the 22nd amino acid, into specific enzymes. These organisms have a dedicated transfer RNA that recognizes UAG and carries pyrrolysine, along with a specialized enzyme that attaches pyrrolysine to that tRNA.

Pyrrolysine is not a universal part of the genetic code. It appears to be a late evolutionary addition, invented by one archaeal lineage to meet specific metabolic needs, particularly in enzymes called methyltransferases. Outside of these organisms (and a few bacteria that acquired the same machinery through gene transfer), UAG functions strictly as a stop signal.
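One way to picture the reassignment is as a one-entry change to the codon table. The sketch below treats every UAG as pyrrolysine (written with its one-letter symbol, O), which is a deliberate simplification: it ignores how these organisms decide between insertion and termination at a given UAG. The table names, the function name, and the example message are all invented for illustration.

```python
from itertools import product

# Standard genetic code in the conventional UCAG ordering; "*" marks a stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Variant table for this sketch: UAG reassigned to pyrrolysine ("O").
PYL_TABLE = {**STANDARD, "UAG": "O"}


def translate(mrna, table):
    """Translate an in-frame mRNA string with the given codon table."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


message = "AUGGCUUAGGCUUAA"  # hypothetical: Met, Ala, UAG, Ala, UAA
print(translate(message, STANDARD))   # stops at UAG
print(translate(message, PYL_TABLE))  # reads through UAG, stops at UAA
```

The same message yields a shorter or longer product depending only on how the table treats UAG.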

UAG as a Tool in Synthetic Biology

Because UAG is the least commonly used stop codon in many organisms, synthetic biologists have seized on it as a way to expand the genetic code. The basic idea: replace all natural UAG stop codons in an organism’s genome with the synonymous UAA, then repurpose UAG to encode entirely new, non-natural amino acids.

In 2013, researchers accomplished exactly this in E. coli, swapping all 321 known UAG stop codons in the bacterium’s genome to UAA. They then deleted the gene for release factor 1, removing the cell’s ability to treat UAG as a stop signal. With that done, UAG became an open coding channel. By introducing an engineered tRNA and a matching aminoacyl-tRNA synthetase (the enzyme that loads an amino acid onto its tRNA), scientists could insert a non-natural amino acid wherever they placed a UAG codon in a gene of interest.
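The recoding step can be illustrated on a toy scale. The sketch below swaps a terminal UAG for UAA in each gene of a small dictionary; it is not the method used in the 2013 work, which edited genomic DNA and had to contend with complications such as overlapping genes. The function name recode_amber_stops and the gene names and sequences are hypothetical.

```python
def recode_amber_stops(genes):
    """Replace a terminal UAG with UAA in each coding sequence.

    `genes` maps a gene name to an in-frame coding sequence in RNA letters.
    Returns the recoded dictionary and the number of genes changed.
    """
    recoded = {}
    changed = 0
    for name, cds in genes.items():
        if len(cds) % 3 == 0 and cds.endswith("UAG"):
            cds = cds[:-3] + "UAA"  # synonymous swap: both codons mean "stop"
            changed += 1
        recoded[name] = cds
    return recoded, changed


# Made-up genes for illustration only.
toy_genome = {
    "geneA": "AUGGCUUAG",
    "geneB": "AUGCCCUGA",
    "geneC": "AUGAAAUAA",
}
recoded, n_changed = recode_amber_stops(toy_genome)
print(recoded["geneA"], n_changed)
```

After this pass no gene in the toy set ends in UAG, so the codon is free to be given a new meaning.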

This technique, called genetic code expansion, allows researchers to build proteins with chemical properties not found in nature. Applications range from creating proteins with built-in fluorescent labels to designing more stable therapeutic proteins. One challenge is off-target suppression: if any native UAG codons remain, the engineered tRNA reads through those as well, which can be toxic to the cell. The fully recoded organism sidesteps this problem by ensuring UAG only appears where the researcher intentionally places it.

Engineered ribosomes have also been developed with mutations in their small subunit that favor suppression of UAG, further improving how reliably the system incorporates the desired non-natural amino acid.