Transcription is the first major step in protein synthesis, where a segment of DNA is copied into a messenger RNA (mRNA) molecule. This mRNA then carries the genetic instructions from DNA to the cell’s protein-building machinery. Because DNA and RNA are chemically similar, the DNA strand acts as a direct template for building an RNA copy through complementary base-pairing.
The entire process of making a protein from a gene happens in two stages: transcription (DNA to mRNA) and translation (mRNA to protein). Transcription is where the information transfer begins.
Where Transcription Happens
In human and other animal cells, as in all eukaryotes, transcription takes place inside the nucleus, where the DNA is stored. The finished mRNA must then travel out of the nucleus to the cytoplasm, where ribosomes translate it into protein. This physical separation means transcription and translation happen as distinct, sequential events in eukaryotic cells.
Bacteria work differently. They have no nucleus, so transcription and translation both occur in the cytoplasm. Ribosomes can latch onto an mRNA strand and start building protein while the mRNA is still being transcribed from DNA. This simultaneous processing is one reason bacteria can respond so quickly to environmental changes.
How Transcription Starts
Transcription begins when the cell’s molecular machinery identifies a specific region of DNA called a promoter. In many genes, the promoter contains a short sequence known as the TATA box (a short, A/T-rich DNA pattern built around the letters TATA) that serves as a landing pad. A protein called TATA-binding protein recognizes this sequence and helps recruit RNA polymerase, the enzyme responsible for building the RNA strand. Several additional helper proteins, called general transcription factors, assemble together with RNA polymerase at the promoter to form what’s known as the preinitiation complex.
Not all genes have a TATA box. Many promoters rely on alternative short sequences that are recognized by different parts of the same recruitment machinery, ensuring that even TATA-less genes can still be transcribed.
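To make the idea of a promoter "landing pad" concrete, here is a minimal sketch of scanning a DNA string for a TATA-like motif. The pattern used (TATA, then A or T, then A, then A or T) is a simplified assumption for illustration; real consensus sequences vary by source and organism, and the promoter sequence and function name are hypothetical.

```python
import re

# Assumed, simplified TATA-like motif: TATA, then A or T, then A, then A or T.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")

def find_tata_boxes(dna):
    """Return 0-based start positions of TATA-like motifs in a DNA string."""
    return [match.start() for match in TATA_PATTERN.finditer(dna.upper())]

promoter = "GGCCTATAAAAGGCTGACGT"   # hypothetical promoter fragment
print(find_tata_boxes(promoter))    # [4]
print(find_tata_boxes("GGGCCC"))    # [] (a TATA-less sequence)
```

A TATA-less promoter simply returns an empty list here, which mirrors the point that such genes depend on other recognition sequences.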
The Enzyme That Builds mRNA
RNA polymerase is the central enzyme of transcription, but eukaryotic cells actually have three types, each with a different job. RNA polymerase II is the one that transcribes protein-coding genes into mRNA, making it the version most relevant to protein synthesis. RNA polymerase I produces the large ribosomal RNAs that form part of the ribosome’s structure, while RNA polymerase III makes transfer RNAs (the molecules that carry amino acids during translation) and a smaller ribosomal RNA.
Elongation: Building the RNA Strand
Once RNA polymerase is positioned on the DNA, it unwinds a small section of the double helix and begins reading one of the two DNA strands, called the template strand. It moves along this strand in the 3′ to 5′ direction, and for each DNA nucleotide it reads, it adds a complementary RNA nucleotide to the growing mRNA chain. The new RNA strand grows in the 5′ to 3′ direction, one nucleotide at a time.
The base-pairing rules are straightforward: DNA’s cytosine pairs with RNA’s guanine, guanine pairs with cytosine, thymine pairs with adenine, and adenine pairs with uracil (RNA uses uracil instead of DNA’s thymine). The result is an RNA strand that carries the same information as the non-template DNA strand, just written in RNA’s chemical alphabet.
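The pairing rules above can be sketched in a few lines of Python. The sequences and function names are hypothetical; the template strand is written 3′ to 5′ so that the resulting mRNA reads 5′ to 3′ from left to right.

```python
# DNA template base -> RNA base added by the polymerase
PAIRING = {"C": "G", "G": "C", "T": "A", "A": "U"}

def transcribe(template_3_to_5):
    """Build the mRNA (5'->3') from a template strand written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5.upper())

def coding_strand_as_rna(coding_5_to_3):
    """Rewrite the non-template strand in RNA's alphabet (T becomes U)."""
    return coding_5_to_3.upper().replace("T", "U")

template = "TACGGCATT"   # hypothetical template strand, 3'->5'
coding = "ATGCCGTAA"     # its complementary non-template strand, 5'->3'

mrna = transcribe(template)
print(mrna)                                    # AUGCCGUAA
print(mrna == coding_strand_as_rna(coding))    # True
```

The final comparison illustrates the point made above: the mRNA matches the non-template strand, with uracil in place of thymine.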
In human cells, RNA polymerase II typically adds between 1,000 and 4,000 nucleotides per minute, though speeds can vary dramatically depending on the gene and cellular context. Some measurements have clocked rates above 50,000 nucleotides per minute under certain conditions, showing that the enzyme has a wide dynamic range.
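As a rough back-of-the-envelope illustration of what these rates mean, the sketch below estimates how long one polymerase would take to transcribe a gene. The 100,000-nucleotide gene length is a hypothetical value chosen for the arithmetic, and the calculation assumes a constant rate with no pausing.

```python
def transcription_minutes(gene_length_nt, rate_nt_per_min):
    """Minutes needed to transcribe a gene at a constant elongation rate."""
    return gene_length_nt / rate_nt_per_min

gene_length = 100_000  # hypothetical gene length in nucleotides

for rate in (1_000, 4_000, 50_000):
    minutes = transcription_minutes(gene_length, rate)
    print(f"{rate} nt/min -> {minutes:.0f} min")
# 1000 nt/min -> 100 min
# 4000 nt/min -> 25 min
# 50000 nt/min -> 2 min
```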
How Transcription Ends
Transcription doesn’t continue forever. Specific signals in the DNA tell RNA polymerase to stop. In bacteria, two termination mechanisms are well understood. In one (called intrinsic termination), the newly made RNA folds into a hairpin-shaped loop followed by a string of uracil bases, which destabilizes the connection between RNA polymerase and the DNA. In the other, a ring-shaped protein called Rho catches up to the polymerase along the RNA strand and physically pulls the complex apart using energy from ATP.
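The hairpin-plus-uracil signal of intrinsic termination can be illustrated with a toy search. This is only a sketch under stated assumptions: the stem length, loop sizes, and minimum uracil run are arbitrary illustrative thresholds, only perfect Watson-Crick stems are considered, and the example RNA is hypothetical. Real terminator prediction is considerably more involved.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Sequence that would base-pair with `rna` when folded back on it."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))

def find_intrinsic_terminators(rna, stem=5, min_loop=3, max_loop=8, min_u=4):
    """Return start indices of stem-loop-stem regions followed by a U run.

    All thresholds are illustrative assumptions, not measured values.
    """
    hits = []
    for start in range(len(rna) - 2 * stem - min_loop - min_u + 1):
        left = rna[start:start + stem]
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + min_u]
            if right == reverse_complement(left) and tail == "U" * min_u:
                hits.append(start)
                break
    return hits

# Hypothetical RNA: stem GCCGC, loop GAAA, stem GCGGC, then six uracils
rna = "CC" + "GCCGC" + "GAAA" + "GCGGC" + "UUUUUU" + "AA"
print(find_intrinsic_terminators(rna))
```

The function reports where a self-complementary stem begins whenever that stem is immediately followed by a run of uracils, which is the pattern described for intrinsic termination.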
Eukaryotic termination is more complex and tied closely to the processing of the mRNA’s tail end, but the principle is the same: the cell has built-in stop signals that release the finished RNA from the DNA template.
Processing the Raw mRNA
In eukaryotic cells, the RNA that comes directly off the DNA template isn’t ready for translation yet. It’s called pre-mRNA and needs three major modifications before it becomes a mature, functional messenger.
The 5′ Cap
Almost immediately after transcription begins, a modified chemical group (a 7-methylguanosine cap) is added to the front end of the RNA. This cap helps the cell’s machinery recognize the mRNA for splicing, export from the nucleus, and translation. It also protects the mRNA from being broken down prematurely.
The Poly-A Tail
At the other end, a long chain of adenine nucleotides (typically 100 to 250 of them) is added after the pre-mRNA is cut at a specific site downstream of the coding sequence. This poly-A tail promotes export from the nucleus, helps initiate translation, and shields the mRNA from degradation. As the mRNA ages in the cytoplasm, the tail gradually shortens, eventually triggering the mRNA’s destruction.
Splicing
Perhaps the most dramatic modification is splicing. Most human genes contain long stretches of non-coding DNA called introns scattered between the coding segments called exons. The pre-mRNA includes all of them, so the introns must be cut out and the exons stitched together to create a continuous coding message.
This cutting and joining is performed by a large molecular machine called the spliceosome, built from small nuclear ribonucleoproteins (often called “snurps”). The spliceosome recognizes conserved sequences at the boundaries of each intron, typically starting with the nucleotides GU at the intron’s front end and AG at its back end. It cuts the RNA at these sites, loops the intron into a lariat shape, and joins the neighboring exons together. The discarded intron lariat is then broken down and recycled.
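The cut-and-join logic can be sketched as a simple string operation. This is a toy model: the intron coordinates are supplied by hand rather than discovered, the sequences are hypothetical, and the only check is the GU...AG boundary rule described above.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) half-open index pairs and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Enforce the boundary rule: introns begin with GU and end with AG
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)

exon1, intron, exon2 = "AUGGCC", "GUAAGUUUCAG", "UUCUAA"   # hypothetical pieces
pre_mrna = exon1 + intron + exon2
intron_span = (len(exon1), len(exon1) + len(intron))

print(splice(pre_mrna, [intron_span]))   # AUGGCCUUCUAA
```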
By mixing and matching which exons are included, cells can produce different protein variants from a single gene, a process called alternative splicing. This is one reason humans can make far more proteins than they have genes.
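A small combinatorial sketch shows why alternative splicing multiplies the number of possible products. It assumes a simplified model in which the first and last exons are always kept and any subset of the middle exons may be skipped; the exon labels are placeholders, and real cells use only some of the theoretically possible combinations.

```python
from itertools import combinations

def isoforms(exons):
    """List exon combinations that keep the first and last exon, in original order."""
    first, *middle, last = exons
    results = []
    for count in range(len(middle) + 1):
        for chosen in combinations(middle, count):
            results.append([first, *chosen, last])
    return results

exons = ["E1", "E2", "E3", "E4"]   # placeholder exon labels
for variant in isoforms(exons):
    print("-".join(variant))
# E1-E4
# E1-E2-E4
# E1-E3-E4
# E1-E2-E3-E4
```

Under this model, each optional exon doubles the number of possible transcripts, so a gene with ten optional exons could in principle yield 1,024 variants.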
Transcription vs. Translation
Transcription and translation are sequential steps, but they differ in nearly every detail:
- Template: Transcription reads DNA to make RNA. Translation reads mRNA to make protein.
- Location (in eukaryotes): Transcription occurs in the nucleus. Translation occurs in the cytoplasm on ribosomes.
- Key enzyme: Transcription uses RNA polymerase. Translation uses the ribosome, assisted by transfer RNAs that carry amino acids.
- Product: Transcription produces a single-stranded mRNA. Translation produces a chain of amino acids that folds into a functional protein.
- Building blocks: Transcription assembles RNA nucleotides (A, U, C, G). Translation assembles amino acids (20 different types).
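The two steps compared above can be chained in a short sketch. The codon table below is only a small subset of the standard genetic code, enough to cover the hypothetical example gene; function names are illustrative.

```python
# Small subset of the standard genetic code (enough for this example only)
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "UUC": "Phe",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def transcribe(coding_dna):
    """Transcription: express the non-template DNA strand as mRNA (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Translation: read codons three bases at a time until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

gene = "ATGGCCTTCTAA"        # hypothetical coding strand, 5'->3'
mrna = transcribe(gene)
print(mrna)                  # AUGGCCUUCUAA
print(translate(mrna))       # ['Met', 'Ala', 'Phe']
```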
When Transcription Goes Wrong
Because transcription is so fundamental, anything that disrupts it can be devastating. One of the most striking examples comes from the death cap mushroom, which produces a toxin called alpha-amanitin. This compound binds directly to RNA polymerase II and blocks it from moving along the DNA after adding a nucleotide. The enzyme can still grab individual RNA building blocks and attach them, but it can’t slide forward to make room for the next one. The result is a near-total shutdown of mRNA production. Since the liver processes the toxin first, liver failure is the primary cause of death in mushroom poisoning cases.
Some antibiotics exploit the same vulnerability in bacteria, targeting bacterial RNA polymerase to stop the pathogen from making the proteins it needs to survive, without affecting human RNA polymerase.

