What Occurs During Transcription: DNA to RNA

Transcription is the process by which your cells copy a gene’s DNA sequence into RNA, creating the molecular instructions needed to build proteins and carry out other functions. It happens in three main stages: initiation, elongation, and termination. Each stage involves a coordinated set of molecular events that determine which genes get read, how fast, and how accurately.

How Transcription Begins

Transcription starts when the enzyme RNA polymerase locates and binds to a specific stretch of DNA called the promoter. In many genes, the promoter contains a short sequence known as the TATA box, with the consensus sequence TATAAA, positioned about 25 to 30 nucleotides upstream of the point where transcription actually begins. But the TATA box is far from universal. One study of fruit fly promoters found that only about 29% relied on a TATA box alone, while 31% had neither a TATA box nor the alternative core element the study looked for (the downstream promoter element, or DPE).
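As a rough illustration of what "a TATA box about 25 to 30 nucleotides upstream" means in practice, here is a minimal Python sketch that scans a sequence for exact TATAAA matches near a known start site. The function name, the search window, and the example sequence are all made up for illustration; real promoters often carry imperfect matches that an exact search like this would miss.

```python
TATA_CONSENSUS = "TATAAA"

def find_tata_boxes(coding_strand, tss_index, window=(-35, -20)):
    """Return positions, relative to the transcription start site (TSS),
    of exact TATAAA matches that begin inside the given upstream window.

    coding_strand: DNA written 5'->3' (hypothetical input format).
    tss_index: 0-based index of the first transcribed base.
    window: assumed search range relative to the TSS, slightly wider
            than the 25-30 nucleotide distance quoted in the text.
    """
    start = max(0, tss_index + window[0])
    end = min(len(coding_strand), tss_index + window[1] + len(TATA_CONSENSUS))
    region = coding_strand[start:end].upper()

    hits = []
    pos = region.find(TATA_CONSENSUS)
    while pos != -1:
        hits.append(start + pos - tss_index)  # convert to TSS-relative position
        pos = region.find(TATA_CONSENSUS, pos + 1)
    return hits

# Invented example: 40 upstream bases with a TATA box starting at -30,
# followed by 6 transcribed bases.
promoter = "GC" * 5 + "TATAAA" + "GC" * 12 + "ATGGCC"
print(find_tata_boxes(promoter, tss_index=40))  # [-30]
```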

RNA polymerase can’t simply land on the promoter by itself. In eukaryotic cells (the type that make up your body), a team of helper proteins called general transcription factors must assemble first. These factors (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) build what’s called a pre-initiation complex in a specific order. TFIID arrives first and recognizes the promoter sequence. TFIIB joins next, followed by RNA polymerase paired with TFIIF, then TFIIE, and finally TFIIH. Once this full complex is assembled, the two strands of the DNA double helix are pried apart, creating a small open bubble where the RNA copy can begin.

The Transcription Bubble

As RNA polymerase moves along the gene, it maintains a small region of separated DNA strands called the transcription bubble. Inside this bubble, one strand of DNA serves as the template, and RNA polymerase reads it to build a complementary RNA strand. Ahead of the bubble, the DNA unwinds; behind it, the two DNA strands snap back together. This constant unwinding and rewinding creates physical tension in the DNA, similar to what happens when you try to pull apart the middle of a twisted rope. Enzymes called topoisomerases relieve this tension so the polymerase can keep moving without the DNA getting tangled.

Elongation: Building the RNA Strand

Once the bubble is open and the first few RNA nucleotides are linked together, the polymerase shifts into elongation mode. This is the core of transcription: the polymerase reads the DNA template one base at a time and adds matching RNA nucleotides to the growing strand, always building in the 5′ to 3′ direction.
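The base-by-base copying described above can be sketched in a few lines of Python. This is only a toy model: the function name and input convention (template strand written in the 3′ to 5′ direction, so that the RNA comes out 5′ to 3′) are assumptions made for the example, and nothing about the real enzyme's chemistry is represented.

```python
# Pairing rules for copying a DNA template into RNA:
# A pairs with U (RNA uses uracil instead of thymine), T with A, G with C, C with G.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Read the template strand one base at a time (given 3'->5')
    and return the complementary RNA, which grows 5'->3'."""
    rna = []
    for base in template_3_to_5.upper():
        rna.append(DNA_TO_RNA[base])  # add the matching RNA nucleotide to the 3' end
    return "".join(rna)

print(transcribe("TACGGT"))  # AUGCCA
```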

Each cycle of nucleotide addition involves four steps. First, a free nucleotide drifts into the active site of the polymerase. Second, a chemical bond forms between the new nucleotide and the end of the growing RNA chain. Third, a small byproduct called pyrophosphate is released. Fourth, the polymerase shifts forward by one position along the DNA, ready to repeat the cycle. Research suggests that both the bond-forming step and the forward movement step are relatively slow compared to the others, meaning they jointly control the overall pace of transcription.

That pace varies dramatically. Averaged estimates for RNA polymerase II in mammalian cells range from about 1,300 to 4,300 nucleotides per minute, with some measurements on specific genes suggesting rates above 50,000 nucleotides per minute. The typical estimate for a standard human gene falls around 3,000 to 4,000 nucleotides per minute.
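To get a feel for these numbers, the short Python sketch below converts an elongation rate into the time needed to transcribe a gene. The 30,000-nucleotide gene length is an invented example, and the calculation assumes a constant rate with no pausing, which real polymerases do not maintain.

```python
def minutes_to_transcribe(gene_length_nt, rate_nt_per_min):
    """Time to copy a gene end to end at a constant elongation rate."""
    if rate_nt_per_min <= 0:
        raise ValueError("rate must be positive")
    return gene_length_nt / rate_nt_per_min

gene_length = 30_000  # hypothetical gene length in nucleotides

# Rates quoted in the text, in nucleotides per minute.
for rate in (1_300, 3_000, 4_300, 50_000):
    minutes = minutes_to_transcribe(gene_length, rate)
    print(f"{rate:>6} nt/min -> {minutes:.1f} min")
```

At 3,000 nucleotides per minute, this hypothetical gene would take 10 minutes to transcribe; at the slow end of the range it would take more than twice as long.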

Three RNA Polymerases, Three Jobs

Eukaryotic cells don’t rely on a single RNA polymerase. They use three distinct versions, each dedicated to different types of RNA. RNA polymerase I transcribes the genes for most ribosomal RNA, the structural backbone of the cell’s protein-making machinery. RNA polymerase II handles messenger RNA (the type that carries protein-coding instructions) along with several smaller regulatory RNAs. RNA polymerase III makes transfer RNA, which delivers amino acids during protein construction, and one specific type of ribosomal RNA (5S rRNA). Bacteria, by contrast, use a single RNA polymerase for all their transcription.

How Transcription Errors Are Caught

RNA polymerase doesn’t have the same rigorous proofreading systems that DNA-copying enzymes use, but it’s not careless either. In laboratory conditions, the error rate is roughly 1 to 2 mistakes per 100,000 bases. The actual rate inside living cells is still debated.
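A quick calculation shows what an error rate of this size means for a single transcript. The Python sketch below assumes that errors occur independently at every base, which is a simplification, and uses an invented transcript length of 10,000 nucleotides.

```python
def expected_errors(length_nt, errors_per_base):
    """Average number of mistakes in a transcript of the given length."""
    return length_nt * errors_per_base

def prob_error_free(length_nt, errors_per_base):
    """Chance that a transcript contains no mistakes at all,
    assuming each base is copied independently of the others."""
    return (1 - errors_per_base) ** length_nt

length = 10_000  # hypothetical transcript length in nucleotides

# 1 and 2 mistakes per 100,000 bases, the laboratory range quoted in the text.
for rate in (1e-5, 2e-5):
    print(
        f"rate {rate:.0e}: "
        f"{expected_errors(length, rate):.2f} expected errors, "
        f"{prob_error_free(length, rate):.1%} of transcripts error-free"
    )
```

Under these assumptions, roughly 8 or 9 out of every 10 copies of such a transcript would be error-free.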

The polymerase uses several layers of quality control. One mechanism, called “look-ahead,” helps the enzyme reject a wrong nucleotide before it’s even chemically processed. Another, called hydrolysis rejection, catches an incorrect nucleotide after processing but before it’s permanently linked into the RNA chain. If a wrong base does get incorporated, the polymerase can backtrack along the DNA and cut off the short end of the RNA that contains the mistake, then resume copying from the corrected end. This backtracking mechanism is the last line of defense. It costs the cell extra energy, because the nucleotides that were cut away have to be added again.

How Transcription Ends

Termination works differently in bacteria and eukaryotic cells. Bacteria use two main strategies. In the first, called intrinsic termination, the newly made RNA folds into a hairpin-shaped loop. This structure is followed by a run of uracils in the RNA, which pair only weakly with the DNA template. Together, the hairpin and the weak pairing destabilize the complex enough that the polymerase falls off and the RNA is released. In the second strategy, a ring-shaped protein called Rho latches onto the growing RNA at specific loading sites, then chases down the polymerase using energy from ATP. When Rho catches up, it pulls the RNA free and forces the polymerase to detach. Recent work shows Rho can also interact directly with the polymerase itself, suggesting the process is more flexible than originally thought.

In eukaryotic cells, termination is tied to RNA processing. As the polymerase transcribes past a signal sequence (typically AAUAAA in the RNA), a group of proteins clamps onto the RNA and cuts it a short distance downstream of the signal. A chain of adenine nucleotides, called a poly-A tail, is then added to the cut end. The polymerase continues transcribing for a short distance past the cut site before eventually dissociating from the DNA. In the leading explanation, known as the torpedo model, an enzyme degrades the leftover RNA still attached to the polymerase, chases the polymerase down, and helps knock it off the DNA, though the exact mechanism is still being worked out.
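The cut-and-tail step can be sketched as a simple string operation in Python. This is a toy model under stated assumptions: the function name is invented, the cut is placed a fixed 20 nucleotides past the signal, and the tail length is a parameter chosen by the caller. In real cells the cut position varies and depends on additional sequence elements that are not modeled here.

```python
POLY_A_SIGNAL = "AAUAAA"

def cleave_and_polyadenylate(rna, cut_offset=20, tail_length=200):
    """Find the first AAUAAA signal, cut the RNA a fixed distance
    downstream of it, and append a run of adenines.

    cut_offset and tail_length are illustrative defaults, not measured values.
    Returns None if there is no signal or the RNA is too short to cut.
    """
    signal_pos = rna.find(POLY_A_SIGNAL)
    if signal_pos == -1:
        return None
    cut_site = signal_pos + len(POLY_A_SIGNAL) + cut_offset
    if cut_site > len(rna):
        return None
    return rna[:cut_site] + "A" * tail_length

# Invented transcript: signal near the start, then 30 more bases.
transcript = "GGCC" + "AAUAAA" + "GC" * 15
print(cleave_and_polyadenylate(transcript, cut_offset=20, tail_length=5))
```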

Processing the RNA in Eukaryotes

In eukaryotic cells, the initial RNA transcript (called pre-mRNA) isn’t ready for use right away. It undergoes three major modifications, and remarkably, most of these happen while the RNA is still being transcribed.

First, a protective chemical cap is added to the front (5′) end of the RNA shortly after transcription begins. This cap helps the cell’s protein-making machinery recognize the RNA later and protects it from being degraded. Second, non-coding segments called introns are cut out and the remaining coding segments (exons) are spliced together. A single human gene can contain dozens of introns, so this editing step is essential. Third, the poly-A tail is added to the back (3′) end after cleavage, as described above. This tail further stabilizes the RNA and aids its export from the nucleus to the cytoplasm, where proteins are made. All three of these processing events are stimulated by a flexible tail on RNA polymerase II itself, called the C-terminal domain (CTD), which acts as a landing platform for the processing machinery.
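The splicing step, cutting out introns and joining the exons, can be illustrated with a short Python sketch. Here the intron positions are simply handed to the function as index pairs, which is an assumption made for the example; the cell's splicing machinery instead recognizes sequence signals at the intron boundaries. The function name and the example sequence are invented.

```python
def splice(pre_mrna, introns):
    """Remove introns and join the remaining exons.

    introns: list of (start, end) index pairs, 0-based with end excluded,
             which must not overlap (hypothetical input format).
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        if start < cursor or start >= end or end > len(pre_mrna):
            raise ValueError(f"bad intron coordinates: ({start}, {end})")
        exons.append(pre_mrna[cursor:start])  # keep the exon before this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # keep the final exon
    return "".join(exons)

# Invented pre-mRNA: exon, a 12-base intron, then a second exon.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUGAA"
print(splice(pre_mrna, [(6, 18)]))  # AUGGCCUUUGAA
```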

How Gene Activity Is Controlled

Not every gene is transcribed all the time. Cells regulate which genes are active through proteins called transcription factors that bind to regulatory DNA sequences. Some of these sequences, called enhancers, can sit thousands of nucleotides away from the gene they control. Activator proteins bound to enhancers communicate with the promoter, likely by looping the DNA so that distant regions come into physical contact. These activators can boost transcription either by helping recruit RNA polymerase II to the promoter or by speeding up elongation once the polymerase is already moving.

Bacteria Link Transcription and Translation

One of the biggest differences between bacterial and eukaryotic transcription is what happens to the RNA afterward. In eukaryotic cells, transcription occurs inside the nucleus, and the finished RNA must be exported to the cytoplasm before it can be translated into protein. Bacteria have no nucleus. Their DNA, RNA polymerase, and ribosomes all share the same space, which means a ribosome can latch onto the RNA and start building a protein while the RNA is still being transcribed. This coupling of transcription and translation gives bacteria extraordinary speed in responding to environmental changes.