How Does the Cell Make RNA: From Transcription to mRNA

Cells make RNA by reading the DNA sequence of a gene and building a complementary RNA copy, one nucleotide at a time. This process, called transcription, takes place in the nucleus of human cells and is carried out by an enzyme called RNA polymerase, which moves along the DNA strand at roughly 1,250 to 3,500 nucleotides per minute. For protein-coding genes, the raw RNA copy then goes through several rounds of editing and chemical modification before it’s ready to leave the nucleus.

The Three RNA-Making Enzymes

Human cells don’t use a single enzyme to make all their RNA. They have three different versions of RNA polymerase, each dedicated to specific jobs. RNA polymerase II is the one that copies protein-coding genes into messenger RNA (mRNA), the type of RNA that carries instructions for building proteins. RNA polymerase I makes the large ribosomal RNAs that form the structural core of ribosomes, the cell’s protein-building machines. RNA polymerase III handles transfer RNAs (which carry amino acids during protein assembly), one small ribosomal RNA, and a handful of other small functional RNAs.

This division of labor lets the cell fine-tune production of each RNA type independently. When a cell needs to grow quickly, it can ramp up ribosomal RNA production through polymerase I without disrupting messenger RNA output from polymerase II.

How Transcription Starts

Before RNA polymerase can begin copying a gene, the cell has to mark the starting point. Every gene has a stretch of DNA called the promoter region, spanning roughly 100 nucleotides around the spot where transcription will begin. Many promoters, though not all, contain a short sequence called the TATA box, named for its repeating pattern of T and A nucleotides, located about 30 nucleotides upstream of the start site.
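As a rough illustration of that layout, the sketch below scans a window upstream of a transcription start site for a TATA-like motif. The motif string TATAAA, the window bounds, the toy sequence, and the function name are all assumptions chosen for illustration; real TATA boxes vary in sequence and position.

```python
def find_tata_box(dna, tss_index, motif="TATAAA", window=(-40, -20)):
    """Return the motif position relative to the start site, or None.

    dna is the coding strand written 5'->3'; tss_index is the 0-based
    index of the first transcribed nucleotide. The motif and window are
    illustrative defaults, not measured values.
    """
    lo = max(0, tss_index + window[0])
    hi = max(0, tss_index + window[1])
    # str.find only reports matches lying fully inside [lo, hi + len(motif))
    hit = dna.find(motif, lo, hi + len(motif))
    return None if hit == -1 else hit - tss_index


# Toy promoter: 20 filler bases, a TATA box, 24 more filler bases, then the gene.
upstream = "GC" * 10 + "TATAAA" + "GC" * 12
toy_dna = upstream + "ATGGCC"
print(find_tata_box(toy_dna, tss_index=len(upstream)))  # -30
```

A return value of -30 means the motif begins 30 nucleotides upstream of the start site; None means no match was found in the window.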

The startup sequence works like an assembly line. First, a protein called TBP (TATA-binding protein), usually delivered as part of a larger complex called TFIID, latches onto the TATA box and bends the DNA at that spot. This bent DNA acts as a landing pad. A second protein, TFIIB, recognizes and locks onto the TBP-DNA complex, stabilizing it. TFIIB then recruits RNA polymerase II itself, which arrives paired with yet another helper protein, TFIIF. Finally, two more factors, TFIIE and TFIIH, join the complex, completing what’s called the preinitiation complex. Only after this entire molecular scaffold is assembled can the enzyme begin unwinding the DNA and reading it.

Building the RNA Strand

Once transcription starts, RNA polymerase separates the two DNA strands and uses one as a template. The enzyme reads the template strand and selects the matching RNA nucleotide: where the DNA has a C, the enzyme inserts a G; where it has a G, a C; where it has a T, an A; and where it has an A, a U (uracil, which RNA uses in place of thymine). Each incoming nucleotide arrives as a triphosphate, and joining it to the growing RNA chain splits off a small molecule called pyrophosphate. Breaking that high-energy bond supplies the energy that drives the process forward.
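The pairing rule can be written as a simple lookup. This is a minimal sketch, assuming the template strand is supplied in the 3′-to-5′ direction so that the RNA comes out 5′-to-3′; the function name and the example sequence are made up for illustration.

```python
# DNA template base -> RNA base the polymerase adds opposite it
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the RNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5.upper())


print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

Note that the only difference from DNA-to-DNA pairing is the A-to-U entry: RNA never contains T.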

Inside the enzyme, about nine nucleotides of the freshly made RNA remain paired with the DNA template, forming a short hybrid structure enclosed by the protein. As the polymerase moves forward, the RNA peels away from the DNA and threads through a channel in the enzyme, emerging about 14 nucleotides from the growing end. The DNA strands re-seal behind the enzyme, restoring the double helix.

The speed of this process varies dramatically. RNA polymerase II starts relatively slowly, synthesizing about 500 nucleotides per minute in the first 10,000 to 15,000 nucleotides of a gene. It then accelerates to 2,000 to 4,000 nucleotides per minute as it moves deeper into the gene body. Speed also varies between genes, ranging from 370 to 3,570 nucleotides per minute depending on the specific gene being copied.
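To get a feel for what these rates mean, the sketch below estimates how long one pass over a gene would take under a two-phase model. The slow-phase length and the rates are taken from the figures above; the 100,000-nucleotide gene length is a hypothetical example, and treating the speed-up as an abrupt switch is a simplifying assumption.

```python
def transcription_minutes(gene_length, slow_span=10_000,
                          slow_rate=500, fast_rate=2_000):
    """Estimate minutes to transcribe a gene of gene_length nucleotides.

    The first slow_span nucleotides are copied at slow_rate (nt/min),
    the remainder at fast_rate (nt/min).
    """
    slow_part = min(gene_length, slow_span)
    fast_part = gene_length - slow_part
    return slow_part / slow_rate + fast_part / fast_rate


# Hypothetical 100,000-nt gene at the low and high ends of the fast range
print(transcription_minutes(100_000, fast_rate=2_000))  # 65.0
print(transcription_minutes(100_000, fast_rate=4_000))  # 42.5
```

Under these assumptions the slow opening stretch alone accounts for 20 minutes, a large share of the total even though it covers only a tenth of the gene.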

How Transcription Ends

Stopping transcription is more complicated than starting it. For protein-coding genes, the RNA polymerase reads past a signal sequence in the DNA called the polyadenylation signal. This signal doesn’t stop the polymerase directly. Instead, once the signal has been copied into the RNA, the processing machinery cuts the transcript a short distance downstream of it while the polymerase continues along the DNA. The polymerase then slows and pauses, sometimes helped by proteins that bind to the DNA and act as roadblocks. Eventually, release factors use chemical energy to pry the polymerase off the DNA template, freeing both the enzyme and the remaining RNA. The hallmark of termination is the release of the new RNA from the enzyme-DNA complex.
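The cut-then-continue behavior can be pictured with a toy model. This sketch assumes the common signal sequence AAUAAA and a fixed cut 15 nucleotides past it; both the fixed offset and the function name are illustrative assumptions, and real cleavage positions vary from transcript to transcript.

```python
def cleave_at_polya_signal(rna, signal="AAUAAA", offset=15):
    """Split a transcript a fixed distance downstream of its first poly(A) signal.

    Returns (released_pre_mrna, rna_still_attached_to_polymerase).
    If no signal is present, the whole transcript is returned uncut.
    """
    pos = rna.find(signal)
    if pos == -1:
        return rna, ""
    cut = min(len(rna), pos + len(signal) + offset)
    return rna[:cut], rna[cut:]


toy_rna = "GGCC" + "AAUAAA" + "C" * 15 + "GUGUGU"
released, trailing = cleave_at_polya_signal(toy_rna)
print(released)  # GGCCAAUAAACCCCCCCCCCCCCCC
print(trailing)  # GUGUGU
```

The second return value stands for the RNA the polymerase keeps making after the cut, which is discarded once the enzyme is released.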

Editing the Raw Transcript

The RNA that comes off the polymerase isn’t finished. In eukaryotic cells, the raw copy (called pre-mRNA) goes through three major modifications before it can function.

Adding the 5′ Cap

Almost immediately after transcription begins, the front end of the RNA receives a chemical cap: a modified guanosine molecule attached through an unusual 5′-to-5′ linkage, so that it faces the opposite direction from the rest of the RNA chain. This cap is then tagged with a methyl group to produce the mature structure. The cap serves multiple purposes. It protects the RNA from being chewed up by enzymes, it’s required for proper splicing and processing of the rest of the molecule, and it’s essential for exporting the finished mRNA out of the nucleus. Later, in the cytoplasm, the cap helps ribosomes latch onto the mRNA to begin translating it into protein.

Removing Introns by Splicing

Most human genes are interrupted by long stretches of non-coding DNA called introns, and these sequences get copied into the pre-mRNA. They need to be cut out precisely, and the remaining coding segments (exons) stitched together. This job falls to the spliceosome, a large molecular machine built from five small RNA molecules and dozens of proteins.

The spliceosome identifies intron boundaries using short conserved sequences at each end of the intron plus an internal landmark called the branch site, typically located 18 to 40 nucleotides upstream from the intron’s end. Removal happens through two precise cutting-and-joining reactions. In the first, the branch site attacks the front end of the intron, cutting it free from the upstream exon and forming a loop (called a lariat). In the second, the freed upstream exon attacks the back end of the intron, joining the two exons together and releasing the looped-out intron for recycling.
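The net effect of splicing, cutting out introns and joining the exons, can be sketched as a string operation. This example assumes the intron positions are already known and checks only that each intron starts with GU and ends with AG, the boundary pattern found in most introns; it does not model the branch site or the lariat. The function name and toy sequences are hypothetical.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) half-open index pairs; join the exons."""
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Most introns begin with GU and end with AG; reject anything else.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"unexpected intron boundaries at {start}-{end}")
        exons.append(pre_mrna[cursor:start])
        cursor = end
    exons.append(pre_mrna[cursor:])
    return "".join(exons)


exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUACUAACUUUCAG", "UUUUAA"
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
print(mature)  # AUGGCCUUUUAA
```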

Adding the Poly(A) Tail

At the back end of the RNA, the cell adds a long chain of adenine nucleotides, typically 100 to 250 of them. This poly(A) tail protects the mRNA from degradation, promotes its export from the nucleus, and later helps initiate translation in the cytoplasm. The tail gradually shortens over the mRNA’s lifetime, and its length effectively acts as a timer determining how long the message survives.

Getting the Finished mRNA Out

A completed mRNA doesn’t travel naked through the nucleus. It’s coated with proteins throughout processing, forming a structure called a messenger ribonucleoprotein particle (mRNP). For the mRNP to leave the nucleus, it must pass through nuclear pore complexes, massive protein channels that span the nuclear envelope.

The key export receptor in human cells is a protein pair called NXF1/NXT1 (also known as TAP/p15). These proteins are loaded onto the mRNP as a final quality-control checkpoint, essentially certifying that the RNA has been properly capped, spliced, and polyadenylated. The loaded mRNP then docks at the nuclear pore and threads through its central channel by interacting with gel-like protein filaments lining the pore’s interior. Once through, the mRNP is released into the cytoplasm, where ribosomes can begin translating it into protein.

How Bacteria Do It Differently

Bacterial cells lack a nucleus, which changes the entire workflow. In bacteria, transcription and translation happen in the same compartment, and they happen simultaneously. A ribosome can latch onto the front of an mRNA and start building protein while the back end of that same mRNA is still being transcribed. This coupling of transcription and translation means bacterial mRNAs don’t need the elaborate processing steps that eukaryotic mRNAs require: no capping, no poly(A) tail, and very little splicing.

Bacteria also use only a single type of RNA polymerase for all their RNA, compared to the three specialized versions in eukaryotic cells. This simpler system allows bacteria to respond extremely quickly to environmental changes, spinning up new proteins within minutes of detecting a signal.

RNA Made Outside the Nucleus

Not all RNA in a human cell is made in the nucleus. Mitochondria, the energy-producing compartments inherited from an ancient bacterial ancestor, carry their own small genome and transcribe it using their own RNA polymerase called POLRMT. This enzyme is evolutionarily distinct from the nuclear RNA polymerases and more closely resembles the single-subunit polymerases found in certain viruses. Mitochondrial transcription produces the RNA needed to build a handful of proteins essential for energy production, along with the ribosomal and transfer RNAs needed to translate those messages on mitochondrial ribosomes.