Transcription requires a DNA template, an enzyme called RNA polymerase, small protein helpers that guide the enzyme to the right spot, free building-block molecules called ribonucleotides, and specific metal ions that activate the enzyme’s chemistry. These components work together to read a gene in the DNA and produce a complementary strand of RNA. The details differ between bacteria and more complex organisms, but the core logic is the same.
A DNA Template Read in One Direction
Transcription starts with a single strand of the DNA double helix serving as the template. RNA polymerase reads this template strand in the 3′-to-5′ direction, assembling the new RNA molecule in the opposite direction (5′ to 3′). Each RNA nucleotide is selected by complementary base pairing with the template, so the final RNA sequence is an exact complement of the DNA it was copied from.
The other DNA strand, sometimes called the coding strand, isn’t read directly. It simply matches the RNA in sequence (with uracil replacing thymine). The direction RNA polymerase travels along the DNA determines which strand gets used as the template for any given gene.
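The relationship between the template strand, the coding strand, and the RNA can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the nine-base example sequence are made up for the purpose, and the template is assumed to be written in the 3′-to-5′ direction so that the RNA comes out 5′ to 3′.

```python
# Illustrative sketch; names and the example sequence are hypothetical.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}  # template base -> RNA base
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}  # template base -> coding base


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') built from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding strand (5'->3') that pairs with the template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"            # hypothetical template, written 3'->5'
rna = transcribe(template)        # "AUGCCGUAA"
coding = coding_strand(template)  # "ATGCCGTAA"

# The coding strand matches the RNA once thymine is swapped for uracil.
print(rna, coding, coding.replace("T", "U") == rna)
```

Because the template is read 3′ to 5′ and the RNA grows 5′ to 3′, no reversal is needed here; the two strands are simply written antiparallel to each other.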
RNA Polymerase: The Central Enzyme
RNA polymerase is the enzyme that physically builds the RNA chain. Bacteria have a single type of RNA polymerase that handles all transcription. Eukaryotic cells (plants, animals, fungi) use three distinct versions. RNA polymerase I makes most of the ribosomal RNA, which forms the structural core of ribosomes. RNA polymerase II makes messenger RNA and several smaller regulatory RNAs. RNA polymerase III makes transfer RNA and one small ribosomal RNA. Of these, RNA polymerase II gets the most attention because it transcribes the protein-coding genes.
All RNA polymerases share a key trait: they don’t need a pre-existing primer to start synthesis. DNA polymerase, by contrast, can only extend an existing strand, so in cells it always needs a short RNA primer to get started. Because no primer marks where RNA polymerase should begin, the enzyme depends on promoter sequences to find the start site, which is part of what makes promoter recognition so important.
Promoter Sequences That Mark the Start
RNA polymerase doesn’t just land anywhere on the DNA. It needs a specific signal, called a promoter, to know where a gene begins and which direction to read. The promoter is a short stretch of DNA located just upstream of the gene.
In eukaryotes, one of the most common promoter elements is the TATA box, a short T/A-rich sequence typically found about 25 to 30 base pairs upstream of the transcription start site. The consensus sequence identified across many animal genes is TATAAA. Plant promoters use a slightly expanded version, TCACTATATATAG, where even the flanking bases play a functional role. Not every gene has a TATA box; some promoters rely instead on other elements, such as an initiator region or a downstream promoter element, and some combine two or all three of these. Together, these elements make up the core promoter, which generally spans from about 35 base pairs before the start site to 35 base pairs after it.
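As a rough illustration of how a TATA box is located relative to the start site, the sketch below scans a coding-strand sequence for the TATAAA consensus and reports how far upstream each match begins. The function name, the `tss_index` parameter, and the toy sequence are all hypothetical, and exact string matching is a simplification: real promoters tolerate variation around the consensus.

```python
# Illustrative sketch; function name, parameters, and sequence are hypothetical.
def find_tata_boxes(coding_seq: str, tss_index: int, motif: str = "TATAAA"):
    """Return, for each exact motif match that starts upstream of the
    transcription start site, the distance in base pairs from the match
    start to the start site."""
    seq = coding_seq.upper()
    distances = []
    start = seq.find(motif)
    while start != -1 and start < tss_index:
        distances.append(tss_index - start)
        start = seq.find(motif, start + 1)
    return distances


# Toy sequence: 20 filler bases, the motif, 24 more filler bases, then the gene.
toy_promoter = "G" * 20 + "TATAAA" + "C" * 24 + "ACGTACGT"
tss = 50  # index of the first transcribed base in this toy sequence

print(find_tata_boxes(toy_promoter, tss))  # [30]
```

Here the match begins 30 base pairs upstream of the start site, at the edge of the typical 25-to-30 window described above.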
In bacteria, promoters contain two key elements recognized by the sigma factor (described below): a sequence near position −10 and another near position −35 relative to the transcription start site. Because these promoter sequences are asymmetric, the polymerase can only bind in one orientation, ensuring it reads the correct strand.
Transcription Factors That Recruit the Enzyme
RNA polymerase generally can’t find the promoter on its own. It needs helper proteins to guide it into position.
In bacteria, this job falls to the sigma factor, a protein subunit that temporarily attaches to the core RNA polymerase. The sigma factor is what gives the enzyme the ability to recognize promoter sequences. It binds both the −10 and −35 elements of the promoter, positions the polymerase correctly, and then detaches once transcription is underway. Different sigma factors recognize different promoter sequences, giving the cell a way to switch on different sets of genes depending on conditions.
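The idea of a sigma factor reading two promoter elements at once can be sketched as a simple scoring scan. The example below makes assumptions that are not stated in this article: it uses the commonly cited consensus sequences for the main E. coli sigma factor (TTGACA near −35 and TATAAT near −10) and allows a spacer of 16 to 18 base pairs between them. The function names and the toy sequence are hypothetical, and real promoter recognition is far less rigid than counting matching letters.

```python
# Illustrative sketch. Assumptions: textbook E. coli sigma-70 consensus
# hexamers and a 16-18 bp spacer; names and sequence are hypothetical.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"
HEX = 6


def count_matches(a: str, b: str) -> int:
    """Count positions where two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def best_promoter(seq: str, spacers=(16, 17, 18)):
    """Return (score, minus35_index, minus10_index) for the best-scoring
    pair of hexamers, or None if the sequence is too short."""
    seq = seq.upper()
    best = None
    for i in range(len(seq) - HEX + 1):
        for gap in spacers:
            j = i + HEX + gap
            if j + HEX > len(seq):
                continue
            score = (count_matches(seq[i:i + HEX], MINUS_35)
                     + count_matches(seq[j:j + HEX], MINUS_10))
            if best is None or score > best[0]:
                best = (score, i, j)
    return best


toy = "CC" + "TTGACA" + "G" * 17 + "TATAAT" + "CCCC"
print(best_promoter(toy))  # (12, 2, 25)
```

A different sigma factor could be modeled by swapping in different consensus strings, which mirrors how the cell switches on different gene sets.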
In eukaryotes, the system is more elaborate. Six general transcription factors, labeled TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH, assemble with RNA polymerase II on the promoter to form what’s called the preinitiation complex. The process begins when TFIID recognizes and binds the TATA box. Then TFIIA joins, followed by TFIIB, which physically contacts both TFIID and the polymerase and helps position the enzyme at the correct start site. The remaining factors load in sequence, completing the complex before transcription can begin. Without this full assembly, RNA polymerase II cannot initiate on its own.
Ribonucleotides: The Raw Materials
The actual building blocks of the RNA chain are four ribonucleoside triphosphates: ATP, GTP, CTP, and UTP. Each carries three phosphate groups, and the energy stored in those phosphates powers the chemical reaction that adds each nucleotide to the growing chain. As each nucleotide is incorporated, two of its three phosphates are split off together as pyrophosphate, and that release drives the reaction forward.
Cells maintain these four molecules at very different concentrations. In active immune cells, for example, ATP is by far the most abundant at roughly 6,700 micromolar, while CTP is the scarcest at around 182 micromolar. These concentration differences can influence how quickly transcription proceeds, particularly in cells with low metabolic activity.
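To put those two figures in proportion, a quick calculation shows how lopsided the pool is. Only the ATP and CTP values quoted above are used; the variable names are arbitrary, and no values are assumed for GTP or UTP.

```python
# Concentrations quoted in the text, in micromolar (active immune cells).
ntp_micromolar = {"ATP": 6700, "CTP": 182}

ratio = ntp_micromolar["ATP"] / ntp_micromolar["CTP"]
print(f"ATP is about {ratio:.1f} times more concentrated than CTP")
```

The result is roughly 37-fold, which is why the scarcest nucleotide is the one most likely to limit how quickly the chain can grow.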
Metal Ions That Activate the Chemistry
RNA polymerase cannot catalyze RNA synthesis without divalent metal ions, meaning metal atoms carrying a double positive charge. Manganese and magnesium are the two that most commonly fill this role, sitting in the enzyme’s active site and helping it join nucleotides together. In lab experiments, eukaryotic RNA polymerase II is actually more active with manganese than magnesium, and iron can also serve as an activator. On the other hand, zinc, mercury, and cadmium strongly inhibit the enzyme. Zinc in particular competes directly with the nucleotide substrates for space in the active site, effectively shutting down RNA synthesis.
DNA Unwinding During Elongation
For RNA polymerase to read the template, it has to pry apart the two strands of the DNA double helix. It does this by creating a small unwound region called a transcription bubble. During the early stages of transcription, this bubble expands until about 18 base pairs are separated and the new RNA is at least 7 nucleotides long. At that point, the upstream portion of the bubble (roughly 8 base pairs) snaps back together, and the bubble maintains a relatively constant size as it moves along the gene. The DNA behind the polymerase re-forms its double helix, while the DNA ahead is continuously unwound.
Termination Signals That End the Process
Transcription doesn’t run forever. Specific signals tell RNA polymerase to stop and release the new RNA.
Bacteria use two main strategies. In the first, called intrinsic termination, the newly made RNA folds into a hairpin-shaped loop followed by a string of uracil residues. This structure physically destabilizes the connection between the RNA and the polymerase, causing the enzyme to fall off the DNA. In the second strategy, a protein called Rho binds to the growing RNA chain at cytosine-rich stretches called rut sites, which are at least 60 nucleotides long. Rho then chases down the polymerase and, using its helicase activity to unwind the RNA-DNA hybrid, forces the transcript to release.
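The signature of an intrinsic terminator, a self-complementary stretch that can fold into a hairpin followed directly by a run of uracils, is simple enough to search for in code. The sketch below is a deliberately naive version: the function names, the fixed 5-base stem, the 3-to-8-base loop, and the minimum of four uracils are all illustrative assumptions, and real terminator prediction also weighs how stable the folded hairpin is.

```python
# Illustrative sketch; names, thresholds, and the sequence are hypothetical.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would base-pair with rna, read 5'->3'."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def find_terminators(rna: str, stem: int = 5, loop_range=(3, 8), min_u: int = 4):
    """Return (stem_start, loop_length) for every perfect hairpin that is
    followed immediately by at least min_u uracils."""
    rna = rna.upper()
    hits = []
    last_start = len(rna) - 2 * stem - loop_range[0] - min_u
    for i in range(last_start + 1):
        left = rna[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            r = i + stem + loop              # start of the right arm of the stem
            right = rna[r:r + stem]
            tail = rna[r + stem:r + stem + min_u]
            if (len(right) == stem
                    and right == reverse_complement(left)
                    and tail == "U" * min_u):
                hits.append((i, loop))
    return hits


# Toy transcript: flank, left arm, 4-base loop, right arm, U tract, one more base.
toy_rna = "CCC" + "GCCGC" + "UUCG" + "GCGGC" + "UUUUUU" + "A"
print(find_terminators(toy_rna))  # [(3, 4)]
```

The single hit says that a hairpin stem begins at index 3 and closes around a 4-base loop, with the uracil run starting right after the stem.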
Eukaryotic termination is more complex and varies by polymerase type, but the principle is the same: a combination of RNA sequences and protein factors signals the enzyme to disengage from the template and release the finished transcript.

