What Are Introns? Definition, Role & Function in DNA

Introns are stretches of DNA within a gene that do not code for protein. When a gene is copied into a preliminary RNA transcript, introns are cut out and discarded before the final messenger RNA is assembled. The segments that remain and get stitched together are called exons, which carry the actual instructions for building a protein. In the human genome, introns make up the vast majority of gene length: on average, only about 10% of a gene’s sequence is exonic, meaning roughly 90% consists of intronic material.
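The 10% figure is simple arithmetic on exon coordinates. The sketch below works it through for a made-up gene; the function name, the coordinates, and the 0-based half-open interval convention are all illustrative assumptions, not data from a real gene.

```python
def exonic_fraction(gene_start, gene_end, exons):
    """Return the fraction of a gene's length covered by exons.

    Coordinates are 0-based and half-open [start, end); the exons are
    assumed not to overlap.
    """
    gene_length = gene_end - gene_start
    exon_length = sum(end - start for start, end in exons)
    return exon_length / gene_length


# Hypothetical gene: 10,000 bases long, four exons totalling 1,000 bases.
toy_exons = [(0, 200), (3000, 3300), (6000, 6250), (9750, 10000)]
fraction = exonic_fraction(0, 10000, toy_exons)
print(f"exonic: {fraction:.0%}, intronic: {1 - fraction:.0%}")
```

For this toy gene the exons add up to one tenth of the total length, which matches the average proportion described above.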

How Introns Get Removed

The removal process, called splicing, is carried out by a large molecular machine known as the spliceosome. This complex is built from five small RNA molecules and more than 200 proteins that assemble on the preliminary RNA transcript. Splicing happens in two chemical steps. First, the intron loops back on itself and its front (5') end bonds to a specific point within its own sequence, called the branch point, forming a lasso-shaped loop called a lariat. Second, the two neighboring exons are joined together and the lariat is released. The result is a clean, continuous RNA message ready to be read by the cell’s protein-making machinery.
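The net effect of splicing on the sequence can be sketched as a string operation: cut out each intron and join what is left. This is only a toy model under stated assumptions. The transcript and intron coordinates are invented, the function name is hypothetical, and the code does not attempt to model the lariat chemistry or the spliceosome itself.

```python
def splice(pre_mrna, introns):
    """Remove each intron and join the flanking exons.

    `introns` is a list of (start, end) pairs, 0-based and half-open,
    sorted by position and non-overlapping.
    Returns (mature_rna, excised_introns).
    """
    exons = []
    excised = []
    position = 0
    for start, end in introns:
        exons.append(pre_mrna[position:start])  # exon preceding this intron
        excised.append(pre_mrna[start:end])     # intron that is cut out
        position = end
    exons.append(pre_mrna[position:])           # final exon after the last intron
    return "".join(exons), excised


# Hypothetical transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguacuaacag" + "UUUGGA" + "guaugucuaacag" + "CCCUAA"
introns = [(6, 20), (26, 39)]

mature, removed = splice(pre_mrna, introns)
print(mature)
print(removed)
```

Writing exons in upper case and introns in lower case makes it easy to see at a glance that only exon sequence survives in the mature message.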

In organisms like yeast, which tend to have short introns and long exons, the spliceosome recognizes the intron directly and assembles across it. Vertebrates, including humans, work the opposite way: our exons are typically short and our introns are large, so the spliceosome first recognizes and assembles across an exon, then rearranges to cut out the flanking introns. This distinction matters because it influences how accurately the cell identifies splice sites, and errors at this step can cause disease.

Not All Introns Are the Same

Spliceosomal introns, the type found in human genes, are just one of four major classes. Group I introns and group II introns are both “self-splicing,” meaning the intron RNA itself acts as a catalyst and cuts itself out without needing a separate machine. Group II introns use a lariat mechanism very similar to spliceosomal splicing, which is a strong clue that the two are evolutionarily related. Group I introns work differently, using an external guanosine nucleotide as a chemical helper and producing a linear fragment rather than a lariat. A fourth class exists in transfer RNA genes and is removed by protein enzymes that cut the RNA and rejoin the ends, rather than by the RNA itself.

Group I and group II introns are found mostly in bacteria, mitochondria, and chloroplasts. Spliceosomal introns dominate in the nuclear genomes of complex organisms like plants, animals, and fungi.

Why Introns Exist

The discovery of introns in 1977 was one of the most unexpected findings in 20th-century biology. Almost immediately, scientists began debating why genes are split into pieces. Two competing ideas emerged. The “introns early” hypothesis proposes that introns existed in the earliest genes and helped assemble the first proteins by allowing small coding modules to be mixed and matched through recombination. Under this view, bacteria lost their introns over time as their genomes became streamlined for rapid reproduction. The “introns late” hypothesis counters that spliceosomal introns appeared only after eukaryotic cells evolved and have been accumulating ever since.

Current evidence supports a compromise. Spliceosomal introns likely evolved from group II self-splicing introns, which exist in small numbers in many bacteria. These group II introns probably entered the ancestral eukaryotic cell from the bacterium that eventually became the mitochondrion. Over hundreds of millions of years, they multiplied throughout the genome and lost their ability to self-splice, becoming dependent on the spliceosome instead.

How Introns Increase Protein Diversity

One of the most consequential things introns make possible is alternative splicing. Because exons are separate modules flanked by introns, the cell can mix and match which exons get included in the final RNA. A single gene can produce multiple different messenger RNAs by skipping certain exons, retaining certain introns, or choosing between alternative splice sites. Recent data show that the average human protein-coding gene contains about 11 exons and produces roughly 5.4 distinct messenger RNAs. Each of those can be translated into a slightly different protein variant with potentially different functions or tissue-specific roles.
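To illustrate how exon skipping alone multiplies the number of possible messages, the following sketch enumerates every transcript that keeps the first and last exon and skips any subset of the exons in between. The function name and the four-exon gene are hypothetical, and the model is deliberately simplified: it ignores intron retention and alternative splice sites, and real genes use only a small subset of the combinations that are possible in principle.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """Enumerate transcripts made by skipping any subset of internal exons.

    The first and last exons are always kept and exon order is preserved.
    Expects at least two exons.
    """
    first, *internal, last = exons
    isoforms = []
    # Start with the full-length transcript, then skip progressively more exons.
    for keep_count in range(len(internal), -1, -1):
        for kept in combinations(internal, keep_count):
            isoforms.append([first, *kept, last])
    return isoforms


toy_gene = ["E1", "E2", "E3", "E4"]  # hypothetical four-exon gene
isoforms = exon_skipping_isoforms(toy_gene)
for isoform in isoforms:
    print("-".join(isoform))
```

With n internal exons this scheme allows 2^n combinations, so an 11-exon gene would have 512 in theory. The observed average of roughly 5.4 messenger RNAs per gene shows how selective the cell is in practice.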

This is a major reason why the human body can produce far more proteins than it has genes. With roughly 20,000 protein-coding genes generating multiple variants each, alternative splicing is one of the primary mechanisms behind the complexity of human biology.

Other Functions Beyond Splicing

For decades, introns were dismissed as “junk DNA,” but they turn out to play several active roles in gene regulation. Some introns contain enhancer or silencer sequences that influence whether and how strongly a gene gets turned on. These regulatory elements can modulate the gene’s own promoter, the stretch of DNA that signals the start of transcription.

Introns also participate in a quality-control system called nonsense-mediated decay. Splicing leaves behind molecular markers at exon junctions, and the cell uses those markers to detect faulty RNA messages that contain premature stop signals. If a stop signal appears too early, with one of these markers still sitting downstream of it, the RNA is flagged and destroyed before it can produce a defective protein. This system was originally understood as an error-correction mechanism, but recent work suggests the cell also uses it deliberately to fine-tune normal gene expression levels.
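The logic of this check can be sketched in a few lines. The example below assumes a commonly cited rule of thumb that is not stated in this article: a stop signal is treated as premature when it lies more than about 50 nucleotides upstream of the last exon junction in the spliced message. The function name, the threshold parameter, and all positions are illustrative assumptions, and real cells apply the rule with exceptions.

```python
def likely_nmd_target(stop_codon_end, exon_junctions, min_distance=50):
    """Guess whether a spliced mRNA would be flagged for nonsense-mediated decay.

    All positions are in spliced-mRNA coordinates. `min_distance` encodes an
    assumed rule of thumb (about 50 nucleotides), not a value from the article.
    """
    if not exon_junctions:
        return False  # no junction markers downstream, so nothing to trigger decay
    last_junction = max(exon_junctions)
    return last_junction - stop_codon_end > min_distance


junctions = [120, 300, 480]  # hypothetical exon-exon junction positions

# Normal stop signal in the last exon, downstream of every junction.
print(likely_nmd_target(600, junctions))

# Premature stop signal in the second exon, far upstream of the last junction.
print(likely_nmd_target(200, junctions))
```

The first call models a normal message and the second a faulty one: only the second leaves a junction marker well downstream of the stop signal.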

When Intron Splicing Goes Wrong

Because splicing depends on precise recognition of short signal sequences at intron boundaries, even a single nucleotide change in those regions can disrupt the process. These splicing mutations can cause exons to be skipped, introns to be retained in the final RNA, or new splice sites to be created where none should exist. The result is an abnormal protein, or no functional protein at all.
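A small sketch shows how a single nucleotide change can destroy a splice signal. It relies on a well-established fact not spelled out above: most spliceosomal introns begin with GT and end with AG in the DNA sequence. The function names and the intron sequence are hypothetical, and checking only these two dinucleotides is a simplification, since real splice-site recognition depends on longer surrounding sequence as well.

```python
def has_canonical_splice_sites(intron):
    """Check for the GT...AG dinucleotides that bound most spliceosomal introns.

    The intron is given as DNA, from its first base to its last.
    """
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def apply_substitution(sequence, position, new_base):
    """Return a copy of `sequence` with one base replaced (0-based position)."""
    return sequence[:position] + new_base + sequence[position + 1:]


intron = "GTAAGTCTAACTTTCAG"                  # hypothetical intron sequence
mutant = apply_substitution(intron, 0, "A")   # G-to-A change at the first base

print(has_canonical_splice_sites(intron))
print(has_canonical_splice_sites(mutant))
```

The original intron passes the check and the mutant does not, even though the two sequences differ at a single position.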

Splicing mutations are a significant cause of inherited genetic disorders. They have been identified in genes linked to conditions like neurofibromatosis, cystic fibrosis, and various cancers. Some of these mutations sit right at the intron-exon boundary, while others, called deep intronic mutations, occur far within an intron and create false splice sites that trick the spliceosome. Advances in genetic sequencing have made these deep intronic variants easier to detect, revealing that splicing errors play a larger role in human disease than scientists initially appreciated.