What Type of Mutation Is Insertion or Deletion?

Insertions and deletions are their own class of mutation, distinct from substitutions (where one base simply swaps for another). Collectively called “indels,” they involve the addition or removal of one or more nucleotides in a DNA sequence. When an indel occurs in a protein-coding gene and the number of nucleotides isn’t a multiple of three, it creates a frameshift mutation, one of the most disruptive changes that can happen to a gene. When the indel is a multiple of three, it’s called an in-frame mutation, which adds or removes whole amino acids without disrupting the rest of the protein’s code.

How Indels Differ From Substitutions

The two major categories of small-scale DNA mutations are base substitutions and indels. A substitution swaps one nucleotide for another, like replacing an A with a G. An insertion or deletion, by contrast, changes the total length of the DNA sequence by adding or removing nucleotides. This difference matters because the two types arise through completely different molecular processes. Substitutions typically come from a wrong base being incorporated during DNA copying or from chemical damage that alters an existing base. Indels, on the other hand, generally arise from strand slippage during DNA replication or from breaks in both strands of the DNA helix.

After single nucleotide variants, small indels are the second most common form of genetic variation in humans. They range from a single nucleotide up to dozens of base pairs, though larger structural changes (involving thousands or millions of bases) are usually classified separately as chromosomal deletions or insertions.

Frameshift Mutations: The Most Common Outcome

Your cells read DNA in groups of three nucleotides called codons. Each codon specifies one amino acid in a protein. If you insert or delete one or two nucleotides (or any number that isn’t a multiple of three), every codon downstream of that change gets shifted out of alignment. This is a frameshift mutation, and it’s devastating to the resulting protein.

Consider a simple analogy. If your DNA reads THE CAT ATE THE RAT, deleting the first “C” turns it into THE ATA TET HER AT. Every “word” after the change becomes nonsense. The same thing happens to codons: from the point of the indel onward, the ribosome (the cell’s protein-building machinery) reads the wrong codons and therefore builds in the wrong amino acids. Worse, frameshifts almost always introduce a premature stop signal, cutting the protein short. When the cell detects a message carrying a premature stop signal, it often destroys the faulty RNA through a quality-control process called nonsense-mediated decay, leaving you with little or no functional protein from that gene.
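The sentence analogy can be reproduced in a few lines of Python. This is only an illustrative sketch: the helper names `read_in_triplets` and `delete_at` are made up for this example, and the "sequence" is the English sentence from the analogy, not real DNA.

```python
def read_in_triplets(sequence: str) -> list[str]:
    """Split a sequence into consecutive three-letter groups, left to right."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]


def delete_at(sequence: str, index: int, length: int = 1) -> str:
    """Return the sequence with `length` letters removed starting at `index`."""
    return sequence[:index] + sequence[index + length:]


sentence = "THECATATETHERAT"
print(" ".join(read_in_triplets(sentence)))  # THE CAT ATE THE RAT

# Delete the first "C": every group after that point is shifted by one letter.
shifted = delete_at(sentence, sentence.index("C"))
print(" ".join(read_in_triplets(shifted)))   # THE ATA TET HER AT
```

The groups before the deletion are untouched; everything after it is read in a new frame.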

Inserting or deleting two bases causes the same problem. The reading frame shifts, the downstream amino acid sequence changes entirely, and a premature stop codon usually appears. Most indels in protein-coding regions produce frameshifts, which is why indels have such a direct effect on whether a gene works or not.

In-Frame Mutations: Adding or Removing Whole Amino Acids

The exception is when the insertion or deletion involves exactly three nucleotides (or a multiple of three). In this case, you gain or lose one or more complete codons, and the reading frame stays intact. The rest of the protein is translated normally. These are called in-frame indels.
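To see the contrast between a frameshift and an in-frame deletion at the codon level, here is a minimal Python sketch. The 18-base `reference` sequence is invented for illustration, the codon table is only the small subset of the standard genetic code that this toy example needs, and the function names are hypothetical.

```python
# Subset of the standard genetic code: only the codons this toy example uses.
# "*" marks a stop codon.
CODON_TABLE = {
    "ATG": "M", "GCT": "A", "GAA": "E", "CTG": "L", "AAA": "K",
    "AAC": "N", "AAT": "N", "TAA": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate complete codons from the first base until a stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_bases(dna: str, start: int, length: int) -> str:
    """Return the sequence with `length` bases removed starting at `start`."""
    return dna[:start] + dna[start + length:]


reference = "ATGGCTGAACTGAAATAA"                 # ATG GCT GAA CTG AAA TAA
in_frame = delete_bases(reference, 6, 3)        # removes the whole GAA codon
frameshift = delete_bases(reference, 6, 1)      # removes only the G of GAA

print(translate(reference))   # MAELK
print(translate(in_frame))    # MALK
print(translate(frameshift))  # MAN
```

In this toy case the three-base deletion drops one amino acid (E) and leaves the rest intact, while the one-base deletion changes the next amino acid and then hits a premature stop codon (TGA).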

That doesn’t mean in-frame indels are harmless. Adding or removing even a single amino acid can wreck a protein’s three-dimensional shape. Proteins fold into specific structures, including tightly wound helices and flat sheets held together by precise networks of chemical bonds. Deleting one amino acid from a helix, for example, rotates the positions of neighboring amino acids, potentially pushing water-repelling parts of the protein to the outside where they don’t belong. This can cause the protein to misfold or clump together.

The most famous in-frame deletion in medicine is the F508del mutation that causes cystic fibrosis. It’s a deletion of exactly three base pairs, removing a single amino acid (phenylalanine) at position 508 of the CFTR protein. Even though only one amino acid is missing, the consequences are severe: the protein misfolds so badly that the cell’s quality control system recognizes it as defective and destroys it before it ever reaches the cell surface. The small fraction that does escape degradation has a shorter lifespan than normal and doesn’t open its chloride channel properly. This single three-base deletion accounts for roughly 70% of the disease-causing copies of the CFTR gene found in people with cystic fibrosis worldwide.

How Indels Happen at the Molecular Level

The most common cause of small indels is DNA replication slippage. When the enzyme copying your DNA encounters a stretch of repetitive sequence (like ACACAC), the newly made strand can slip backward or forward along the template. If it slips backward, an extra repeat gets inserted. If it slips forward, a repeat is deleted. This is why repetitive sequences in your genome are mutation hotspots for indels.
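The slippage rule described here (slip backward, gain a repeat; slip forward, lose one) can be written as a toy model in Python. This is a deliberately simplified sketch, not a simulation of a real polymerase, and the function name `slip` and its parameters are hypothetical.

```python
def slip(repeat_unit: str, copies: int, direction: str) -> str:
    """Return the repeat tract on the new strand after one slippage event."""
    if direction == "backward":
        copies += 1   # the same template repeat is copied twice: insertion
    elif direction == "forward":
        copies -= 1   # one template repeat is skipped: deletion
    else:
        raise ValueError("direction must be 'backward' or 'forward'")
    return repeat_unit * max(copies, 0)


template = "AC" * 3                 # ACACAC
print(slip("AC", 3, "backward"))    # ACACACAC
print(slip("AC", 3, "forward"))     # ACAC
```

Each event changes the length by one repeat unit. With a two-base unit like AC, that change is not a multiple of three, so the same slip inside a protein-coding region would shift the reading frame.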

A related process called indel slippage can produce very small insertions or deletions at random positions in the genome, even without obvious repetitive sequences. Double-strand breaks in DNA, when repaired imprecisely, can also introduce indels. And during the exchange of genetic material between chromosomes (recombination), unequal crossing over can produce insertions on one chromosome and deletions on the other.

Repeat Expansions: A Special Category

Some diseases are caused by a particular type of insertion where a short sequence of nucleotides gets copied over and over, expanding a repeat region beyond its normal length. Huntington’s disease is the classic example. The HTT gene normally contains a stretch of CAG repeats. Most people have fewer than 36 repeats and are unaffected. At 36 to 39 repeats, the disease may or may not develop. At 40 or more repeats, the mutation is considered fully penetrant: Huntington’s disease is expected to develop at some point in a normal lifespan. The repeated CAG codes for the amino acid glutamine, so the expanded protein contains an abnormally long string of glutamines that causes it to misfold and damage brain cells.
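The repeat thresholds in this paragraph can be expressed as a short Python sketch. The cutoffs (36 and 40) are the ones given in the text; the function names, the category labels, and the example sequence are assumptions made for illustration, and this is in no way a diagnostic tool.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Count the repeats in the longest uninterrupted run of CAG."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def htt_repeat_category(repeat_count: int) -> str:
    """Map a CAG repeat count to the ranges described in the text."""
    if repeat_count < 36:
        return "typical range"
    if repeat_count <= 39:
        return "may or may not develop"
    return "expected to develop"


# Hypothetical sequence: flanking bases around a run of 42 CAG repeats.
example = "GGC" + "CAG" * 42 + "CCG"
count = longest_cag_run(example)
print(count, htt_repeat_category(count))  # 42 expected to develop
```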

These trinucleotide repeat expansions are technically large insertions, but they’re often categorized separately because their mechanism (progressive expansion across generations through replication slippage) and their effects (a protein that gains a toxic function rather than simply losing its normal one) are distinct from typical indels.

Large-Scale Deletions and Insertions

At the other end of the size spectrum, deletions and insertions can span thousands or even millions of base pairs, removing or duplicating entire genes or clusters of genes. These are classified as structural variants or chromosomal abnormalities rather than point mutations. Microdeletion syndromes involve the loss of a small chromosomal segment, typically less than 2 million base pairs, that’s too small to see on a standard chromosome analysis but large enough to remove several genes at once.

More than 20 microdeletion syndromes have been identified. DiGeorge syndrome results from a deletion on chromosome 22, Prader-Willi and Angelman syndromes from deletions on chromosome 15, and Williams syndrome from a deletion on chromosome 7. These conditions produce complex, multi-system effects because losing a stretch of chromosome means losing multiple unrelated genes simultaneously. A deletion on chromosome 15, for instance, has been linked to intellectual disability, epilepsy, autism spectrum disorder, speech delay, and bipolar disorder, depending on the specific genes affected and other genetic factors.

Quick Size Reference

  • 1-2 nucleotides (or any other count not divisible by 3): Causes a frameshift in coding regions. Disrupts the entire downstream protein sequence.
  • 3 nucleotides (or multiples of 3): In-frame indel. Adds or removes whole amino acids without shifting the reading frame, though protein structure can still be severely affected.
  • Trinucleotide repeat expansions: Progressive insertions of three-base units. Cause diseases like Huntington’s when repeat counts cross specific thresholds.
  • Thousands to millions of bases: Classified as structural variants or microdeletions/microduplications. Can remove or duplicate entire genes.
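The size reference can be condensed into a small Python function. Treat it as a rough sketch: the text does not give an exact boundary between small indels and structural variants, so the 1,000-base cutoff below is an assumption, as are the function name and the labels it returns.

```python
# Assumed cutoff: the article only says "thousands to millions of bases".
STRUCTURAL_VARIANT_MIN_BP = 1_000


def describe_indel(length_bp: int, in_coding_region: bool = True) -> str:
    """Give a rough label for an insertion or deletion of `length_bp` bases."""
    if length_bp <= 0:
        raise ValueError("length_bp must be a positive number of bases")
    if length_bp >= STRUCTURAL_VARIANT_MIN_BP:
        return "structural variant"
    if not in_coding_region:
        return "non-coding indel"
    # Inside a coding region, only multiples of three keep the reading frame.
    return "in-frame indel" if length_bp % 3 == 0 else "frameshift indel"


for size in (1, 2, 3, 9, 2_000_000):
    print(size, describe_indel(size))
```

Length alone cannot identify a trinucleotide repeat expansion; this function would simply label one as an in-frame indel, because recognizing it requires looking at the sequence itself.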

The core principle across all these scales is the same: insertions and deletions change the length of a DNA sequence rather than swapping one base for another. Whether that change disrupts a reading frame, removes a critical amino acid, or deletes an entire gene depends on the size of the indel and where in the genome it lands.