What Is a Cryptic Splice Site and How Does It Work?

A cryptic splice site is a sequence in your DNA that looks like a normal splice site but stays silent under ordinary conditions. It typically gets used only after a mutation, most often one that disrupts the real (canonical) splice site nearby, pushes the cell’s splicing machinery to latch onto the cryptic one instead. The result is an incorrectly processed messenger RNA, which often produces a defective protein or no functional protein at all.

To understand why this matters, it helps to know how genes normally work. Your genes contain coding regions (exons) interrupted by non-coding stretches (introns). Before a gene’s instructions can be used to build a protein, the cell has to cut out the introns and stitch the exons together. It finds the right cutting points by recognizing short sequence patterns at the boundaries, most commonly the letters GT at the start of an intron and AG at the end. Cryptic splice sites match these same patterns closely enough to function as cutting points, but they sit in locations the cell would normally ignore.
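
A minimal Python sketch shows how common these two-letter patterns are. It scans a short, made-up DNA string and lists every GT and AG it finds. The sequence and the function name find_candidate_sites are illustrative assumptions, not real gene data or an established tool.

```python
from __future__ import annotations


def find_candidate_sites(seq: str) -> dict[str, list[int]]:
    """Return 0-based positions of every GT (donor-like) and AG (acceptor-like) dinucleotide."""
    seq = seq.upper()
    positions = range(len(seq) - 1)
    return {
        "donor_like": [i for i in positions if seq[i:i + 2] == "GT"],
        "acceptor_like": [i for i in positions if seq[i:i + 2] == "AG"],
    }


# Hypothetical toy sequence, invented for illustration only.
toy = "ATGGCCAAGGTAAGTCTTGTCCTTTCAGGATTAG"
sites = find_candidate_sites(toy)
print(sites)
```

Even this 34-letter toy string contains several GT and AG dinucleotides, more than one intron needs, which is why the two-letter pattern alone cannot be what tells the cell where to cut.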

Why Cryptic Sites Stay Silent

Your genome contains thousands of sequences that resemble splice sites. The reason the cell picks the correct ones comes down to competitive strength. Authentic splice sites match the ideal consensus sequence more closely than cryptic ones do, giving them a measurable advantage in attracting the molecular machinery (called the spliceosome) that carries out the cutting and stitching. Quantitative scoring of splice site strength shows a clear hierarchy: authentic sites score highest, cryptic sites score in the middle, and mutated versions of authentic sites score lowest.
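
The idea of competitive strength can be sketched with a toy scoring function. The example below compares a nine-letter donor site against the widely cited donor consensus CAG|GTAAGT and gives extra weight to the GT dinucleotide. The weights, the three example sequences, and the name toy_donor_score are assumptions chosen for illustration; real tools such as MaxEntScan use statistical models trained on many known splice sites, not a simple weighted match count.

```python
# Donor consensus: last 3 exon letters + first 6 intron letters.
CONSENSUS_DONOR = "CAGGTAAGT"
# Arbitrary illustrative weights: the near-invariant GT counts most.
WEIGHTS = (1, 1, 1, 6, 6, 1, 1, 1, 1)


def toy_donor_score(site: str) -> int:
    """Weighted count of positions that match the donor consensus."""
    site = site.upper()
    if len(site) != len(CONSENSUS_DONOR):
        raise ValueError("expected a 9-letter donor site")
    return sum(w for s, c, w in zip(site, CONSENSUS_DONOR, WEIGHTS) if s == c)


# Hypothetical sequences, invented for illustration only.
authentic = "AAGGTAAGT"         # close match to the consensus
cryptic = "TTGGTGCGA"           # has GT but weaker flanking letters
mutated_authentic = "AAGATAAGT"  # authentic site with its G changed to A

for name, seq in [("authentic", authentic),
                  ("cryptic", cryptic),
                  ("mutated authentic", mutated_authentic)]:
    print(name, toy_donor_score(seq))
```

Under these assumed weights the authentic site scores highest, the cryptic site lands in the middle, and the mutated authentic site drops below the cryptic one, which is the situation in which the spliceosome would be expected to switch sites.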

Beyond raw sequence strength, proteins and small RNA molecules that bind to the surrounding pre-mRNA also help guide the spliceosome to the correct location. These specificity factors can be critical for distinguishing authentic splice sites from cryptic or “pseudo” splice sites that happen to sit nearby. Together, sequence strength and these helper molecules keep cryptic sites suppressed in healthy cells.

What Activates a Cryptic Splice Site

The most common trigger is a mutation that weakens or destroys the canonical splice site. When the authentic GT or AG dinucleotide at an intron boundary is altered, the spliceosome can no longer recognize that location efficiently. It then defaults to the next-best option, which is often a nearby cryptic site. Other mutations activate a cryptic site without touching the canonical one. The main categories are:

Point mutations at splice junctions. A single-letter change in the GT or AG dinucleotide that defines the start or end of an intron is the most frequent cause. With the authentic signal gone, the spliceosome selects a cryptic site instead.
Deep intronic mutations. A change buried hundreds or thousands of base pairs inside an intron can create a brand-new splice site or activate a previously silent one. These are easy to miss in standard genetic testing because they fall far from the exon boundaries where labs typically look.
Insertions and deletions. Small structural changes in repetitive DNA sequences represent an underappreciated source of cryptic splice site activation. These variants can also be difficult to detect with routine screening methods.
Mutations in regulatory elements. Some mutations don’t touch a splice site directly but instead disrupt nearby sequences that normally suppress cryptic sites, effectively removing the brakes.

What Happens to the Protein

When a cryptic splice site is activated, the messenger RNA that gets produced is abnormal. It may include extra intronic sequence that shouldn’t be there (a “pseudoexon” when a whole new segment is spliced in), it may lose part of an exon, or it may skip an exon entirely. Unless the number of letters gained or lost is a multiple of three, the reading frame of the genetic code is disrupted.

A shifted reading frame typically generates a premature stop signal in the mRNA. The cell has a quality-control system called nonsense-mediated decay that recognizes most of these faulty messages and destroys them before they can be translated into protein. The result is little or no protein produced from that gene. When the premature stop falls in the last exon of the gene, or very close to the final exon-exon junction, the surveillance system usually fails to catch it. The mRNA survives and gets translated into a truncated, often nonfunctional protein that can sometimes be toxic to the cell.
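
The logic of this paragraph can be sketched in a few lines of Python. The example builds a normal and an aberrant transcript from made-up exon strings, finds the first in-frame stop codon, and applies the simplified last-exon rule for escaping nonsense-mediated decay. The exon sequences, the five-letter pseudoexon, and the function names are all hypothetical, and the sketch assumes the transcript begins at the start codon.

```python
from __future__ import annotations

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_index(mrna: str) -> int | None:
    """0-based index of the first in-frame stop codon, reading from position 0."""
    mrna = mrna.upper()
    for i in range(0, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


def escapes_nmd(stop_index: int, exon_lengths: list[int]) -> bool:
    """Simplified rule: a stop codon that starts in the last exon escapes decay."""
    last_exon_start = sum(exon_lengths[:-1])
    return stop_index >= last_exon_start


# Hypothetical toy exons, invented for illustration only.
normal_exons = ["ATGGCCAAG", "GATCTGAAA", "GGCTAA"]
pseudoexon = "TTGAC"  # 5 letters, so it shifts the reading frame
aberrant_exons = [normal_exons[0], pseudoexon, *normal_exons[1:]]

for label, exons in (("normal", normal_exons), ("aberrant", aberrant_exons)):
    stop = first_stop_index("".join(exons))
    if stop is not None:
        fate = "escapes decay" if escapes_nmd(stop, [len(e) for e in exons]) else "likely degraded"
        print(label, "first stop at", stop, "->", fate)
```

In the toy data the normal transcript’s only stop sits in the last exon, as expected for an ordinary stop codon, while the pseudoexon shifts the frame and exposes an earlier stop upstream of the last exon, marking that message for destruction.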

Cryptic Splicing in Cancer

Cryptic splice site activation plays a meaningful role in inherited cancer predisposition. A large study of splice-altering mutations examined 120 candidate mutations found in 150 families. The mutations spanned 16 tumor suppressor genes, including BRCA1, BRCA2, ATM, PALB2, TP53, and APC.

The mechanisms can be surprisingly complex. In one family, a single mutation at the boundary of BRCA1 exon 15 weakened the canonical splice site enough to activate a cryptic acceptor site within the exon itself, producing two different abnormal transcripts in varying proportions. In another case, a two-letter deletion deep inside an intron of the APC gene (a colon cancer gene) disrupted a regulatory element that had been keeping a cryptic site quiet. With the silencer gone, two cryptic splice sites activated simultaneously, pulling 165 base pairs of intronic sequence into the mRNA and creating a premature stop codon. A mutation in the MLH1 gene (linked to Lynch syndrome) created a new splice site that in turn activated two previously silent acceptor sites, inserting entirely new exons that don’t appear in normal transcripts. One MSH2 mutation destroyed a splicing branch point and activated multiple cryptic sites, including one located more than 30,000 base pairs away.

Beta-Thalassemia as a Classic Example

Beta-thalassemia, a blood disorder caused by reduced production of hemoglobin, is one of the best-studied examples of cryptic splice site disease. More than 500 mutations can cause beta-thalassemia, and splicing defects are the most common category. One of the most prevalent mutations in Han Chinese populations is a single C-to-T change at position 654 in the second intron of the beta-globin gene. This point mutation creates a new donor splice sequence, which in turn activates a cryptic acceptor site upstream at position 579. The result is 73 extra base pairs of intronic sequence inserted between exons 2 and 3, producing a non-functional mRNA and severely reducing hemoglobin production.
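
Why 73 extra base pairs are so damaging comes down to simple arithmetic: the genetic code is read three letters at a time, so any insertion whose length is not a multiple of three shifts every codon downstream. The short sketch below checks this for the two insertion sizes quoted in this article. The function name frame_effect is a hypothetical helper, not part of any library.

```python
def frame_effect(inserted_nt: int) -> str:
    """Describe what an insertion of the given length does to the downstream reading frame."""
    if inserted_nt < 0:
        raise ValueError("insertion length must be non-negative")
    shift = inserted_nt % 3
    return "in-frame" if shift == 0 else f"frameshift (+{shift})"


# Insertion sizes taken from the text: beta-globin (73) and APC (165).
for gene, size in [("beta-globin IVS2-654", 73), ("APC deep-intronic deletion", 165)]:
    print(f"{gene}: {size} nt -> {frame_effect(size)}")
```

The 73-letter beta-globin insertion leaves a remainder of one, so everything after it is read out of frame. The 165-letter APC insertion is a multiple of three, so by itself it would not shift the downstream frame, which fits the description of that case as one where the inserted sequence introduces a premature stop codon.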

Predicting Cryptic Splice Sites

Identifying cryptic splice sites before they cause problems is a growing area of clinical genetics, especially as whole-genome sequencing becomes more common and reveals variants in deep intronic regions that were previously untested. Several computational tools exist to predict whether a given mutation will activate a cryptic site.

A head-to-head comparison of eight prediction tools found that SpliceAI, a deep-learning algorithm, substantially outperformed the others for cryptic splice site prediction. It achieved an area under the curve of 0.972 and a sensitivity of 95.7% at its recommended score threshold. SpliceRover came in second with 76.9% sensitivity, while older tools like MaxEntScan, NNSplice, and Human Splicing Finder all fell below 70%. The gap matters clinically: a tool that misses 30% or more of cryptic splice events will leave many patients without a molecular diagnosis.
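
Sensitivity here means the fraction of genuine cryptic splice events that a tool flags at its chosen score threshold. The sketch below shows the calculation on ten invented scores. The scores, the 0.2 cutoff, and the function names are placeholders for illustration and are not taken from the comparison study or from any tool’s output.

```python
def call_variants(scores: list[float], threshold: float) -> list[bool]:
    """Flag a variant as splice-altering when its score meets the threshold."""
    return [s >= threshold for s in scores]


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real events that were detected."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("need at least one real event")
    return true_positives / total


# Hypothetical scores for ten variants already known to activate a cryptic site.
known_positive_scores = [0.91, 0.45, 0.12, 0.78, 0.33, 0.05, 0.66, 0.25, 0.81, 0.52]
THRESHOLD = 0.2  # placeholder cutoff, assumed for this example

calls = call_variants(known_positive_scores, THRESHOLD)
tp = sum(calls)
fn = len(calls) - tp
sens = sensitivity(tp, fn)
print(f"detected {tp} of {len(calls)}; sensitivity = {sens:.1%}; missed = {1 - sens:.1%}")
```

The missed fraction is simply one minus the sensitivity, which is why a tool with sensitivity below 70% leaves more than 30% of real events undetected.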

Therapeutic Approaches

One of the most promising strategies for treating diseases caused by cryptic splicing uses short synthetic molecules called antisense oligonucleotides (ASOs). These are designed to bind directly to the cryptic splice site on the pre-mRNA, physically blocking the spliceosome from using it and redirecting splicing back to the correct location. The approach is particularly well suited to the many inherited metabolic diseases caused by a single gene defect, where restoring even partial normal splicing can be enough to produce functional protein.

The landmark case is milasen, a 22-letter ASO created for a single patient with Batten disease, a fatal neurodegenerative condition. In this patient, a genetic insertion in the MFSD8 gene introduced a cryptic splice-acceptor site in intron 6, leading to inclusion of a pseudoexon and premature termination of the protein. Milasen was designed to block that specific cryptic site and the adjacent regulatory sequence that was helping to activate it. The FDA permitted its use in 2018 as the first “N-of-1” investigational treatment of its kind. Similar ASO strategies are now in development for Pompe disease, Fabry disease, Hunter syndrome, and more than a dozen other inherited metabolic conditions.
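
The basic design principle, an oligonucleotide whose letters pair with the target stretch of pre-mRNA, can be sketched as a reverse-complement calculation. The example below picks a 22-letter window around a cryptic AG in a made-up RNA string and derives the antisense sequence. The RNA string, the site position, and the function names are hypothetical. Real ASO design also involves chemical modifications, screening of many candidate windows, and safety testing, none of which is modeled here, and this is not the milasen sequence.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def target_window(pre_mrna: str, site_index: int, length: int = 22) -> str:
    """Return a window of `length` letters roughly centred on site_index."""
    if len(pre_mrna) < length:
        raise ValueError("sequence shorter than requested window")
    start = max(0, min(site_index - length // 2, len(pre_mrna) - length))
    return pre_mrna[start:start + length]


def antisense_rna(target: str) -> str:
    """Reverse complement of an RNA target, written 5' to 3'."""
    return target.upper().translate(_COMPLEMENT)[::-1]


# Hypothetical pre-mRNA fragment; the cryptic acceptor AG starts at index 16.
toy_pre_mrna = "GCAUUCUUUCCUUUUCAGGUACCGAUGGCAUUGA"
cryptic_site_index = 16

window = target_window(toy_pre_mrna, cryptic_site_index)
aso = antisense_rna(window)
print("target window:", window)
print("antisense    :", aso)
```

Because every letter of the antisense strand pairs with the target, the bound oligonucleotide physically covers the cryptic site, which is the blocking effect described above.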

Gene editing offers another angle. In beta-thalassemia mouse models, researchers have used editing tools to directly modify the cryptic splice site sequence in the DNA itself, permanently preventing its activation and restoring normal RNA splicing of the beta-globin gene. This approach targets the root cause at the genomic level rather than intercepting the problem at the RNA stage.