What Happens When CpG Islands Are Unmethylated?

When CpG islands are unmethylated, genes are available for transcription. This is the default state for most CpG islands in healthy cells: roughly 95% of those located at gene promoters remain unmethylated throughout normal development. That unmethylated status acts as a green light, allowing the cellular machinery to access DNA and read genetic instructions. When methylation is added to these regions, it functions like a lock, silencing gene activity.

How Unmethylated CpG Islands Enable Gene Expression

CpG islands are short stretches of DNA with an unusually high concentration of CpG sites, places where a cytosine is followed directly by a guanine on the same strand. They sit at the promoters of about 70% of all human genes. In their unmethylated state, these islands are recognized by specialized proteins that contain a structural feature called a ZF-CxxC domain. These proteins bind specifically to non-methylated CpG sequences and tend to cluster just downstream of the point where transcription begins, right where CpG density is highest.
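To make "unusually high concentration" concrete, here is a minimal Python sketch of one commonly cited working definition of a CpG island: at least 200 bases long, more than 50% G+C, and an observed-to-expected CpG ratio above 0.6. These thresholds are an assumption brought in for illustration and do not come from this article. The function names are hypothetical.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC content and observed/expected CpG ratio for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact number of CpG sites.
    cpg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: (C * G) / N.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"length": n, "gc_content": gc_content, "obs_exp": obs_exp}


def looks_like_cpg_island(seq: str,
                          min_len: int = 200,
                          min_gc: float = 0.5,
                          min_obs_exp: float = 0.6) -> bool:
    """Apply the assumed thresholds (illustrative defaults, not from the article)."""
    s = cpg_stats(seq)
    return (s["length"] >= min_len
            and s["gc_content"] > min_gc
            and s["obs_exp"] > min_obs_exp)


if __name__ == "__main__":
    # Artificial sequences chosen only to exercise the function.
    print(looks_like_cpg_island("CG" * 100))  # dense in CpG sites
    print(looks_like_cpg_island("AT" * 100))  # no C or G at all
```

Real island-calling tools scan a genome with a sliding window and merge neighboring hits. This sketch only scores a single sequence that has already been cut out.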

Once bound, these proteins recruit a complex (called SET1) that adds a chemical tag to nearby histone proteins, the spools around which DNA is wound. This tag, three methyl groups attached to lysine 4 of histone H3 (abbreviated H3K4me3), keeps the chromatin in an open, accessible configuration. Open chromatin is essential for the transcription machinery to physically reach the DNA and begin copying it into RNA. In this way, unmethylated CpG islands don’t just passively allow transcription. They actively shape the local environment to promote it.

Recent work published in Nature Communications revealed another layer to this system. The protein complex that reads unmethylated CpG islands also protects active genes from being prematurely shut down. Without this protection, a termination complex called ZC3H4/WDR82 would cut RNA transcripts short before they’re finished, effectively silencing the gene even though it was technically “on.” Unmethylated CpG islands, then, serve a dual role: they help initiate transcription and ensure it runs to completion.

Transcription Factors Need Unmethylated DNA

Many transcription factors, the proteins that switch specific genes on or off, can only bind to unmethylated DNA. When methyl groups are added to CpG sites within their binding sequences, the shape and chemistry of the DNA change enough to block recognition. This isn’t a subtle effect. In experiments where all three DNA methyltransferases were deleted from mouse embryonic stem cells, thousands of new transcription factor binding sites appeared at locations that had previously been methylated and inaccessible.

One well-studied example is NRF1, a transcription factor that exclusively binds to regions with low methylation. In normal cells, NRF1 is completely absent from heavily methylated regions. When methylation was experimentally removed, NRF1 moved in and activated genes at those newly opened sites. Another example is CTCF, a protein critical for controlling which genes are read from which copy of a chromosome (a process called imprinting). CTCF binds only to the unmethylated copy of a control region, blocking nearby genes from being activated. On the methylated copy, CTCF can’t attach, so those genes turn on. This elegant system allows the same cell to treat its maternal and paternal gene copies differently, all based on methylation status.

Housekeeping Genes Depend on Permanent Unmethylation

Your cells contain a large category of genes that must stay active in every tissue, at every stage of life. These housekeeping genes handle fundamental tasks: building proteins, processing RNA, copying DNA, and maintaining basic cellular metabolism. Their promoters are rich in CpG sites that remain unmethylated across all cell types and tissues. This isn’t coincidental. Analysis of methylation data from dozens of healthy human cell types confirms that housekeeping gene promoters sit within dense clusters of permanently unmethylated CpGs, a pattern far too consistent to occur by chance.

Over 50% of genes with unmethylated promoters in embryonic stem cells are involved in transcription machinery, protein production, RNA processing, and other survival-critical functions. The unmethylated state of their promoters is one of the most reliable indicators that a gene will be actively expressed.

The Role in Embryonic Development

In early embryos, unmethylated CpG islands help maintain pluripotency, the ability of stem cells to become any cell type. Pluripotency genes and housekeeping genes both fall into the unmethylated category in embryonic stem cells. But the system is more nuanced than a simple on/off switch. Some developmental genes have promoters that carry both activating marks (H3K4 trimethylation) and repressive marks (H3K27 trimethylation) simultaneously. This “bivalent” state keeps the gene silent but poised, ready to activate quickly when the cell receives a signal to specialize into a particular tissue. The CpG island itself remains unmethylated during this poised state, preserving the option to turn the gene on later.

Genes that need to be permanently silenced during differentiation, such as those belonging to a different cell lineage, can eventually acquire DNA methylation at their CpG islands. But this is a more committed, harder-to-reverse form of silencing than histone modifications alone.
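The combinations of marks described in this section can be read as a small decision table. The Python sketch below encodes only the cases the text mentions. It is a deliberate simplification: real chromatin states are quantitative and depend on context, and the names `PromoterState` and `classify_promoter` are hypothetical.

```python
from enum import Enum


class PromoterState(Enum):
    ACTIVE = "active"
    POISED = "poised (bivalent)"
    STABLY_SILENCED = "stably silenced"
    UNDETERMINED = "not covered by the text"


def classify_promoter(cpg_methylated: bool,
                      h3k4me3: bool,
                      h3k27me3: bool) -> PromoterState:
    """Map three yes/no marks onto the states described in the text."""
    # DNA methylation of the island is the committed, hard-to-reverse form of silencing.
    if cpg_methylated:
        return PromoterState.STABLY_SILENCED
    # Unmethylated island carrying both marks: silent but ready to activate.
    if h3k4me3 and h3k27me3:
        return PromoterState.POISED
    # Unmethylated island with the activating mark only.
    if h3k4me3:
        return PromoterState.ACTIVE
    # Any other combination is outside what the text describes.
    return PromoterState.UNDETERMINED


if __name__ == "__main__":
    print(classify_promoter(cpg_methylated=False, h3k4me3=True, h3k27me3=True).value)
```

Checking DNA methylation first mirrors the point made above: once the island itself is methylated, the histone marks matter less because the silencing is harder to undo.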

What Happens When CpG Islands Gain Methylation in Cancer

In healthy cells, promoter CpG islands are overwhelmingly unmethylated. Cancer disrupts this pattern. A hallmark of many tumors is hypermethylation, the inappropriate addition of methyl groups to CpG islands that should be open. This silences tumor suppressor genes, the genes responsible for slowing cell growth, repairing DNA damage, and triggering cell death when something goes wrong. With those safeguards turned off, cells can proliferate unchecked.

Methylome sequencing of prostate and breast cancers has shown that this silencing often begins at the edges of CpG islands and creeps inward, a process researchers call methylation encroachment. In normal cells, a specific histone mark (H3K4 monomethylation) at CpG island borders acts as a boundary, keeping methylation out. In cancer, those boundaries erode. The pattern of border marks in normal tissue can actually predict which CpG islands are most vulnerable to hypermethylation in tumors.

Intragenic CpG islands, those located within gene bodies rather than at promoters, are more susceptible to methylation even in healthy cells. About 65% of intragenic CpG islands carry significant methylation in normal tissue, compared to only about 5% of promoter CpG islands. This distinction matters because intragenic methylation doesn’t silence genes the same way promoter methylation does. It may instead help regulate alternative versions of a gene or suppress internal start sites.

Restoring the Unmethylated State With Treatment

Because abnormal CpG island methylation drives gene silencing in cancer, drugs that cause that methylation to be lost have become a therapeutic strategy. DNA methyltransferase inhibitors do not remove methyl groups directly. They work by mimicking cytosine, one of the four DNA bases. During DNA replication, these drug molecules get incorporated into new DNA strands in place of real cytosine. When the cell’s methylation enzymes try to add methyl groups to these impostor bases, they get trapped and degraded. With the enzymes depleted, newly copied strands are left unmethylated, so over successive rounds of cell division methylation is progressively lost from CpG islands that had been improperly silenced.
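The phrase "over successive rounds of cell division" can be illustrated with a toy dilution model in Python. It assumes that each replication pairs an old strand, which keeps its methylation, with a new strand that is methylated only as far as the maintenance enzymes still work. The `maintenance_efficiency` parameter and the function name are hypothetical and not taken from any study. Real cells also have active demethylation and de novo methylation, which this sketch ignores.

```python
def methylation_after_divisions(initial: float,
                                divisions: int,
                                maintenance_efficiency: float = 0.0) -> list[float]:
    """Return the average methylation level after each division.

    initial: starting fraction of methylated CpGs, between 0 and 1.
    maintenance_efficiency: fraction of methylation copied onto each new
        strand (1.0 = untreated ideal, 0.0 = enzymes fully trapped).
    """
    if not 0.0 <= initial <= 1.0:
        raise ValueError("initial must be between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")

    level = initial
    history = [level]
    for _ in range(divisions):
        # Old strands keep their marks; new strands receive only the copied share.
        level = level * (1.0 + maintenance_efficiency) / 2.0
        history.append(level)
    return history


if __name__ == "__main__":
    # With maintenance fully blocked, the level halves at every division.
    for n, level in enumerate(methylation_after_divisions(0.8, 4)):
        print(f"division {n}: {level:.3f}")
```

Under these assumptions, even a complete block of maintenance leaves half of the methylation after one division, which is one way to see why the effect builds up only across several cell cycles.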

The result is reactivation of tumor suppressor genes. Two such drugs, azacitidine and decitabine, are FDA-approved and have shown the clearest benefit in blood cancers. In solid tumors, translating this approach has been more difficult, partly because the drugs need multiple rounds of cell division to work and solid tumors are harder to reach. Researchers are also exploring combinations with other therapies, since demethylation can make tumor cells more visible to the immune system.

Conservation Across 450 Million Years

The system of unmethylated CpG islands is not unique to humans. Studies profiling non-methylated DNA across seven vertebrate species found that the core properties of this system are deeply conserved. Unmethylated islands appear at the same genes across species separated by more than 450 million years of evolution, from fish to mammals. This level of conservation suggests that keeping CpG islands free from methylation is fundamental to how vertebrates regulate their genomes. It is not a recent adaptation but an ancient strategy for distinguishing active regulatory elements from the vast methylated background of vertebrate DNA.