CpG DNA refers to locations in a DNA strand where a cytosine nucleotide sits directly next to a guanine nucleotide, joined by a single phosphate group in the strand’s backbone. The “C” stands for cytosine, the “p” for the phosphate linkage, and the “G” for guanine. These two-letter sequences are small, but they play an outsized role in how genes get switched on or off, how your immune system detects infections, and how diseases like cancer develop.
The Basic Structure of a CpG Site
DNA is built from four chemical letters: adenine (A), thymine (T), cytosine (C), and guanine (G). A CpG site is simply a spot where C appears immediately before G when you read the DNA strand in its natural direction (called 5′ to 3′). The phosphate group between them is just the normal backbone connector that links all DNA letters together, but scientists include the “p” in the name to clarify that the C and G are neighbors on the same strand, not paired across the double helix.
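As a small illustration of the definition above, here is a minimal Python sketch that scans one strand, written 5′ to 3′, and reports every position where C is immediately followed by G. The function name `find_cpg_sites` and the example sequence are made up for this illustration.

```python
def find_cpg_sites(seq: str) -> list[int]:
    """Return the 0-based position of every C that is immediately followed by G.

    The sequence is assumed to be a single strand written 5' to 3'.
    """
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


# Hypothetical example sequence. Note that "GC" (G before C) is not a CpG site.
example = "ATCGGCGTACG"
print(find_cpg_sites(example))  # [2, 5, 9]
```

The order matters: the G at position 4 followed by the C at position 5 is a GpC, not a CpG, so it is not counted.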
What makes CpG sites biologically special is that the cytosine at these locations can be chemically modified. Enzymes called DNA methyltransferases attach a small chemical tag, a methyl group, to the cytosine’s ring structure. This converts regular cytosine into 5-methylcytosine. That tiny addition changes how the surrounding DNA behaves without altering the genetic code itself.
CpG Islands and Gene Control
CpG sites aren’t spread evenly across the genome. They tend to cluster in dense patches called CpG islands, which often sit near the starting points of genes. By a commonly used definition, a stretch of DNA qualifies as a CpG island when it meets three criteria: it spans at least 200 base pairs, more than half of its letters are C or G, and CpG dinucleotides appear at least 60% as often as you’d predict from the amounts of C and G in that stretch.
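The three criteria can be checked with a few lines of Python. This is a minimal sketch, not a replacement for a real CpG island finder: it tests one fixed stretch of DNA rather than sliding a window along a genome, and it assumes the usual way of computing the expected CpG count, (number of C × number of G) / length. The function names are invented for this example.

```python
def cpg_island_stats(seq: str) -> tuple[int, float, float]:
    """Return (length, GC fraction, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    observed_cpg = seq.count("CG")

    gc_fraction = (c_count + g_count) / length if length else 0.0
    # Assumed formula: expected CpG count if C and G were placed independently.
    expected_cpg = (c_count * g_count) / length if length else 0.0
    ratio = observed_cpg / expected_cpg if expected_cpg else 0.0
    return length, gc_fraction, ratio


def looks_like_cpg_island(seq: str, min_length: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the three criteria: length, GC content, observed/expected CpG."""
    length, gc_fraction, ratio = cpg_island_stats(seq)
    return length >= min_length and gc_fraction > min_gc and ratio >= min_ratio


# Toy sequences for illustration only; real islands are not simple repeats.
print(looks_like_cpg_island("CG" * 100))  # True
print(looks_like_cpg_island("AT" * 100))  # False
```

The thresholds are exposed as parameters because stricter definitions exist; the defaults simply mirror the numbers given above.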
More than 50% of human genes have a CpG island at their promoter, the region that controls whether the gene is active. When the cytosines in a CpG island remain untagged (unmethylated), the gene is generally available to be read and expressed. When methylation tags accumulate across the island, the gene is typically silenced: the tags can keep regulatory proteins from binding and attract other proteins that pack the DNA tightly, so the cellular machinery can no longer read the gene. This is one of the body’s primary tools for controlling which genes are active in which cells, and it’s a core mechanism of epigenetics, the system that regulates gene activity without changing the DNA sequence.
Why Vertebrate Genomes Have Fewer CpG Sites
If CpG dinucleotides appeared purely by chance, you’d expect to find them at a frequency set by how common C and G are individually. In vertebrates, including humans, they’re significantly rarer than predicted. The reason traces back to methylation itself. Methylated cytosine is prone to spontaneous deamination, a chemical change that turns it into thymine. Because thymine is a normal DNA letter, the change is harder for repair systems to catch than damage to unmethylated cytosine, so it is more likely to persist as a C-to-T mutation. Because most CpG sites outside of CpG islands are methylated, millions of years of this quiet mutation have gradually erased CpG pairs from much of the genome.
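The erosion described above can be illustrated with a toy simulation. This sketch makes strong simplifying assumptions that are not claims about real genomes: every CpG is treated as methylated, only one strand is tracked, each round converts the C of a CpG to T with a fixed made-up probability, and no other mutations occur. The function name, the mutation rate, and the starting sequence are all hypothetical.

```python
import random


def deaminate_methylated_cpgs(seq: str, rate: float, rng: random.Random) -> str:
    """Toy model: turn the C of each CpG into T with probability `rate`."""
    bases = list(seq.upper())
    for i in range(len(bases) - 1):
        if bases[i] == "C" and bases[i + 1] == "G" and rng.random() < rate:
            bases[i] = "T"  # CpG becomes TpG, so the CpG site is lost
    return "".join(bases)


rng = random.Random(0)            # fixed seed so the run is repeatable
seq = "ACGTCGGACGTTCGA" * 20      # made-up starting sequence
print("start", seq.count("CG"))
for round_number in range(1, 6):
    seq = deaminate_methylated_cpgs(seq, rate=0.2, rng=rng)
    print(round_number, seq.count("CG"))
```

Because a C-to-T change never creates a new CpG in this model, the count can only stay the same or fall from one round to the next, which is the one-way ratchet behind the depletion.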
Many invertebrate genomes, which carry less CpG methylation, show much less of this depletion. Bacterial and plant genomes vary widely, with some species having more CpG sites than expected and others fewer. This difference between vertebrate and microbial DNA turns out to be immunologically important.
How Your Immune System Uses CpG DNA
Your immune system exploits the scarcity of unmethylated CpG in human DNA as a way to detect foreign invaders. Bacterial DNA, and the DNA of many viruses, carries far more unmethylated CpG motifs than human DNA does. A receptor called TLR9, found in internal compartments of certain immune cells, recognizes these unmethylated CpG sequences and triggers an alarm.
When TLR9 encounters DNA rich in unmethylated CpG, it activates a cascade that ramps up the innate immune response: the body’s rapid, first-line defense. Interestingly, TLR9 itself doesn’t appear to be very selective about which DNA it physically binds. Instead, the discrimination seems to happen through additional recognition steps and through where in the cell the DNA is encountered. The practical result is the same: microbial DNA sets off the alarm, while your own heavily methylated DNA generally does not.
CpG Methylation and Cancer
In healthy cells, methylation patterns are carefully maintained. Cancer disrupts this balance. A pattern called the CpG island methylator phenotype, or CIMP, describes tumors where CpG islands across many gene promoters become abnormally methylated. This widespread methylation silences tumor suppressor genes, the genes whose job is to keep cell growth in check. With those brakes disabled, cells can divide uncontrollably.
CIMP was first characterized in colorectal cancer, where it defines a distinct subgroup with different clinical behavior and outcomes compared to other colorectal tumors. The same methylation pattern has since been identified in gastric, lung, liver, ovarian, breast, and endometrial cancers, as well as glioblastomas and certain leukemias. Recognizing CIMP in a tumor can influence how aggressively it’s treated and what therapies are chosen, because these epigenetically driven cancers sometimes respond differently than cancers caused by direct DNA mutations.
Synthetic CpG in Vaccines and Immunotherapy
Because unmethylated CpG DNA naturally activates the immune system, scientists have designed synthetic CpG sequences, called CpG oligodeoxynucleotides (CpG-ODN), to harness that effect on purpose. The most prominent real-world application is in Heplisav-B, an FDA-approved hepatitis B vaccine. It contains a CpG-based adjuvant called CpG 1018 that activates TLR9 in a type of immune cell known as a plasmacytoid dendritic cell. These cells then mature into antigen-presenting cells that train the rest of the immune system to recognize the hepatitis B virus, producing a stronger and faster antibody response than older hepatitis B vaccines that use aluminum-based adjuvants.
Three classes of synthetic CpG-ODN exist, each triggering a different immune profile. Class A strongly stimulates the production of interferon-alpha, a powerful antiviral signaling molecule, and activates natural killer cells. Class B primarily activates B cells, the immune cells that produce antibodies, but generates little interferon. Class C combines both properties, making it a potent stimulator of interferon production, natural killer cell activation, and direct B cell stimulation. Researchers select the class based on what kind of immune response they want to provoke.
CpG in Cancer Treatment
CpG-ODN compounds have also been tested as cancer immunotherapies, with the goal of waking up the immune system to attack tumors. Used alone, they’ve shown limited effectiveness in clinical trials. The more promising approach combines CpG-ODN with other treatments: cancer vaccines, radiation, chemotherapy, or checkpoint inhibitors. These combinations are being tested in early-phase human trials, with the rationale that CpG activation reshapes the immune environment around tumors, making them more visible to the immune system while other therapies deliver the direct attack.

