Which Amino Acids Can Be Glycosylated: N, O, C & S Links

At least six amino acids can be glycosylated: asparagine, serine, threonine, tryptophan, cysteine, and tyrosine. A seventh target, hydroxylysine (a modified form of lysine), is glycosylated in nearly all collagens across the animal kingdom. These residues are modified through four kinds of chemical linkage (to nitrogen, oxygen, carbon, or sulfur), and each modification serves distinct biological purposes, so the type of glycosylation matters as much as the site itself.

Asparagine: The N-Linked Glycosylation Site

Asparagine is the target for N-linked glycosylation, the best-characterized and most abundant form of protein glycosylation in eukaryotes. The sugar chain attaches to the nitrogen of asparagine’s side-chain amide, but only when that asparagine sits within a specific three-amino-acid sequence called a sequon: Asn-X-Ser or Asn-X-Thr, where X can be almost any amino acid except proline.
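
The sequon rule translates directly into a pattern search. The sketch below is a minimal illustration, assuming one-letter amino acid codes; the function name find_sequons and the example sequence are made up for this purpose. A hit marks only a candidate site, not a confirmed glycan.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps overlapping sequons from hiding one another.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(seq):
    """Return (1-based Asn position, three-residue window) for each candidate."""
    seq = seq.upper()
    return [(m.start() + 1, seq[m.start():m.start() + 3])
            for m in SEQUON.finditer(seq)]

# Hypothetical sequence: N-P-S is skipped because proline sits in the X position.
print(find_sequons("MNGTANPSNNSS"))
```

The lookahead matters when two asparagines sit side by side, as in the NNSS stretch at the end of the example, where both could in principle be part of a sequon.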

Not all sequons are created equal. When serine is the third residue, certain amino acids in the middle position sharply reduce glycosylation efficiency. Sequons with tryptophan, aspartate, glutamate, or leucine in the X position are poor targets when paired with serine. Interestingly, when threonine occupies the third position instead, the same sequons are glycosylated efficiently. This means that having the right sequence motif in a protein doesn’t guarantee glycosylation will actually happen.
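
One way to picture this rule is as a simple lookup on the middle residue. The sketch below is illustrative only: the labels "poor" and "likely" and the function name sequon_outlook are assumptions for this example, and real efficiency depends on far more than three residues.

```python
# Middle residues described above as poor when the third position is serine.
POOR_X_WITH_SER = {"W", "D", "E", "L"}

def sequon_outlook(tripeptide):
    """Classify a three-residue window as 'not a sequon', 'poor', or 'likely'."""
    t = tripeptide.upper()
    if len(t) != 3 or t[0] != "N" or t[1] == "P" or t[2] not in ("S", "T"):
        return "not a sequon"
    if t[2] == "S" and t[1] in POOR_X_WITH_SER:
        return "poor"
    return "likely"

for window in ("NWS", "NWT", "NGS", "NPT"):
    print(window, sequon_outlook(window))
```

Swapping serine for threonine in the third position moves the tryptophan-containing window from "poor" to "likely", which mirrors the observation in the paragraph above.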

N-linked glycosylation begins while a protein is still being built by the ribosome and threaded into the endoplasmic reticulum. A large, pre-assembled sugar tree is transferred onto the asparagine all at once, then trimmed and remodeled as the protein matures. These glycans can become highly branched, complex structures that help stabilize protein shape, protect against degradation, and regulate interactions between cells. The quality control machinery of the cell also reads these sugar structures: chaperone proteins recognize specific sugar patterns on newly made glycoproteins and help refold any that haven’t folded correctly.

Serine and Threonine: The O-Linked Glycosylation Sites

Serine and threonine are the two primary targets for O-linked glycosylation, where sugars attach to the hydroxyl group on each amino acid’s side chain. Unlike N-linked glycosylation, there is no strict consensus sequence. Instead, the local protein environment and the enzymes present in a given cell type determine which serines and threonines get modified.

The most common version is mucin-type O-glycosylation, where N-acetylgalactosamine (GalNAc) is the first sugar attached. This initial sugar can then be extended with galactose, N-acetylglucosamine, fucose, or sialic acid, building out increasingly complex chains. The simplest possible mucin O-glycan is a single GalNAc on a serine or threonine, known as the Tn antigen. The most common extension is a core 1 structure, where galactose is added to that initial GalNAc.

Another important form is O-GlcNAcylation, where a single N-acetylglucosamine is added to serine or threonine residues on proteins inside the cell (as opposed to mucin glycosylation, which happens on secreted or membrane proteins). O-GlcNAcylation acts more like a regulatory switch, cycling on and off in response to nutrient levels and cellular signals.

Tryptophan: The C-Linked Glycosylation Site

Tryptophan is unique among glycosylated amino acids because it forms a direct carbon-to-carbon bond with mannose, a type of linkage called C-mannosylation. This is chemically distinct from the nitrogen-linked or oxygen-linked bonds seen at other sites. The mannose attaches to the C2 carbon of tryptophan’s indole ring.

C-mannosylation typically occurs at a WXXW sequence motif, where W is tryptophan and X is any amino acid. The first tryptophan in this motif, and sometimes the second, gets modified. A conserved PXP sequence downstream of the site also plays a role. Unlike the bulky sugar trees found in N-linked and O-linked glycosylation, C-mannosylation adds just a single mannose residue. This small sugar fits closely into the protein’s folds, where it reduces local hydrophobicity and enables new interactions with nearby charged amino acids, particularly arginine. The net effect is to physically alter the protein’s shape and increase its stability.
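
The WXXW motif can be scanned for in the same way as a sequon. This is a rough sketch, assuming one-letter codes; the function name and the example sequence are invented, and the downstream PXP context mentioned above is deliberately not checked.

```python
import re

# A tryptophan followed by any two residues and a second tryptophan.
# The lookahead lets back-to-back motifs such as WXXWXXW report both sites.
WXXW = re.compile(r"W(?=..W)")

def find_c_mannosylation_candidates(seq):
    """Return 1-based positions of the first Trp in each WXXW motif."""
    seq = seq.upper()
    return [m.start() + 1 for m in WXXW.finditer(seq)]

# Hypothetical sequence containing two overlapping motifs.
print(find_c_mannosylation_candidates("AWSPWSEWS"))
```

Only the first tryptophan of each motif is reported here, since that is the residue most consistently modified.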

Cysteine: The S-Linked Glycosylation Site

Cysteine can be glycosylated through its sulfur-containing side chain, a modification called S-glycosylation. This is the rarest and most recently appreciated form. The first reports appeared in 1971, when sugar-linked cysteine peptides were isolated from human urine and red blood cells. Definitive proof came in 1998, when researchers confirmed that a specific cysteine in the human inter-alpha-trypsin inhibitor complex carries a sugar modification.

S-glycosylation has since been found in bacterial antimicrobial peptides called glycocins, where glucose or N-acetylglucosamine is attached to cysteine residues. Sublancin, Glycocin F, and Thurandacin A and B are among the known examples. More surprisingly, experiments on mouse and human cell proteins in 2016 and 2017 revealed that cysteine S-glycosylation by N-acetylglucosamine occurs alongside the well-known O-GlcNAcylation of serine and threonine, suggesting this modification may be more widespread than previously thought.

Tyrosine: Glycogenin’s Self-Modification

Tyrosine glycosylation is best known from a single, highly specific context: the protein glycogenin, which kick-starts glycogen synthesis. Glycogenin attaches glucose to its own tyrosine 195 through a process called auto-glucosylation, building a short primer chain of 8 to 12 glucose units connected by alpha-1,4 linkages. Glycogen synthase then takes over to extend this chain into the large glycogen granules your cells use for energy storage. When researchers mutated tyrosine 195 to phenylalanine (a structurally similar amino acid that lacks the hydroxyl group), glycogenin could no longer glucosylate itself, confirming that this specific tyrosine is essential for the reaction.

Hydroxylysine: A Modified Amino Acid in Collagen

Hydroxylysine isn’t one of the standard 20 amino acids, but it deserves mention because it’s one of the most biologically important glycosylation targets. Lysine residues in collagen are first hydroxylated (a hydroxyl group is added to the side chain), and the resulting hydroxylysine is then glycosylated. Nearly all collagens in the animal kingdom carry glycosylated hydroxylysine residues.

The modification happens in two steps. First, galactose is attached to hydroxylysine. Then glucose can be added on top of the galactose, creating a two-sugar unit. This glycosylation must occur before the collagen folds into its characteristic triple helix, because the modification is physically blocked once that structure forms. Computational studies show that hydroxylysine glycosylation doesn’t disrupt the triple helix, but it does influence the local backbone shape and how water molecules interact with the collagen surface. The sugars orient themselves with their water-attracting faces pointed outward and their hydrophobic faces tucked against the protein. This modification also affects how collagen molecules assemble into the larger fibrils that give connective tissues their strength.

Glycosylation vs. Glycation

It’s worth distinguishing glycosylation from glycation, since both involve sugars attaching to amino acids but through completely different mechanisms. Glycosylation is enzyme-controlled and targets the amino acids described above. Glycation is a non-enzymatic, spontaneous reaction in which reducing sugars (mainly glucose) react with lysine and arginine side chains on long-lived proteins. Glycation is generally damaging: it produces compounds called advanced glycation end products that accumulate with age and in conditions like diabetes, contributing to tissue stiffness and inflammation. Among the standard amino acids, the targets don’t overlap. Glycosylation modifies asparagine, serine, threonine, tryptophan, cysteine, and tyrosine, while glycation targets lysine and arginine. Hydroxylysine is the near exception, since it derives from lysine, but it is glycosylated enzymatically and only after hydroxylation.
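
As a compact recap, the residues discussed in this article can be collected into a small lookup table. This is a sketch rather than a reference table: the three-letter keys (including "Hyl" for hydroxylysine) and the helper names are choices made for this example, and the O-linked labels for tyrosine and hydroxylysine are added here on the basis that the sugar attaches at a side-chain hydroxyl.

```python
# Residue -> linkage class. "Hyl" is hydroxylysine, the modified lysine in collagen.
# The Tyr and Hyl labels are additions for this sketch (see the note above).
GLYCOSYLATION_LINKAGE = {
    "Asn": "N-linked",
    "Ser": "O-linked",
    "Thr": "O-linked",
    "Trp": "C-linked",
    "Cys": "S-linked",
    "Tyr": "O-linked",
    "Hyl": "O-linked",
}

# Residues named above as the main targets of non-enzymatic glycation.
GLYCATION_TARGETS = {"Lys", "Arg"}

def linkage_types():
    """Distinct linkage classes across all listed residues."""
    return sorted(set(GLYCOSYLATION_LINKAGE.values()))

def shared_targets():
    """Residues that appear in both the glycosylation and glycation lists."""
    return set(GLYCOSYLATION_LINKAGE) & GLYCATION_TARGETS

print(linkage_types())
print(shared_targets())
```

The empty intersection reflects the point made above: hydroxylysine is tracked as its own residue, so it does not collide with unmodified lysine.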