What Is Cohesin? A Ring-Shaped Protein Complex

Cohesin is a protein complex that forms a ring-shaped structure around DNA. Its best-known job is holding newly copied chromosomes together until a cell is ready to divide, but it also plays a critical role in organizing the three-dimensional structure of the genome and controlling which genes get turned on or off. Found in eukaryotes from yeast to humans, cohesin is essential for life.

The Ring Structure

Cohesin is built from four core protein subunits that assemble into a loop. Two of these, called SMC1 and SMC3, form a V-shaped pair. A third protein, RAD21, bridges the open ends of that V to create a triangular ring. The fourth subunit (either SA1 or SA2 in humans) attaches to RAD21 and reinforces the structure. The result is a large protein ring with an interior wide enough to physically encircle strands of DNA.

This ring shape is central to how cohesin works. Rather than chemically bonding to DNA, cohesin appears to trap DNA inside itself, like a carabiner clipped around a rope. That topological embrace lets it hold two DNA molecules together or slide along a single one, depending on what the cell needs at a given moment.

Holding Sister Chromatids Together

Cohesin was first characterized for its role in cell division. When a cell copies its DNA during the S phase of the cell cycle, it produces two identical sister chromatids that need to stay paired until the cell is ready to split. Cohesin accomplishes this by encircling both chromatids inside its ring, keeping them physically tethered.

This pairing is not permanent. When the cell reaches the transition from metaphase to anaphase (the moment chromosomes are pulled to opposite poles), an enzyme called separase cuts the RAD21 subunit of the ring. Separase is a specialized protein-cutting enzyme that recognizes a specific short sequence on RAD21 and slices it open. Once the ring is broken, the sister chromatids are free to separate, and cell division proceeds. The timing of this cleavage is tightly controlled: premature cutting leads to chromosome mis-segregation and potentially cancerous or nonviable daughter cells.

Loading and Removal From DNA

Cohesin doesn’t just appear on chromosomes. A dedicated loading complex, made up of the proteins NIPBL and MAU2, places cohesin onto DNA starting in early G1, the phase before DNA replication. Once loaded, cohesin can slide along the chromosome, but it doesn’t stay put indefinitely. A separate pair of proteins, PDS5 and WAPL, acts as a removal crew, releasing cohesin from DNA throughout interphase and stripping most of it off chromosome arms when the cell enters mitosis.

This loading-and-removal cycle means cohesin is constantly turning over on chromosomes. The balance between loading by NIPBL and removal by WAPL determines how much cohesin is present on any given stretch of DNA at any given time, which in turn affects both chromosome cohesion and genome organization.
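
One rough way to picture that balance is a toy two-state model in which an empty stretch of DNA gains cohesin at one rate and an occupied stretch loses it at another. The Python sketch below is only an illustration under that assumption: the function name, the parameters k_load and k_release, and the rate values are hypothetical and are not measurements of NIPBL or WAPL activity.

```python
def steady_state_occupancy(k_load: float, k_release: float) -> float:
    """Fraction of sites occupied at steady state in a toy two-state model.

    Assumes (hypothetically) that an empty site is loaded at rate k_load and
    an occupied site is cleared at rate k_release, so occupancy settles at
    k_load / (k_load + k_release).
    """
    if k_load < 0 or k_release < 0:
        raise ValueError("rates must be non-negative")
    total = k_load + k_release
    if total == 0:
        raise ValueError("at least one rate must be positive")
    return k_load / total


# Arbitrary illustrative rates, not measured values.
baseline = steady_state_occupancy(k_load=1.0, k_release=3.0)
less_release = steady_state_occupancy(k_load=1.0, k_release=1.0)
print(baseline, less_release)  # 0.25 0.5
```

In this simplified picture, slowing removal while keeping loading the same raises the fraction of DNA carrying cohesin, and speeding removal lowers it.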

Organizing the 3D Genome

Beyond holding sister chromatids together, cohesin is one of the primary architects of how DNA is folded inside the nucleus. It does this through a process called loop extrusion: cohesin lands on a stretch of DNA and actively reels it through its ring, creating an expanding loop of DNA. This brings distant regions of a chromosome physically close to each other.

A DNA-binding protein called CTCF acts as a traffic signal for this process. CTCF sits at specific sites along the genome and stops cohesin from extruding further, creating defined loops with consistent boundaries. These loops form structures known as topologically associating domains, or TADs, which are fundamental units of genome organization. Genes within the same TAD tend to be regulated together, while genes in neighboring TADs are largely insulated from each other.
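
The way CTCF sites produce loops with consistent boundaries can be sketched with a very simple simulation. The Python example below is a toy model, not a description of the real mechanism: it treats the chromosome as a row of numbered bins, lets both ends of the loop grow by one bin per step, and treats every CTCF site as a hard stop. It ignores CTCF orientation, cohesin removal, and collisions between cohesin rings, and the function name, parameters, and positions are all hypothetical.

```python
from typing import Iterable, Tuple


def extrude_loop(
    load_site: int,
    ctcf_sites: Iterable[int],
    genome_length: int,
    max_steps: int = 10_000,
) -> Tuple[int, int]:
    """Toy two-sided loop extrusion on a 1D lattice of bins.

    Returns the (left, right) bin positions of the loop anchors once both
    have stalled or max_steps has elapsed.
    """
    if not 0 <= load_site < genome_length:
        raise ValueError("load_site must lie within the genome")
    barriers = set(ctcf_sites)
    left = right = load_site
    for _ in range(max_steps):
        moved = False
        # The left anchor advances unless it sits on a CTCF site or at bin 0.
        if left > 0 and left not in barriers:
            left -= 1
            moved = True
        # The right anchor advances unless it sits on a CTCF site or the last bin.
        if right < genome_length - 1 and right not in barriers:
            right += 1
            moved = True
        if not moved:
            break  # both anchors have stalled
    return left, right


# Hypothetical CTCF sites at bins 20 and 80 on a 100-bin chromosome.
print(extrude_loop(50, [20, 80], 100))                # (20, 80)
print(extrude_loop(30, [20, 80], 100))                # (20, 80)
print(extrude_loop(50, [20, 80], 100, max_steps=10))  # (40, 60)
```

Wherever cohesin lands between the two sites, the finished loop ends up with the same anchors, which is the sense in which CTCF gives loops defined boundaries in this sketch. The last call shows a loop that is still growing because it has not yet reached either site.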

Recent research published in Nature has revealed that CTCF does more than simply block cohesin like a roadblock. It can change the direction of loop extrusion and even cause loops to shrink. The permeability of TAD boundaries also depends on the physical tension in the DNA strand, making the system more dynamic and responsive than scientists initially assumed.

Controlling Gene Expression

The loops cohesin creates have direct consequences for which genes a cell turns on. Genes are activated when regulatory regions called enhancers come into physical contact with gene promoters. Cohesin-mediated loops can bring an enhancer and its target promoter together across large genomic distances, essentially connecting a switch to the light it controls.

There are two major classes of these loops. The first type has CTCF anchored at both ends and primarily functions as an insulating boundary. The second type connects active enhancers directly to promoters and often lacks CTCF entirely. At these CTCF-free sites, common transcription factors and chromatin-modifying proteins help stabilize cohesin in place, keeping the enhancer-promoter connection intact.

Interestingly, experiments that rapidly remove cohesin or CTCF from cells eliminate DNA loops across the genome but have surprisingly modest effects on gene activity in the short term. This suggests that while cohesin-mediated loops are important for setting up and maintaining proper gene regulation, cells have some buffering capacity that can sustain transcription temporarily even when the loops disappear.

Specialized Roles in Meiosis

During meiosis, the type of cell division that produces eggs and sperm, cells use specialized versions of cohesin subunits. In mammals, the standard RAD21 is largely replaced by a protein called REC8, SMC1A is swapped for SMC1B, and SA1/SA2 are replaced by STAG3. These meiotic versions handle the unique demands of meiosis, including pairing homologous chromosomes (not just sister chromatids) and managing the two rounds of division needed to halve the chromosome number.

SMC1B, originally thought to function only in meiosis, has since been found in non-meiotic cells as well, where it participates in a mitotic cohesin complex. This overlap hints that the boundary between “meiotic” and “mitotic” cohesin may be less rigid than initially believed.

What Happens When Cohesin Goes Wrong

Because cohesin is involved in so many fundamental cellular processes, mutations in its subunits or regulatory proteins cause serious developmental disorders collectively known as cohesinopathies. The best-characterized of these is Cornelia de Lange syndrome, which results from mutations in any of at least seven genes that affect the cohesin complex. People with this condition typically have distinctive facial features, growth delays, and limb malformations, though the severity varies widely.

Cornelia de Lange syndrome is often described as a “transcriptomopathy,” meaning the symptoms arise not from failures in chromosome segregation but from widespread disruptions in gene expression during development. This underscores how central cohesin’s genome-organizing role is to normal human biology. Cohesin mutations have also been linked to certain cancers, where faulty chromosome segregation or altered gene regulation can drive tumor growth.