What Is a Cohort Study? Definition and Examples

A cohort study is a type of observational research that follows a group of people over time to see whether a specific exposure or characteristic leads to a particular health outcome. Researchers start by identifying people who share a common trait but differ in one key way, such as whether they smoke, then track both groups to compare how often a disease or outcome develops in each. It’s one of the most widely used designs in epidemiology and the backbone of many of the health recommendations you encounter daily.

How a Cohort Study Works

The basic structure is straightforward. Researchers recruit a group of people (the “cohort”), divide them based on whether they have a certain exposure or risk factor, and then follow both groups forward in time. The exposed group and the unexposed comparison group are monitored using medical records, interviews, or clinical exams. Outcomes need to be defined in advance and must be specific and measurable.

The critical feature is direction: cohort studies move from exposure to outcome. Researchers start with people who don’t yet have the disease in question, then watch to see who develops it. This lets them calculate how much more likely the exposed group is to develop the outcome compared to the unexposed group. That calculation is called relative risk: the incidence of disease in the exposed group divided by the incidence in the comparison group. A relative risk of 2.0, for example, means the exposed group is twice as likely to develop the condition.
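The relative-risk calculation described above can be sketched in a few lines. The counts here are hypothetical, chosen so the ratio works out to exactly 2.0:

```python
# Hypothetical cohort numbers for illustration: smokers and non-smokers
# followed for new cases of coronary heart disease (CHD).
exposed_cases, exposed_total = 60, 1000       # smokers who developed CHD
unexposed_cases, unexposed_total = 30, 1000   # non-smokers who developed CHD

# Incidence = new cases divided by the number of people followed.
incidence_exposed = exposed_cases / exposed_total        # 0.06
incidence_unexposed = unexposed_cases / unexposed_total  # 0.03

# Relative risk = incidence in the exposed group / incidence in the unexposed group.
relative_risk = incidence_exposed / incidence_unexposed
print(f"Relative risk: {relative_risk:.1f}")  # prints "Relative risk: 2.0"
```

A relative risk of 2.0 here reads exactly as in the text: the exposed group developed the condition twice as often as the comparison group.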

Prospective vs. Retrospective Designs

Cohort studies come in two main flavors, and the difference is timing. A prospective cohort study recruits participants in the present and follows them into the future. Researchers collect data in real time as events unfold, which gives them more control over what gets measured and how.

A retrospective cohort study (sometimes called a historic cohort study) works from existing records. Researchers identify a group defined by a past exposure and use medical records or databases to trace what happened to them afterward. The National Cancer Institute describes this as comparing groups “who are alike in many ways but differ by a certain characteristic” by looking at records for a particular outcome. Both designs follow the same logical structure (exposure to outcome), but the retrospective version is faster and cheaper because the data already exist. The trade-off is less control over data quality.

The Framingham Heart Study: A Classic Example

The most famous cohort study in history started in 1948 in Framingham, Massachusetts, when researchers enrolled 5,209 participants to investigate the causes of heart disease. That study is still running. It has since expanded to include the children and grandchildren of the original volunteers, producing decades of continuous health data.

Nearly every major cardiovascular risk factor you’ve heard of was identified or confirmed through Framingham. The study linked cigarette smoking to coronary heart disease, established that high blood pressure and high cholesterol raise heart disease risk, connected obesity and physical inactivity to cardiac events, and identified diabetes as a risk factor. The very term “risk factor” was popularized through Framingham’s findings. It’s a powerful illustration of what cohort studies do best: reveal long-term patterns between exposures and outcomes across large populations.

What Cohort Studies Do Well

Cohort studies have a few key advantages over other observational designs. Because they track people from exposure to outcome, they can establish a clear timeline, showing that the exposure came before the disease. This doesn’t prove causation on its own, but it’s a necessary ingredient. They also allow researchers to study multiple outcomes from a single exposure. A study tracking a group of factory workers exposed to a chemical, for instance, can simultaneously look at lung disease, cancer, skin conditions, and mortality.

They’re particularly efficient for studying rare exposures. If only a small fraction of the population is exposed to something (a specific occupational hazard, an unusual medication), a cohort study can deliberately recruit people with that exposure and follow them, rather than waiting for cases to appear in a general population.

Where Cohort Studies Fall Short

The biggest limitation is time. Prospective cohort studies can take years or decades to produce results, especially for diseases that develop slowly. That makes them expensive and logistically demanding. They also struggle with rare outcomes. If the disease you’re studying affects one in 10,000 people, you’d need an enormous cohort to observe enough cases to draw meaningful conclusions.

Participant dropout, known as attrition, is another persistent problem. Research suggests that losing up to 20% of participants over the study period may be acceptable without introducing serious bias. But when attrition falls between 20% and 40%, results can become significantly skewed, particularly when the people who drop out differ in meaningful ways from those who stay. Recommended follow-up rates for cohort studies range from 50% to 80%, though real-world rates vary widely, and some researchers argue that rigid cutoffs for “acceptable” follow-up can unnecessarily limit valuable research.
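The danger of differential attrition can be made concrete with a small numerical sketch. All counts here are hypothetical; the point is that when dropout is concentrated among exposed participants who would have become cases, the observed relative risk drifts away from the true one:

```python
# Hypothetical cohort: 1,000 exposed and 1,000 unexposed participants,
# with true incidences of 10% and 5% (so the true relative risk is 2.0).
exposed_cases, exposed_n = 100, 1000
unexposed_cases, unexposed_n = 50, 1000
true_rr = (exposed_cases / exposed_n) / (unexposed_cases / unexposed_n)

# Now suppose 30 exposed participants who would have become cases drop out
# before diagnosis (e.g., the sickest stop attending follow-up), while the
# unexposed group is unaffected.
lost_cases = 30
observed_rr = ((exposed_cases - lost_cases) / (exposed_n - lost_cases)) / (
    unexposed_cases / unexposed_n
)

print(f"True RR: {true_rr:.2f}, observed RR: {observed_rr:.2f}")
# The observed estimate falls well below the true value of 2.0,
# even though only 3% of the cohort was lost.
```

This is why the *pattern* of dropout matters as much as the overall percentage: random attrition mostly costs statistical power, but attrition linked to both exposure and outcome biases the estimate itself.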

Cohort Studies vs. Case-Control Studies

These two designs are often confused, but they work in opposite directions. A cohort study starts with groups defined by their exposure and watches for outcomes. A case-control study starts with people who already have the outcome (cases) and a comparison group who don’t (controls), then looks backward to see which exposures were more common among the cases.

Case-control studies are the better choice for rare diseases because they start by selecting people who already have the condition. Cohort studies are the better choice for rare exposures and for studying multiple outcomes from a single risk factor. Case-control studies are also faster and less expensive, but they can’t directly calculate incidence rates or relative risk the way cohort studies can. Each design answers a slightly different question, and the choice depends on what’s being studied and the resources available.

Where Cohort Studies Sit in the Evidence Hierarchy

In the ranking of research evidence, cohort studies sit below randomized controlled trials but above case-control studies and case reports. They can’t match the rigor of a randomized trial because participants aren’t randomly assigned to their exposures. Someone who smokes and someone who doesn’t may differ in dozens of other ways (diet, income, access to healthcare), and those differences can muddy the results. Researchers use statistical methods to adjust for these confounding factors, but they can never fully eliminate them.
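One standard way to adjust for a confounder is stratification: split the cohort into subgroups within which the confounder is roughly constant, then pool the subgroup estimates. The sketch below uses the Mantel-Haenszel relative-risk estimator with hypothetical counts, constructed so the exposed group skews younger while the outcome is more common with age:

```python
# A minimal sketch of adjusting for one confounder (here, age) by
# stratification, using the Mantel-Haenszel relative-risk estimator.
# All counts are hypothetical. Each stratum is a 2x2 table:
# (exposed cases, exposed total, unexposed cases, unexposed total).
strata = [
    (10, 400, 5, 400),   # younger participants: within-stratum RR = 2.0
    (40, 100, 40, 200),  # older participants:   within-stratum RR = 2.0
]

# Crude (unadjusted) relative risk pools everyone together.
a = sum(s[0] for s in strata)
n1 = sum(s[1] for s in strata)
c = sum(s[2] for s in strata)
n0 = sum(s[3] for s in strata)
crude_rr = (a / n1) / (c / n0)

# The Mantel-Haenszel estimator combines the stratum-specific tables,
# weighting each stratum by its size.
num = sum(ai * n0i / (n1i + n0i) for ai, n1i, ci, n0i in strata)
den = sum(ci * n1i / (n1i + n0i) for ai, n1i, ci, n0i in strata)
adjusted_rr = num / den

print(f"Crude RR: {crude_rr:.2f}, age-adjusted RR: {adjusted_rr:.2f}")
# Because the exposed group is younger and the outcome rises with age,
# the crude estimate (about 1.33) understates the true RR of 2.0.
```

This illustrates the text’s caveat as well: stratification only adjusts for confounders the researchers measured. A variable nobody recorded stays confounded no matter how the data are sliced.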

Still, cohort studies are often the strongest evidence available. You can’t ethically randomize people to smoke or to be exposed to toxic chemicals, so observational designs like cohort studies fill the gap. When multiple cohort studies point in the same direction, the cumulative evidence can be compelling enough to shape public health policy.

How Cohort Studies Are Evaluated

Not all cohort studies are created equal, and the medical community uses a standardized checklist called STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) to evaluate them. STROBE was developed by an international group of epidemiologists, statisticians, and journal editors to address a common problem: incomplete or unclear reporting that makes it hard for readers to assess a study’s strengths and weaknesses. The checklist specifies what a published cohort study should include, covering everything from how participants were selected to how results were analyzed. Major medical journals endorse it, and checking whether a study follows STROBE guidelines is a quick way to gauge its transparency.