What Is an Estimand in Clinical Trials, and Why Does It Matter?

An estimand is a precise description of the treatment effect that a clinical trial is trying to measure. It defines, before any data is collected, exactly what question the trial aims to answer about how a treatment works in a specific group of patients. The concept was formalized in an international regulatory guideline, ICH E9(R1), which was finalized in late 2019 and came into effect in 2020. It has since become an expected part of how clinical trials are designed and evaluated by agencies like the FDA and EMA.

The reason the term exists is surprisingly practical: clinical trials often produce results that different people interpret differently, because no one spelled out in advance what “treatment effect” actually meant. A patient might stop taking the drug, switch to a different medication, or need rescue therapy. What happens to their data? The estimand framework forces everyone involved to answer that question up front.

Estimand, Estimator, and Estimate

These three terms sound similar but refer to different things. The estimand is the target quantity: what the study aspires to measure. Think of it as the bullseye on the dartboard. An estimator is the statistical formula or algorithm used to calculate that quantity from trial data, like the difference in average outcomes between two groups. The estimate is the actual number you get when you apply that formula to the collected data.

A trial measuring blood sugar reduction, for instance, might define its estimand as the difference in average blood sugar levels between the drug group and the placebo group at 6 months. The estimator would be the specific statistical method used to calculate that difference. The estimate would be the final number: say, an average level 0.8 percentage points lower in the drug group, if blood sugar is measured as HbA1c.
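The distinction can be sketched in a few lines of Python. This is a minimal illustration, not a real analysis: the function name `difference_in_means` and the data values are made up for the example. The estimand is the idea described in the comment, the estimator is the function, and the estimate is the number the function returns.

```python
from statistics import mean

# Estimand (the target): the difference in mean HbA1c change at 6 months
# between the drug group and the placebo group.

def difference_in_means(treated, control):
    """Estimator: a rule that turns trial data into a single number."""
    return mean(treated) - mean(control)

# Hypothetical HbA1c changes from baseline at 6 months, in percentage points.
drug_changes = [-1.2, -0.9, -1.5, -1.0]
placebo_changes = [-0.3, -0.4, -0.2, -0.5]

# Estimate: the value the estimator produces for this particular data set.
estimate = difference_in_means(drug_changes, placebo_changes)
print(f"Estimate: {estimate:.2f} percentage points")
```

With these invented numbers the estimate comes out at about -0.80. A different data set would give a different estimate from the same estimator, while the estimand stays the same.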

The Five Attributes of an Estimand

The ICH E9 R1 guideline specifies five components that together define an estimand. Each one answers a different part of the question “what are we measuring?”

Treatment: Which treatment condition is being studied, and what is it being compared with? This could be a single drug versus placebo, a combination of therapies, or a whole treatment regimen.
Population: Which patients is the trial asking about? This could be the entire trial population, a subgroup defined by some baseline characteristic (like patients over 65), or a narrower group defined by whether a specific event occurred during the trial.
Variable (endpoint): What outcome is being measured for each patient? This might be blood pressure at 12 weeks, survival time, pain score, or any other measurable result.
Intercurrent events: What events might happen after treatment starts that could complicate the measurement, and how is each one handled? These are things like a patient discontinuing the drug due to side effects, needing rescue medication, or undergoing surgery. For each intercurrent event, the trial specifies a strategy that determines how that event affects the treatment effect being measured. Identifying these events in advance is a core purpose of the framework, and choosing the strategies is where most of the complexity lives.
Population-level summary: How will outcomes be compared between treatment groups? This could be a difference in means, a ratio of event rates, a hazard ratio, or another summary measure.
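One way to make the attributes concrete is to write an estimand down as a structured record. The sketch below is illustrative only: the class name `Estimand`, its field names, and the example values are assumptions for this example, not terminology or a format mandated by the guideline. It records the treatment comparison alongside the population, variable, and summary, and pairs each intercurrent event with the strategy chosen for it.

```python
from dataclasses import dataclass

# The five strategy names used in the guideline.
STRATEGIES = {
    "treatment policy",
    "hypothetical",
    "composite",
    "while on treatment",
    "principal stratum",
}

@dataclass
class Estimand:
    treatment: str            # treatment condition and its comparator
    population: str           # which patients the question is about
    variable: str             # outcome measured for each patient
    intercurrent_events: dict # maps each event to its handling strategy
    summary: str              # population-level summary measure

    def __post_init__(self):
        # Reject strategies that are not one of the five named above.
        for event, strategy in self.intercurrent_events.items():
            if strategy not in STRATEGIES:
                raise ValueError(f"Unknown strategy for {event!r}: {strategy!r}")

# Hypothetical example values.
primary = Estimand(
    treatment="new drug versus placebo",
    population="adults with type 2 diabetes",
    variable="HbA1c at 6 months",
    intercurrent_events={
        "rescue insulin": "treatment policy",
        "treatment discontinuation": "treatment policy",
    },
    summary="difference in means",
)
print(primary)
```

Writing the record out this way makes gaps obvious: an intercurrent event with no strategy attached is a question the protocol has not yet answered.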

What Intercurrent Events Are

Intercurrent events are one of the most important concepts in the framework because they are the reason estimands were formalized in the first place. These are events that occur after a patient starts treatment and that affect either how you interpret the outcome measurement or whether the measurement even exists.

Common examples include a patient stopping the study drug because of side effects, starting a different medication (rescue therapy), undergoing a procedure like surgery that changes the trajectory of the disease, or dying before the endpoint can be measured. In a diabetes trial, for instance, a patient who starts insulin as rescue medication will likely have different blood sugar levels than they would have without it. That creates an ambiguity: does the trial want to know how the drug performs when patients also have access to rescue insulin, or how it would perform if rescue insulin weren’t available? These are genuinely different clinical questions, and they lead to different trial results.

Five Strategies for Handling Intercurrent Events

The guideline outlines five strategies for dealing with intercurrent events. Each one reframes the clinical question in a different way.

Treatment Policy

This strategy says: measure the outcome regardless of what happened. If a patient stopped the drug, switched medications, or had surgery, their outcome still counts as recorded. This approach mirrors the traditional intention-to-treat analysis. It answers the question “what is the effect of assigning this treatment in a real-world setting where all these complications naturally occur?”

Hypothetical

This strategy imagines a scenario where the intercurrent event didn’t happen. For example, in a diabetes trial where rescue insulin is available for ethical reasons, a hypothetical strategy might ask: “What would blood sugar levels look like if rescue medication hadn’t been available?” This is useful when the intercurrent event can realistically be intervened on. It makes less sense for events that can’t be prevented, like discontinuation due to severe side effects, because imagining a world where patients tolerate intolerable side effects isn’t clinically meaningful. One nuance: different hypothetical scenarios can be defined for the same event. A trial might ask what would happen if rescue medication were unavailable for the first 6 months of a 36-month study, rather than for the entire duration.

Composite

This strategy folds the intercurrent event directly into the outcome definition. Instead of measuring blood sugar separately from whether the patient needed rescue medication, the endpoint becomes “the proportion of patients who reached a target blood sugar level without using rescue medication.” In a nasal polyps trial, for instance, the composite endpoint was defined as “improvement in polyp score of at least 1 point and completion of the treatment period without surgery.” The intercurrent event (surgery) becomes part of what counts as success or failure.
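A composite endpoint of this kind is straightforward to compute once it is defined. The following is a minimal sketch under stated assumptions: the `Patient` record, the target value of 7.0, and the data are all hypothetical. A patient counts as a responder only if they reached the target and did not use rescue medication.

```python
from dataclasses import dataclass

TARGET_HBA1C = 7.0  # hypothetical target; a real trial would define its own

@dataclass
class Patient:
    hba1c_6m: float    # HbA1c at 6 months
    used_rescue: bool  # whether rescue medication was used

def is_responder(patient):
    """Success requires reaching the target AND not using rescue medication."""
    return patient.hba1c_6m < TARGET_HBA1C and not patient.used_rescue

def responder_rate(patients):
    """Proportion of patients who meet the composite definition of success."""
    return sum(is_responder(p) for p in patients) / len(patients)

# Hypothetical drug-arm data.
drug_arm = [
    Patient(6.5, False),  # responder
    Patient(6.8, True),   # reached target, but only with rescue: not a responder
    Patient(7.4, False),  # missed target: not a responder
    Patient(6.9, False),  # responder
]
print(responder_rate(drug_arm))
```

The second patient shows the point of the strategy: a good blood sugar value that depended on rescue medication is counted as a failure rather than set aside.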

While on Treatment

This strategy only counts outcomes measured while the patient is still on their assigned treatment. Once a patient discontinues or switches, their subsequent data is excluded. This answers the question “what is the effect of the drug for as long as patients actually take it?”
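As a rough illustration, the sketch below picks the last measurement taken while a patient was still on assigned treatment. The function name `last_on_treatment_value`, the visit schedule, and the values are assumptions made for this example. Using the last on-treatment value is only one possible way to define the variable under this strategy.

```python
def last_on_treatment_value(visits, discontinued_at_week=None):
    """Return the latest measurement taken at or before discontinuation.

    visits: list of (week, value) pairs.
    discontinued_at_week: week the patient stopped assigned treatment,
        or None if they stayed on treatment throughout.
    """
    on_treatment = [
        (week, value)
        for week, value in visits
        if discontinued_at_week is None or week <= discontinued_at_week
    ]
    if not on_treatment:
        return None  # no measurement was taken while on treatment
    # Pick the visit with the largest week number.
    return max(on_treatment, key=lambda visit: visit[0])[1]

# Hypothetical HbA1c measurements at weeks 0, 12, and 24.
visits = [(0, 8.4), (12, 7.9), (24, 7.6)]

print(last_on_treatment_value(visits))                           # stayed on treatment
print(last_on_treatment_value(visits, discontinued_at_week=16))  # stopped at week 16
```

For the patient who stopped at week 16, the week 24 value is ignored and the week 12 value is used instead.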

Principal Stratum

This strategy focuses on a subgroup of patients defined by whether they would or wouldn’t experience the intercurrent event. For example, it might target only patients who would complete the full treatment course regardless of which group they were assigned to. This is conceptually powerful but harder to implement: each patient receives only one treatment, so you can’t directly observe what would have happened to them on the other one, and therefore can’t directly observe which stratum they belong to.
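A simulation makes the idea easier to see, because simulated patients can carry information that real patients never reveal. In the toy sketch below, every patient has a known outcome and completion status under both treatments. All names and values are invented for illustration. In a real trial only one of the two scenarios is observed per patient, so the stratum has to be inferred with statistical models and additional assumptions.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class SimulatedPatient:
    completes_on_drug: bool     # would complete treatment if given the drug
    completes_on_placebo: bool  # would complete treatment if given placebo
    outcome_on_drug: float      # HbA1c change if given the drug
    outcome_on_placebo: float   # HbA1c change if given placebo

# Toy data: both scenarios are known only because this is a simulation.
patients = [
    SimulatedPatient(True, True, -1.2, -0.3),
    SimulatedPatient(True, False, -1.0, -0.1),
    SimulatedPatient(False, True, -0.4, -0.4),
    SimulatedPatient(True, True, -1.0, -0.5),
]

# Principal stratum: patients who would complete treatment on either arm.
always_completers = [
    p for p in patients if p.completes_on_drug and p.completes_on_placebo
]

# Treatment effect within that stratum.
stratum_effect = mean(
    [p.outcome_on_drug - p.outcome_on_placebo for p in always_completers]
)
print(len(always_completers), round(stratum_effect, 2))
```

Only the first and last patients belong to the stratum here, and the effect is averaged over those two alone.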

Why the Framework Matters

Before the estimand framework, the same trial data could be analyzed in different ways that answered subtly different clinical questions, often without anyone acknowledging that the questions were different. A drug company might report results using one approach, a regulator might prefer another, and a clinician reading the published paper might assume a third. The numbers would differ, and no one could say which was “right” because no one had agreed on what they were trying to measure.

The framework fixes this by requiring trial designers to state their clinical question precisely, attribute by attribute, before the trial runs. Statisticians, clinicians, and regulators all work from the same definition. The statistical analysis plan then follows from the estimand, not the other way around. This prevents the common problem of choosing an analysis method first and only later realizing it doesn’t answer the question anyone actually cared about.

Since the guideline came into effect in July 2020, regulatory submissions to the EMA and FDA have been expected to define their estimands clearly. This has shifted how protocols are written, how statistical analysis plans are structured, and how trial results are interpreted during regulatory review.

A Practical Example

Consider a trial testing a new drug for type 2 diabetes, measuring blood sugar (HbA1c) at 6 months. Rescue insulin is available to any patient whose blood sugar rises dangerously. Using the five attributes, the trial team might define two different estimands for the same trial:

Estimand A uses a treatment-policy strategy: the population is all randomized patients, the variable is HbA1c at 6 months, rescue insulin use is acknowledged but the outcome is measured regardless, and the summary is the difference in mean HbA1c between groups. This tells a prescriber what to expect when they assign this drug in practice, knowing some patients will end up needing rescue insulin.

Estimand B uses a hypothetical strategy: same population, same variable, but now the question is what HbA1c would have been if rescue insulin hadn’t been available. This isolates the drug’s biological effect more cleanly but describes a scenario that wouldn’t happen in real clinical care.

Both are valid. They simply answer different questions, and the estimand framework ensures that everyone knows which question each analysis is answering.
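The gap between the two estimands can be illustrated with a toy simulation. Everything in the sketch below is an assumption made for illustration: the patient values, the fixed `RESCUE_EFFECT`, and the helper names. The simulation knows what each patient's HbA1c would have been without rescue insulin. A real trial does not, so Estimand B would have to be estimated with statistical modelling rather than read off the data.

```python
from dataclasses import dataclass
from statistics import mean

RESCUE_EFFECT = -1.0  # assumed HbA1c change caused by rescue insulin (toy value)

@dataclass
class SimulatedPatient:
    arm: str                     # "drug" or "placebo"
    hba1c_without_rescue: float  # known only because this is a simulation
    used_rescue: bool

    @property
    def hba1c_observed(self):
        """What the trial would actually record at 6 months."""
        adjustment = RESCUE_EFFECT if self.used_rescue else 0.0
        return self.hba1c_without_rescue + adjustment

def arm_mean(patients, arm, attribute):
    return mean([getattr(p, attribute) for p in patients if p.arm == arm])

def difference(patients, attribute):
    """Drug-minus-placebo difference in means for the chosen attribute."""
    return arm_mean(patients, "drug", attribute) - arm_mean(patients, "placebo", attribute)

patients = [
    SimulatedPatient("drug", 7.0, False),
    SimulatedPatient("drug", 7.2, False),
    SimulatedPatient("drug", 8.6, True),
    SimulatedPatient("drug", 7.2, False),
    SimulatedPatient("placebo", 8.0, False),
    SimulatedPatient("placebo", 9.0, True),
    SimulatedPatient("placebo", 9.2, True),
    SimulatedPatient("placebo", 7.8, False),
]

# Estimand A (treatment policy): use outcomes as observed, rescue or not.
estimand_a = difference(patients, "hba1c_observed")

# Estimand B (hypothetical): use outcomes as if rescue had not been available.
estimand_b = difference(patients, "hba1c_without_rescue")

print(round(estimand_a, 2), round(estimand_b, 2))
```

In this toy data set more placebo patients need rescue insulin, which pulls the observed group means closer together. The treatment-policy difference is therefore smaller in magnitude than the hypothetical one, even though both are computed from the same simulated trial.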