What Is Minimum Detectable Effect in Statistics?

The minimum detectable effect (MDE) is the smallest real effect that a study or experiment is designed to reliably pick up. It’s a planning tool: before you run a study, you calculate the MDE to make sure your test is sensitive enough to catch a meaningful difference if one exists. If your MDE is larger than the effect you care about, your study is essentially blind to the thing you’re trying to measure.

MDE comes up most often in two contexts: clinical trials testing whether a treatment works, and A/B tests in tech and marketing measuring whether a change improves a metric like conversion rate or revenue. In both cases, the logic is the same. You’re asking, “Given my sample size and how much noise is in my data, what’s the smallest improvement I’d be able to detect?”

How MDE Works in Practice

Think of MDE as a sensitivity threshold. A study with an MDE of 5% can reliably detect a true effect of 5% or larger, but it would likely miss a true effect of 3%. The word “reliably” here has a specific meaning: it refers to statistical power, which is the probability that your test will correctly flag a real effect as statistically significant.

The standard convention is to set power at 80%, meaning you want an 80% chance of detecting the effect if it truly exists. The other key setting is the significance level (alpha), typically 0.05, which caps the probability of a false positive at 5%. These two thresholds, while somewhat arbitrary, are used across nearly all fields as a shared convention.

With those parameters locked in, MDE becomes a function of two things: your sample size and how much variation exists in your data. More variation in your outcome (say, huge swings in how much individual customers spend) makes it harder to spot a signal, pushing MDE up. A larger sample size brings MDE down by giving you more data to average out that noise.
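That relationship can be sketched in a few lines. This is an illustrative helper (the name `mde` and the two-group setup are my assumptions), using the standard normal-approximation formula for comparing two group means at the conventional α = 0.05 and 80% power:

```python
from statistics import NormalDist

def mde(n_per_group, sd, alpha=0.05, power=0.80):
    """Smallest true difference in means detectable between two
    equal-sized groups, via the normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    se_diff = sd * (2 / n_per_group) ** 0.5        # SE of the difference in means
    return (z_alpha + z_power) * se_diff

# Noisier outcomes push MDE up; larger samples pull it down:
# mde(1000, sd=10) is about 1.25, while mde(1000, sd=20) is about 2.51.
```

Doubling the standard deviation doubles the MDE, while adding observations shrinks it, exactly the two levers described above.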

The Relationship Between MDE and Sample Size

MDE and sample size have an inverse relationship, but it’s not one-to-one. MDE shrinks in proportion to one over the square root of the sample size. That means doubling your sample doesn’t cut your MDE in half; it only reduces it by a factor of about 1.4. To halve your MDE, you need roughly four times as many participants or observations.
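Inverting the formula makes the quadrupling rule concrete. A minimal sketch, assuming a two-sided comparison of means at α = 0.05 and 80% power (the function name `n_per_group` is mine):

```python
from statistics import NormalDist

def n_per_group(target_mde, sd, alpha=0.05, power=0.80):
    """Per-group sample size needed to reach a target MDE
    (two-sided comparison of means, normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return 2 * (z * sd / target_mde) ** 2

# n grows with 1 / MDE^2, so halving the target MDE quadruples the sample:
# n_per_group(1.0, sd=10) is about 1,570; n_per_group(0.5, sd=10) is about 6,280.
```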

This is why MDE calculations matter so much during the planning phase. If you need to detect a 1% improvement in conversion rate but your current traffic only supports detecting a 5% change, you know before spending any time or money that the test won’t be sensitive enough. You can then decide to run the test longer, find more traffic, or accept that you’re only going to catch larger effects.

Researchers use the same logic. If a study’s MDE is too large relative to the effect a program is realistically expected to produce, they’ll increase the sample to reduce the MDE to a useful range. Without this step, a study could easily miss a real, meaningful effect simply because it wasn’t designed to be sensitive enough.

Absolute vs. Standardized MDE

MDE can be expressed in two ways. Absolute MDE uses the raw units of whatever you’re measuring: a 2-point increase in test scores, a 0.5% lift in click-through rate, a $3 difference in average order value. This is the most intuitive form and the one you’ll use when making business or clinical decisions.

Standardized MDE expresses the effect as a fraction of the standard deviation of your outcome. If the standard deviation of test scores is 10 points and your MDE is 2 points, the standardized MDE is 0.2. This is essentially the same concept as Cohen’s d, a common measure of effect size. Standardized MDE is useful when comparing sensitivity across studies or across different metrics that use different units, but it’s less practical for day-to-day decision-making.
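The conversion is just a division, but writing it out makes the link to Cohen’s d explicit (the helper name `standardized_mde` is illustrative):

```python
def standardized_mde(absolute_mde, sd):
    """Express an absolute MDE in standard-deviation units (the Cohen's d scale)."""
    return absolute_mde / sd

# The example above: a 2-point MDE on scores with a 10-point SD
# gives a standardized MDE of 0.2.
```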

Why Outcome Type Matters

The type of metric you’re measuring changes how MDE behaves. Continuous outcomes like revenue, time on page, or blood pressure have variance that depends on how spread out individual values are. Binary outcomes, things that are either yes or no (did the user convert? did the patient recover?), have variance tied directly to the baseline rate: for a rate p, the variance is p(1 − p).

Binary outcomes with very low or very high baseline rates are particularly tricky. When only 1% of users convert, for example, detecting a small relative change requires a much larger sample than you’d need for a metric whose baseline rate is closer to 50%. If your baseline conversion rate is 2% and you want to detect a 10% relative lift (to 2.2%), you’ll need far more data than if your baseline were 30% and you wanted the same 10% relative lift (to 33%), because the absolute gap you’re trying to resolve is far smaller.
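A quick calculation shows how stark the gap is. This sketch uses the standard normal-approximation sample-size formula for comparing two proportions at α = 0.05 and 80% power (the function name is my own):

```python
from statistics import NormalDist

def n_per_group_binary(p_base, p_new, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-proportion test
    (normal approximation; the variance of a rate p is p * (1 - p))."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return z ** 2 * variance / (p_base - p_new) ** 2

# A 10% relative lift at a 2% baseline (2% -> 2.2%) needs roughly
# 80,000 users per group; the same relative lift at a 30% baseline
# (30% -> 33%) needs fewer than 4,000.
```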

What Happens When MDE Is Too High

An MDE that’s too high means your study is underpowered. The consequences go beyond just “we might not get a significant result.” Underpowered studies are at elevated risk of missing important signals entirely. A non-significant result in an underpowered study doesn’t mean the treatment or change had no effect. It means the study wasn’t capable of seeing effects that small. But readers and decision-makers often interpret non-significant results as evidence that nothing happened, which can discourage further investigation into interventions that actually work.

Even worse, if an underpowered study is used as a screening step to decide which ideas deserve further testing, real improvements get filtered out. A genuinely effective treatment or product change gets shelved because the test that evaluated it was too blunt to detect the benefit. This is one of the most common and most costly mistakes in both clinical research and product experimentation.

MDE Before vs. After the Study

MDE is primarily a pre-study planning tool. You calculate it before collecting data to make sure your design is adequate. But it also has a role after the study is complete.

After a study, some researchers compute “ex-post power” using the observed effect size and standard error. This practice is widely discouraged because it’s circular: it uses the noisy estimate from the study to evaluate the study’s own ability to detect that estimate. A better approach is to report the ex-post MDE, which uses the realized sample size and estimated standard error but doesn’t depend on the treatment effect estimate itself. This tells readers, “Given the data we actually collected, here’s the smallest effect we could have reliably detected,” which is far more informative than a post-hoc power calculation.
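The ex-post MDE is simple to compute because, unlike post-hoc power, it never touches the effect estimate. A sketch under the same α = 0.05 / 80% power conventions (the name `ex_post_mde` is illustrative):

```python
from statistics import NormalDist

def ex_post_mde(se, alpha=0.05, power=0.80):
    """Ex-post MDE from the realized standard error of the effect estimate.
    Note it depends only on the SE, never on the point estimate itself."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z * se

# If the study's estimate came with a standard error of 1.5 points,
# the smallest effect it could have reliably detected is about 4.2 points.
```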

The distinction matters because a study might return a non-significant result with a point estimate that looks promising. Reporting the ex-post MDE lets readers judge whether the study was sensitive enough to detect a clinically or practically meaningful effect, or whether it simply lacked the power to do so.

Setting the Right MDE for Your Context

Choosing an MDE isn’t purely a statistical exercise. It’s a judgment call about what effect size would be meaningful enough to act on. In a clinical trial, that might mean the smallest improvement in symptoms that patients would actually notice. In an A/B test, it might be the smallest revenue lift that justifies the engineering cost of shipping a feature.

Once you’ve decided on a meaningful effect size, you work backward through the formula to figure out the sample size you need. If the required sample is larger than what’s feasible, you have three options: accept a higher MDE and acknowledge you’ll only catch bigger effects, reduce noise in your data through better measurement or stratification, or find a way to increase your sample. There’s no way around the tradeoff. Sensitivity costs data, and pretending otherwise leads to underpowered studies that waste resources without producing actionable answers.
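The backward-planning step can be captured in one small routine: check whether the target MDE is reachable with the sample you can realistically get, and if not, report the MDE that sample actually supports. A sketch assuming a two-sided comparison of means at α = 0.05 and 80% power (the function name `plan` and its return shape are my assumptions):

```python
from statistics import NormalDist

def plan(target_mde, sd, max_n_per_group, alpha=0.05, power=0.80):
    """Work backward from a meaningful effect size to a required sample;
    if that sample is out of reach, report the MDE you can actually get."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    needed = 2 * (z * sd / target_mde) ** 2
    if needed <= max_n_per_group:
        return {"feasible": True, "n_per_group": needed}
    achievable = z * sd * (2 / max_n_per_group) ** 0.5
    return {"feasible": False, "achievable_mde": achievable}
```

If the plan comes back infeasible, the options are exactly the three above: accept the larger achievable MDE, reduce the outcome’s standard deviation through better measurement, or raise the sample ceiling.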