How Often Is Science Wrong? What the Data Shows

Science gets things wrong more often than most people assume. Depending on the field, somewhere between a third and two-thirds of published findings fail to hold up when other researchers try to reproduce them. That doesn’t mean science as a whole is broken, but it does mean that any single study, even one published in a top journal, has a real chance of being incorrect.

The Replication Crisis in Numbers

The most striking evidence comes from a landmark 2015 project that attempted to reproduce 100 psychology studies published in respected journals. Of those original studies, 97% had reported statistically significant results. When independent teams repeated the experiments using the same methods and materials, only 36% produced significant results again. Even by the most generous measure, combining data from the originals and the replications, just 68% held up. By the strictest standard, where independent evaluators judged whether the replication genuinely matched the original, only 39% passed.
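A gap this large between original results and replications is exactly what you’d expect when underpowered studies are filtered through publication bias. The sketch below is a toy Monte Carlo simulation, not a model of the 2015 project itself; the prior probability (20% of tested hypotheses are real) and statistical power (50%) are illustrative assumptions chosen to show how a replication rate in the mid-30s can emerge.

```python
import random

random.seed(0)

PRIOR_TRUE = 0.20   # assumption: fraction of tested hypotheses that are real
POWER = 0.50        # assumption: chance a study detects a real effect
ALPHA = 0.05        # conventional false-positive rate
N = 100_000         # simulated studies

published_true = 0
published_false = 0
replicated = 0

for _ in range(N):
    is_true = random.random() < PRIOR_TRUE
    # A study comes back significant with probability POWER if the effect
    # is real, or ALPHA if it isn't.
    significant = random.random() < (POWER if is_true else ALPHA)
    if not significant:
        continue  # publication bias: null results go in the file drawer
    if is_true:
        published_true += 1
    else:
        published_false += 1
    # An independent replication faces the same probabilities.
    if random.random() < (POWER if is_true else ALPHA):
        replicated += 1

published = published_true + published_false
print(f"published findings that are actually true: {published_true / published:.0%}")
print(f"replications that come back significant:   {replicated / published:.0%}")
```

Under these assumptions, only about a third of published findings replicate, even though most of the published results reflect real effects. Low power punishes replications of true findings just as it inflated the original literature.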

Psychology isn’t uniquely troubled. A similar project tackled high-profile cancer biology papers from 2010 to 2012, attempting to replicate key experiments from 53 studies. For the positive findings (the kind that make headlines), only about 40% replicated successfully across multiple criteria. The results were better for null findings, where researchers originally reported no effect. Those held up about 80% of the time, which makes sense: it’s harder to accidentally find nothing.

Economics research fares similarly. A Federal Reserve analysis tried to reproduce results from 67 published papers across 13 leading journals. Without contacting the original authors, the team could replicate only 22 of them (33%). Even after reaching out to authors for help with data and code, the success rate climbed to just 49%. The researchers concluded bluntly that economics research “is usually not replicable.”

Why So Many Findings Are Wrong

In 2005, Stanford researcher John Ioannidis published what became one of the most cited papers in medical history, arguing mathematically that most published research findings are false. His reasoning centers on how studies are designed and how results get published. Most researchers use a significance threshold of p < 0.05, meaning that when no real effect exists, a study still has a 5% chance of raising a false alarm. That sounds small, but when thousands of studies test unlikely hypotheses with small sample sizes, false positives pile up fast. The studies that find exciting, positive results are far more likely to get published than the ones that find nothing, creating a published literature that’s skewed toward overstatement.

Ioannidis showed that the reliability of a finding depends heavily on context. A well-designed, large randomized trial testing a plausible hypothesis produces a true finding about 85% of the time, by his estimate. But a small exploratory study testing a long-shot idea in a trendy field might be wrong more often than it’s right. Most published research falls somewhere in between, and the pressures of academic publishing push it toward the unreliable end of that spectrum.
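Ioannidis’s core argument reduces to a short Bayesian formula: the probability that a significant finding is true depends on the prior plausibility of the hypothesis and the study’s power, not just the p-value threshold. The sketch below is a simplified version of his positive predictive value calculation (it omits his bias term, so the numbers differ somewhat from the paper’s); the example priors and power levels are illustrative assumptions.

```python
def ppv(prior: float, power: float, alpha: float = 0.05) -> float:
    """Positive predictive value: probability that a statistically
    significant finding reflects a real effect."""
    true_positives = power * prior          # real effects, detected
    false_positives = alpha * (1 - prior)   # no effect, false alarm
    return true_positives / (true_positives + false_positives)

# Large, well-powered trial of a plausible hypothesis: high reliability.
print(f"plausible, well-powered: {ppv(prior=0.50, power=0.80):.0%}")

# Small exploratory study of a long-shot idea: mostly false positives.
print(f"long shot, underpowered: {ppv(prior=0.05, power=0.20):.0%}")
```

The contrast is the whole argument in miniature: the same p < 0.05 cutoff yields a finding that is very likely true in one setting and very likely false in the other.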

Medical Practices That Turned Out Wrong

These aren’t just abstract statistical problems. They translate directly into real-world medical care. A study published in Mayo Clinic Proceedings reviewed a decade of articles from one of medicine’s most prestigious journals, looking specifically at papers that tested whether existing, standard medical practices actually worked. Of 363 such articles, 146 (about 40%) found that the established practice was ineffective or inferior to a simpler alternative. Only 38% reaffirmed the current standard. The rest were inconclusive.

This phenomenon, called “medical reversal,” means that roughly 4 in 10 established practices, once put to a rigorous test, were contradicted by better evidence. Hormone replacement therapy for heart protection, routine stenting for stable chest pain, and certain arthroscopic knee surgeries are well-known examples of practices that were widely adopted, then found to be no better (or worse) than doing less.

Drug Development as a Reality Check

The pharmaceutical industry provides another lens on how often early science turns out to be wrong. Of drugs that enter the first phase of human testing, only about 14% ultimately win FDA approval. That means roughly 86% of compounds that looked promising enough to test in people still failed somewhere along the way, often because the basic science they were built on didn’t translate to human biology as expected. Across 18 major pharmaceutical companies studied over a 16-year period, that success rate ranged from 8% to 23%.

The cost of this is staggering. An analysis published in PLOS Biology estimated that more than 50% of preclinical research in the United States is irreproducible, translating to roughly $28 billion per year spent on lab findings that other scientists cannot replicate. Even recovering half of that waste through better research practices would free up $14 billion annually.

How Science Fixes Itself (Slowly)

If science is wrong this often, why trust it at all? Because science has a built-in correction mechanism that no other way of knowing the world can match. The replication projects described above are themselves science. Researchers identified the problem, measured it, and published the results. That’s the system working, even if it works slowly.

Retractions are another part of the correction process. More than 10,000 research papers were retracted in 2023 alone, a new record. That number sounds alarming, but it partly reflects better detection tools and a growing willingness among journals to pull flawed work. A decade ago, many of those papers would have quietly stayed in the literature.

The practical takeaway is about calibrating your confidence. A single study, no matter how widely reported, is a provisional answer. When multiple independent teams reproduce a finding using different methods and populations, the odds it’s correct rise dramatically. The findings that have been tested dozens or hundreds of times (vaccines prevent disease, smoking causes cancer, the Earth is warming) are as close to certain as human knowledge gets. The flashy new result from one lab that just hit the news? Treat it as interesting but unproven. The distance between a single study and settled science is where most of the errors live.