What Is Real World Data? Sources, Uses & Limits

Real world data (RWD) is health information collected during routine medical care rather than in controlled research settings. The FDA defines it as “data relating to patient health status or the delivery of health care routinely collected from a variety of sources.” Every time a doctor updates your medical record, a pharmacy fills a prescription, or a fitness tracker logs your heart rate, that information can become real world data. It matters because it’s increasingly shaping which treatments get approved, which drugs stay on the market, and how medicines are monitored for safety after they reach patients.

RWD vs. Real World Evidence

These two terms are often used interchangeably, but they mean different things. Real world data is the raw information: your lab results, insurance claims, prescription records. Real world evidence (RWE) is what emerges when researchers analyze that data to draw conclusions about how a medical product actually performs. Think of RWD as the ingredients and RWE as the finished meal. A database of electronic health records is RWD. A study that mines those records to show a medication lowers hospital readmissions by 15% is RWE.
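To make the ingredients-versus-meal distinction concrete, here is a minimal sketch in Python. The record layout (`Discharge`), the field names, and every number are invented for illustration, and the 15% figure is read here as a relative reduction, which is an assumption. The rows are the RWD; the single number computed from them is a very simple piece of RWE.

```python
from dataclasses import dataclass


@dataclass
class Discharge:
    """One hypothetical hospital discharge record (a single piece of RWD)."""
    patient_id: str
    on_medication: bool
    readmitted_30d: bool


def readmission_rate(records, on_medication):
    """Share of patients in one group who were readmitted within 30 days."""
    group = [r for r in records if r.on_medication == on_medication]
    if not group:
        raise ValueError("no records in this group")
    return sum(r.readmitted_30d for r in group) / len(group)


# Invented data: 100 patients on the medication (17 readmitted)
# and 100 patients not on it (20 readmitted).
records = (
    [Discharge(f"t{i}", True, i < 17) for i in range(100)]
    + [Discharge(f"c{i}", False, i < 20) for i in range(100)]
)

treated = readmission_rate(records, on_medication=True)
untreated = readmission_rate(records, on_medication=False)

# Relative reduction: how much lower the treated rate is, as a share
# of the untreated rate.
relative_reduction = (untreated - treated) / untreated
print(f"Relative reduction in readmissions: {relative_reduction:.0%}")
```

A real study would not stop at this raw comparison. It would also have to account for differences between the two groups before treating the result as evidence.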

Where Real World Data Comes From

RWD is pulled from sources that already exist in the healthcare system, which is part of what makes it useful and part of what makes it messy.

Electronic health records (EHRs): The digital charts doctors use during appointments. These contain diagnoses, lab results, imaging reports, medications, and clinical notes.
Medical claims and billing data: Filed by hospitals, clinics, and pharmacies to insurers. Claims data reveal what procedures patients received, what drugs were dispensed, and how often they visited a provider, though they typically don’t include test results or clinical outcomes.
Disease and product registries: Organized databases that track patients with a specific condition (like a cancer registry) or patients using a specific device or drug.
Digital health technologies: Wearables and health apps that capture heart rate, sleep quality, step counts, blood glucose, blood pressure, oxygen levels, mood, medication adherence, and even seizure tracking. Consumer devices now allow people to generate continuous health data outside of any clinical setting.
Patient-reported outcomes: Surveys and questionnaires where patients describe their symptoms, side effects, quality of life, and treatment satisfaction.

The range is broad. A single patient might contribute RWD through their smartwatch, their doctor’s EHR, their insurance claims, and a patient registry, all without enrolling in a study.

How RWD Differs From Clinical Trial Data

Traditional randomized controlled trials (RCTs) remain the gold standard for proving a treatment works. But RCTs operate under strict conditions that don’t always reflect how medicine is practiced. Understanding those differences explains why RWD fills a gap.

Clinical trials recruit highly selective patient groups. Participants often must meet narrow criteria: a specific age range, no other major health conditions, no conflicting medications. That makes results clean and interpretable, but it also means the people in the trial may not represent the broader population who will actually use the treatment. Elderly patients, people with multiple chronic conditions, and pregnant women are frequently excluded. RWD, by contrast, captures information from everyone receiving care, including the patients trials leave out.

The setting is different too. In an RCT, treatments follow a fixed protocol and patients are monitored continuously. In the real world, doctors adjust doses, patients skip medications, and follow-up visits happen on irregular schedules. RWD reflects these variable patterns, which makes it better at showing how a treatment performs under typical conditions rather than ideal ones.

Cost and time also play a role. RCTs are expensive and can take years to complete. Studies using existing health records or claims databases cost less and can be completed faster, since the data already exists. This is especially valuable for rare diseases, where recruiting enough trial participants is difficult, or for long-term safety monitoring, where waiting a decade for RCT results isn’t practical.

How Regulators Use It

The FDA has been steadily expanding its framework for incorporating RWD into regulatory decisions. In December 2025, the agency issued updated guidance on using real world evidence to support regulatory submissions for medical devices, replacing earlier guidance from 2017. The document lays out how the FDA evaluates whether RWD is of high enough quality to generate evidence that can influence approval decisions.

On the drug side, the FDA uses RWE to monitor safety after a product reaches the market, to support approval of new uses for existing drugs, and in some cases to serve as a comparison group in clinical trials. Rather than recruiting a separate control arm of patients receiving a placebo, researchers can use historical RWD from similar patients to establish a baseline. This approach, sometimes called a synthetic control arm, can accelerate trials for serious diseases where giving patients a placebo would be unethical.

Europe has built its own infrastructure. The European Medicines Agency launched DARWIN EU in 2022, a network that now draws on roughly 280 million patient records across about 40 data partners. All data is converted into a standardized format so it can be analyzed consistently. The EMA uses this network to support labeling changes, monitor drug safety, assess post-approval study feasibility, and prepare for public health emergencies. In March 2025, the European Health Data Space regulation entered into force, further formalizing how health data can be shared and used across the EU.

Practical Applications Beyond Approval

Drug approval is just one use. Once a medication or device is on the market, RWD becomes the primary way to track its performance over time. Clinical trials typically follow patients for months to a few years. RWD can capture outcomes over decades. If a drug causes a rare side effect that only appears after five years of use, claims data and EHRs are far more likely to catch it than the original trial.

Hospitals and health systems use RWD to compare the effectiveness of different treatment approaches for the same condition, identify which patient populations respond best to specific therapies, and track outcomes across facilities. Insurers use it to evaluate whether expensive treatments deliver results that justify their cost. Public health agencies use it to monitor disease trends, vaccine uptake, and medication shortages in near-real time.

The growing role of wearables and health apps is expanding what counts as RWD. Researchers studying epilepsy, for example, have identified heart rate, sleep quality, body movement, breathing rate, mood, and concentration as particularly useful data types when collected through consumer devices. In oncology, patient-reported side effects, quality of life scores, and treatment satisfaction add dimensions that clinical records alone miss. These streams of continuous, patient-generated data capture health between appointments, not just during them.

Challenges and Limitations

RWD’s biggest advantage is also its biggest problem: it comes from the real world, which is messy. Several challenges limit how much regulators and researchers can trust it.

Data standardization remains a major gap. Different hospitals, countries, and software systems store health information in different formats, using different coding systems and terminology. This makes it difficult to combine data across institutions without significant cleanup. DARWIN EU addresses this in Europe by requiring all partners to convert data into a common format, but globally, standardization is still uneven.
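As a rough illustration of what converting data into a common format involves, the sketch below maps two hypothetical sites' glucose results onto one shared record shape. The site names, local codes, and output fields are all invented and do not describe the format any real network uses. The conversion factor follows from the molar mass of glucose (about 180.16 g/mol).

```python
# 1 mmol/L of glucose is about 18.016 mg/dL.
MGDL_PER_MMOLL_GLUCOSE = 18.016

# Hypothetical lookup from (site, local code) to a shared concept name.
LOCAL_TO_COMMON = {
    ("site_a", "GLU"): "glucose",
    ("site_b", "bld_gluc"): "glucose",
}


def to_common(site, record):
    """Convert one site-specific lab record to the shared shape.

    Returns None for codes with no mapping, so they can be reviewed
    by a person instead of being guessed at.
    """
    concept = LOCAL_TO_COMMON.get((site, record["code"]))
    if concept is None:
        return None

    value, unit = record["value"], record["unit"]
    if unit == "mmol/L":
        value = value * MGDL_PER_MMOLL_GLUCOSE
    elif unit != "mg/dL":
        raise ValueError(f"unexpected unit: {unit}")

    return {"site": site, "concept": concept, "value_mg_dl": round(value, 1)}


site_a_row = to_common("site_a", {"code": "GLU", "value": 99.0, "unit": "mg/dL"})
site_b_row = to_common("site_b", {"code": "bld_gluc", "value": 5.5, "unit": "mmol/L"})
unmapped = to_common("site_b", {"code": "xyz", "value": 1.0, "unit": "mg/dL"})
print(site_a_row, site_b_row, unmapped)
```

Even this toy version shows why cleanup is costly: every local code and unit needs an explicit, checked mapping before records from different systems can be compared.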

Incomplete data is common. A patient might see one specialist in one health system and a different doctor in another, with neither record capturing the full picture. Claims data can show whether a patient had a test ordered but often won’t include the result. EHRs might be missing information that was discussed verbally but never documented. These gaps can distort study findings.

Selection bias is often considered the most significant risk in RWD analysis. Unlike a randomized trial, where patients are assigned to treatment groups by chance, real world treatment decisions are influenced by a patient’s disease severity, preferences, insurance coverage, and geography. Sicker patients might receive a more aggressive drug, making that drug appear less effective in RWD simply because its users started out worse, a pattern often called confounding by indication. Researchers use statistical techniques to adjust for these differences, but no method fully eliminates the problem.
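The distortion described above can be shown with a small worked example. All counts below are invented. Drug A is the more aggressive option and goes mostly to severe patients. The adjustment shown, direct standardization across severity groups, is just one simple technique among many, and it only corrects for the one factor it is given.

```python
# (severity, drug) -> (patients, recovered). Invented numbers.
counts = {
    ("severe", "A"): (80, 48),  # 60% recover
    ("severe", "B"): (20, 10),  # 50% recover
    ("mild", "A"): (20, 19),    # 95% recover
    ("mild", "B"): (80, 72),    # 90% recover
}


def crude_rate(drug):
    """Recovery rate for a drug, ignoring how sick its patients were."""
    patients = sum(n for (_, d), (n, _) in counts.items() if d == drug)
    recovered = sum(r for (_, d), (_, r) in counts.items() if d == drug)
    return recovered / patients


def standardized_rate(drug):
    """Recovery rate if the drug's patients had the same severity mix
    as the whole population (direct standardization)."""
    total = sum(n for n, _ in counts.values())
    rate = 0.0
    for severity in ("severe", "mild"):
        stratum_size = sum(n for (s, _), (n, _) in counts.items() if s == severity)
        n, recovered = counts[(severity, drug)]
        rate += (recovered / n) * (stratum_size / total)
    return rate


for drug in ("A", "B"):
    print(drug, f"crude={crude_rate(drug):.3f}", f"adjusted={standardized_rate(drug):.3f}")
```

In the raw totals, drug A looks worse (67% versus 82% recovery), even though it does better than drug B within both the severe and the mild group. Once severity is accounted for, the ordering flips. Real analyses face the harder problem that many of the factors driving treatment choice are never recorded.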

Privacy adds another layer of complexity. Health data is sensitive, and regulations governing its use vary by country and sometimes by state. Linking records across databases, which is often necessary to build a complete patient picture, raises both technical and ethical questions about de-identification and consent.
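One common building block for linking records without sharing raw identifiers is a keyed hash. The sketch below uses Python’s standard `hmac` and `hashlib` modules. The identifiers, the key, and the two toy datasets are invented. This is pseudonymization only, not full de-identification: the remaining fields can still identify someone, and anyone holding the key can recreate the tokens.

```python
import hashlib
import hmac


def pseudonymize(patient_identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 token.

    The same identifier and key always give the same token, so records
    can be matched across datasets without exposing the identifier.
    """
    normalized = patient_identifier.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in practice it would be held by a trusted party.
key = b"demo-key-not-for-real-use"

# Two invented datasets that write the same patient's identifier differently.
ehr = {pseudonymize("MRN-001", key): {"diagnosis": "epilepsy"}}
claims = {pseudonymize(" mrn-001 ", key): {"drug_dispensed": True}}

# Link on the token: keep patients present in both sources.
linked = {
    token: {**ehr[token], **claims[token]}
    for token in ehr.keys() & claims.keys()
}
print(len(linked), "patient linked")
```

Whether linkage of this kind is permitted at all, and under what consent, is a legal and ethical question that the code does not settle.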

Why It Matters for Patients

If you’ve ever wondered whether a treatment that worked in a clinical trial will work for someone like you, RWD is the mechanism increasingly being used to answer that question. Trials often exclude older adults, people with multiple conditions, and underrepresented ethnic groups. RWD draws from broader populations, so it can reveal whether a drug’s benefits hold up across the diversity of people who actually take it. It can also surface safety signals faster than waiting for the next scheduled trial review. As health systems, regulators, and researchers build better infrastructure to collect and standardize this data, the evidence base for medical decisions grows more representative of how health care actually happens.