CMS data is the massive collection of healthcare information gathered by the Centers for Medicare & Medicaid Services, the federal agency that administers Medicare, Medicaid, the Children’s Health Insurance Program (CHIP), and the Health Insurance Marketplace. It covers enrollment records, medical claims, prescription drug details, provider quality ratings, and financial transparency reports for a population of nearly 70 million Medicare beneficiaries alone, plus tens of millions more in Medicaid and CHIP. Researchers, journalists, policymakers, and healthcare organizations use this data to study treatment patterns, measure hospital quality, track spending, and shape health policy.
What CMS Data Actually Includes
CMS collects information every time someone enrolled in a government health program sees a doctor, fills a prescription, or stays in a hospital. That information falls into several broad categories: enrollment and demographics, medical claims, prescription drug records, quality measures, and financial disclosures. Together, these datasets form one of the largest and most detailed pictures of healthcare delivery in the United States.
At the most basic level, enrollment files contain dates of Medicare coverage, the specific programs a person is enrolled in, demographic details like age, sex, race, and ZIP code, and indicators for chronic conditions including diabetes, heart failure, mental health disorders, and substance use disorders. This information provides the foundation that makes all other CMS data useful, because it tells researchers who is covered, where they live, and what health challenges they face.
Claims Data: The Core of CMS Records
The bulk of CMS data comes from claims, the billing records generated each time a healthcare provider delivers a service and seeks payment. These are organized by the different parts of Medicare.
- Part A (hospital and facility claims) captures summary information from hospitalizations, detailed inpatient claims, billing from skilled nursing facilities for post-hospital care, and hospice services.
- Part B (outpatient and physician claims) covers visits to community-based and hospital-based doctors’ offices, lab work, durable medical equipment like wheelchairs or oxygen tanks, and medications administered in clinics or infusion centers.
- Part D (prescription drugs) records the name of each medication dispensed, its dose, the quantity and days supplied, and cost and payment information.
These claims files are extraordinarily detailed. A single hospitalization generates records showing diagnosis codes, procedures performed, length of stay, and what Medicare paid. Multiply that across nearly 70 million enrollees, and you get a dataset that lets analysts track everything from opioid prescribing trends to regional differences in knee replacement rates.
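As a rough illustration of that kind of regional comparison, the sketch below counts one procedure per 1,000 enrollees in each region. The records, the field names (`bene_id`, `region`, `procedure`), and the enrollee counts are all invented for the example and do not reflect the layout of any actual CMS file.

```python
from collections import Counter

# Made-up claim records; real CMS claims files use different layouts and codes.
claims = [
    {"bene_id": "A1", "region": "Northeast", "procedure": "knee_replacement"},
    {"bene_id": "B2", "region": "Northeast", "procedure": "hip_replacement"},
    {"bene_id": "C3", "region": "South", "procedure": "knee_replacement"},
    {"bene_id": "D4", "region": "South", "procedure": "knee_replacement"},
]

# Made-up enrollee counts per region (the denominator would come from enrollment data).
enrollees = {"Northeast": 1000, "South": 500}


def rate_per_1000(claims, enrollees, procedure):
    """Return the number of matching procedures per 1,000 enrollees in each region."""
    counts = Counter(c["region"] for c in claims if c["procedure"] == procedure)
    return {region: 1000 * counts[region] / n for region, n in enrollees.items()}


knee_rates = rate_per_1000(claims, enrollees, "knee_replacement")
print(knee_rates)
```

Dividing by enrollment rather than comparing raw counts is what makes regions of different sizes comparable, which is why the enrollment files matter as much as the claims themselves.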
Medicaid and CHIP Data
Medicaid and CHIP data flows through a separate system called the Transformed Medicaid Statistical Information System (T-MSIS), which standardizes reporting across the states, the District of Columbia, and U.S. territories. Because Medicaid is jointly run by federal and state governments, each state historically collected data in its own format. T-MSIS brought those formats into alignment so the data could be compared nationally.
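The idea of aligning formats can be pictured as renaming each state's fields to one shared schema. The sketch below is only a conceptual illustration: the state labels, the field names, and the mapping table are hypothetical and say nothing about how T-MSIS is actually implemented.

```python
# Hypothetical field names for two states; neither mapping reflects a real state layout.
STATE_FIELD_MAPS = {
    "State A": {"member_id": "beneficiary_id", "paid": "paid_amount"},
    "State B": {"recip_no": "beneficiary_id", "amt_paid": "paid_amount"},
}


def to_common_schema(state, record):
    """Rename a state-specific record's fields to the shared (hypothetical) schema."""
    mapping = STATE_FIELD_MAPS[state]
    common = {mapping[key]: value for key, value in record.items() if key in mapping}
    common["state"] = state
    return common


# Made-up records, one per state, each in its own format.
raw = [
    ("State A", {"member_id": "001", "paid": 250.0}),
    ("State B", {"recip_no": "XZ9", "amt_paid": 75.5}),
]

harmonized = [to_common_schema(state, rec) for state, rec in raw]

# Once every record shares the same field names, national totals are a simple sum.
total_paid = sum(row["paid_amount"] for row in harmonized)
print(harmonized, total_paid)
```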
The T-MSIS dataset contains beneficiary eligibility and enrollment records, utilization and cost data, payment information, provider qualifications and affiliations, managed care plan participation details, and records of third-party liabilities like private insurance that offset Medicaid costs. This is particularly valuable for studying low-income populations, children’s health coverage, and how Medicaid expansion has affected access to care in different states.
Quality Ratings and Patient Experience
Beyond billing records, CMS collects and publishes quality data on hospitals, nursing homes, home health agencies, and other providers. If you’ve ever used Medicare’s Care Compare tool to look up a hospital’s star rating, you were browsing CMS quality data.
Hospital star ratings weigh measures like mortality rates, patient safety incidents, readmission rates, and timeliness of care. Patient experience scores come from the HCAHPS survey, a standardized questionnaire given to patients after a hospital stay that asks about communication with nurses and doctors, responsiveness of hospital staff, cleanliness, and discharge instructions. Home health agencies receive separate quality ratings based on eight measures of care processes and patient outcomes, plus their own patient survey scores. All of this is published publicly so patients can compare providers before choosing where to receive care.
Open Payments: Financial Transparency Data
CMS also runs the Open Payments program, a national disclosure database that tracks payments from drug and medical device companies to physicians and teaching hospitals. If a pharmaceutical company pays a doctor a consulting fee, covers their travel to a conference, or provides free meals, that transaction ends up in the Open Payments database.
Companies report these payments annually, covering the full calendar year from January through December. They submit the data to CMS in February and March of the following year. Physicians then get a 45-day window starting April 1 to review records attributed to them and dispute anything they believe is inaccurate. CMS publishes the finalized data by June 30 each year. The database also captures ownership or investment interests that physicians or their immediate family members hold in reporting companies. Anyone can search this data for free on the CMS website.
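The review window is simple date arithmetic. The sketch below works out when a 45-day window that opens April 1 would close, assuming April 1 counts as day 1; that counting convention and the function name `review_window` are assumptions made for the example, not something stated by CMS here.

```python
from datetime import date, timedelta


def review_window(year, days=45):
    """Return (start, end) of the review window, counting April 1 as day 1 (an assumption)."""
    start = date(year, 4, 1)
    end = start + timedelta(days=days - 1)
    return start, end


start, end = review_window(2025)
publication_deadline = date(2025, 6, 30)

print("Review opens:", start.isoformat())
print("Review closes:", end.isoformat())
print("Days between close and publication deadline:", (publication_deadline - end).days)
```

Under that assumption the window closes in mid-May, which leaves roughly six weeks for corrections before the June 30 publication date.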
Public vs. Restricted Access
Not all CMS data is equally accessible. The agency releases information at three levels of detail, each with different privacy protections.
Public Use Files are freely available to anyone. They’ve been stripped of all information that could identify individual patients and generally contain aggregate-level statistics. No application, no fees, no review process required. These are useful for broad trend analysis but lack the patient-level granularity that most detailed research requires.
Limited Data Sets sit in the middle. They contain more detail than public files but still exclude direct identifiers like names and Social Security numbers. Researchers need a Data Use Agreement with CMS and must pay a fee. Processing typically takes two to three weeks.
Research Identifiable Files (RIFs) are the most detailed and most restricted. They contain beneficiary-level protected health information and allow researchers to request customized cohorts, such as all diabetic patients in a specific state, or to link CMS records with other datasets using a unique beneficiary identifier. Getting access requires a formal research application, a Data Use Agreement, and review by CMS’s Privacy Board to ensure only the minimum necessary data is requested. The process typically takes three to five months. CMS established the Research Data Assistance Center (ResDAC) specifically to help researchers navigate this process and to advise them on which files to request.
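The three tiers can be summarized as a small lookup table. The sketch below encodes only the requirements and waiting times described above; the class name `AccessTier`, the helper `pick_tier`, and its two yes/no questions are hypothetical simplifications, not an official CMS decision procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessTier:
    name: str
    requirements: tuple  # steps named in the text above
    typical_wait: str


TIERS = {
    "public": AccessTier("Public Use File", (), "none"),
    "limited": AccessTier(
        "Limited Data Set", ("Data Use Agreement", "fee"), "two to three weeks"
    ),
    "identifiable": AccessTier(
        "Research Identifiable File",
        ("research application", "Data Use Agreement", "Privacy Board review"),
        "three to five months",
    ),
}


def pick_tier(needs_identifiable_data, needs_more_than_aggregates):
    """Return the least restricted tier that covers the stated need (simplified)."""
    if needs_identifiable_data:
        return TIERS["identifiable"]
    if needs_more_than_aggregates:
        return TIERS["limited"]
    return TIERS["public"]


choice = pick_tier(needs_identifiable_data=False, needs_more_than_aggregates=True)
print(choice.name, "->", ", ".join(choice.requirements), "|", choice.typical_wait)
```

Starting from the least restricted tier that answers the question keeps both cost and waiting time down, which is the same logic behind the minimum-necessary review.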
How CMS Data Is Shared Electronically
CMS has been pushing to make its data more interoperable, meaning easier to exchange electronically between different health systems, insurance plans, and patient-facing apps. A 2020 federal rule, the CMS Interoperability and Patient Access final rule, requires health insurers participating in Medicare, Medicaid, and the Marketplace to build application programming interfaces (APIs) that let patients access their own claims and health data through third-party apps on their phones or computers.
These APIs use a technical standard called HL7 FHIR (Fast Healthcare Interoperability Resources), which is essentially a common language that allows different health IT systems to share data without custom translations. For patients, this means that, in principle, they can download their Medicare claims history, lab results, and other records into a health app of their choosing. For providers and insurers, it means smoother data exchange when patients move between plans or health systems.
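To give a sense of what an app receives, the sketch below parses a small FHIR-style response offline. In FHIR, search results arrive as a `Bundle` of entries, and claims are represented by the `ExplanationOfBenefit` resource; the sample data here is made up and heavily trimmed, and the helper `summarize_claims` is a hypothetical name for the example. No real endpoint is contacted.

```python
import json

# Made-up, heavily trimmed FHIR R4 Bundle; real responses carry many more fields.
sample_bundle = json.loads("""
{
  "resourceType": "Bundle",
  "type": "searchset",
  "entry": [
    {"resource": {"resourceType": "ExplanationOfBenefit", "id": "eob-1",
                  "billablePeriod": {"start": "2024-01-10", "end": "2024-01-10"},
                  "payment": {"amount": {"value": 120.50, "currency": "USD"}}}},
    {"resource": {"resourceType": "ExplanationOfBenefit", "id": "eob-2",
                  "billablePeriod": {"start": "2024-03-02", "end": "2024-03-05"},
                  "payment": {"amount": {"value": 980.00, "currency": "USD"}}}}
  ]
}
""")


def summarize_claims(bundle):
    """Return (id, service start date, payment amount) for each claim in the bundle."""
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "ExplanationOfBenefit":
            continue  # skip any non-claim resources mixed into the bundle
        start = resource.get("billablePeriod", {}).get("start")
        amount = resource.get("payment", {}).get("amount", {}).get("value")
        rows.append((resource.get("id"), start, amount))
    return rows


claim_rows = summarize_claims(sample_bundle)
for row in claim_rows:
    print(row)
```

Because every compliant insurer returns the same resource shapes, the same parsing code can in principle work across plans, which is the practical payoff of a shared standard.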
Why CMS Data Matters
The sheer scale of CMS data makes it one of the most powerful tools in American health policy. With nearly 70 million people on Medicare as of December 2025, plus the Medicaid and CHIP populations, the dataset captures a significant share of all healthcare delivered in the country. Researchers use it to identify which treatments work best for older adults, track the spread of opioid prescriptions, evaluate whether hospitals are improving over time, and measure racial and geographic disparities in care. Health systems use it to benchmark their performance against national averages. Journalists use it to investigate overcharging, provider quality, and industry financial relationships.
For everyday people, the most immediately useful pieces of CMS data are the publicly searchable tools: Care Compare for checking hospital and nursing home quality, and Open Payments for seeing whether your doctor receives money from pharmaceutical or device companies. The underlying claims and enrollment databases power much of what we know about how American healthcare actually functions.

