How Can Machine Learning Be Used in Healthcare?

Machine learning is already embedded across healthcare, from reading medical scans to predicting which patients will end up back in the hospital within a month. The FDA has cleared over 1,400 AI-enabled medical devices as of early 2026, with the heaviest concentration in radiology, cardiovascular medicine, and gastroenterology. Here’s where the technology is making the most practical difference and where its limitations still matter.

Diagnosing Disease From Medical Images

The most mature use of machine learning in healthcare is analyzing medical images: X-rays, MRIs, CT scans, and pathology slides. Algorithms trained on millions of images can flag tumors, fractures, and other abnormalities, sometimes catching details that human eyes miss on a first pass.

In breast cancer detection, ML models analyzing MRI scans achieve a pooled sensitivity of 86% and specificity of 82%. That sensitivity is notably lower than what radiologists achieve with conventional MRI reading (97%), but the specificity is meaningfully higher (82% vs. 69%). In practical terms, the algorithm is better at correctly ruling out cancer when it isn’t there, which means fewer unnecessary biopsies and less patient anxiety. The tradeoff is that it misses some true cancers that a radiologist would catch. This is why these tools work best as a second set of eyes rather than a replacement: the algorithm flags what the radiologist might overlook, and the radiologist catches what the algorithm might miss.
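To see what those percentages mean in patient counts, here is a minimal sketch that converts sensitivity and specificity into expected outcomes for a hypothetical screening population. The population size (1,000) and 5% cancer prevalence are invented for illustration; only the sensitivity and specificity figures come from the text above.

```python
def screening_outcomes(n_patients, prevalence, sensitivity, specificity):
    """Expected patient counts for a test with given characteristics."""
    cancers = n_patients * prevalence
    healthy = n_patients - cancers
    return {
        "caught": cancers * sensitivity,                        # true positives
        "missed": cancers * (1 - sensitivity),                  # false negatives
        "cleared": healthy * specificity,                       # true negatives
        "biopsied_unnecessarily": healthy * (1 - specificity),  # false positives
    }

# 1,000 screened patients, hypothetical 5% cancer prevalence
ml = screening_outcomes(1000, 0.05, 0.86, 0.82)   # pooled ML figures
rad = screening_outcomes(1000, 0.05, 0.97, 0.69)  # radiologist figures
```

With these assumptions, the ML reading spares roughly 120 patients per 1,000 an unnecessary biopsy compared with the radiologists, at the cost of missing about five more true cancers, which is exactly the tradeoff the second-set-of-eyes workflow is designed to absorb.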

Beyond radiology, ML models in pathology can classify tissue samples, and cardiovascular algorithms can detect irregular heart rhythms from electrocardiogram data. The bulk of FDA-cleared AI devices sit in these imaging-heavy specialties.

Speeding Up Drug Discovery

Bringing a new drug from initial concept to clinical trials traditionally takes four to six years and costs hundreds of millions of dollars just in the preclinical phase. Machine learning is compressing that timeline dramatically by simulating how molecules interact with biological targets, predicting which drug candidates will fail early, and narrowing the field before expensive lab work begins.

Insilico Medicine identified a brand-new drug target for a serious lung disease (idiopathic pulmonary fibrosis) and advanced a candidate into preclinical testing in 18 months at a reported cost of just $150,000, excluding lab validation. Exscientia partnered with a pharmaceutical company to develop a drug candidate for obsessive-compulsive disorder in under 12 months, making it the first AI-designed molecule to enter human clinical trials. These are still early examples, but they illustrate the scale of time and cost savings that ML can unlock by processing genomic, protein, and chemical data simultaneously.

Personalizing Cancer Treatment

Not every patient responds the same way to chemotherapy or immunotherapy, and choosing the wrong treatment first means weeks or months of side effects with no benefit. Machine learning models are being developed to predict which specific treatments a patient is most likely to respond to, based on their tumor characteristics, imaging data, and molecular profile.

Researchers have built models that rank available drugs for individual patients using protein-level data from their tumors, helping oncologists prioritize the most promising option. Other image-based models analyze CT scans taken shortly after treatment begins to predict whether a patient’s cancer is responding to immunotherapy or chemotherapy. In bladder cancer, breast cancer, and non-small-cell lung cancer, these tools can assess treatment response and flag early signs of recurrence or spread. None of this replaces an oncologist’s judgment, but it adds a data-driven layer that can catch patterns invisible to the human eye.

Predicting Hospital Readmissions

Hospitals face significant financial penalties when patients are readmitted within 30 days of discharge, and more importantly, those readmissions reflect real suffering for patients who weren’t stable enough to go home. Machine learning models trained on electronic health records can score each patient’s risk of bouncing back.

The results are mixed but promising. One study built an algorithm specifically for patients with chronic obstructive pulmonary disease and packaged it into a clinical app. When nurses used the app to identify high-risk patients and follow a tailored care plan, the COPD readmission rate in that group dropped by 48%. But another study found no change in overall 30-day readmission rates after deploying a similar tool. The difference often comes down to whether the prediction actually changes what clinicians do. A model that accurately identifies risk but doesn’t trigger a specific intervention is just an expensive alert.
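The paragraph above hinges on the gap between predicting risk and acting on it. The sketch below shows the shape of a logistic-regression-style readmission score wired to a concrete action; the feature names, weights, and threshold are all invented for illustration and do not come from any validated model.

```python
import math

# Invented weights for an illustrative 30-day readmission risk score.
WEIGHTS = {
    "prior_admissions_12mo": 0.45,
    "copd_diagnosis": 0.60,
    "lives_alone": 0.30,
    "meds_count": 0.05,
}
INTERCEPT = -3.0

def readmission_risk(patient):
    """Logistic score in [0, 1] from a few EHR-derived features."""
    z = INTERCEPT + sum(WEIGHTS[f] * patient.get(f, 0) for f in WEIGHTS)
    return 1 / (1 + math.exp(-z))

def triage(patient, threshold=0.3):
    """The score only has value if it triggers a specific intervention."""
    if readmission_risk(patient) >= threshold:
        return "enroll in follow-up program"
    return "standard discharge"
```

The `triage` step is the part the mixed study results turn on: the same `readmission_risk` output is an expensive alert on its own, and a useful tool only when it routes high-risk patients into a tailored care plan.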

Wearable Devices and Early Warning Systems

For chronic conditions like heart failure, the goal is catching a crisis before it lands someone in the emergency room. Wearable sensors now track physical activity, heart rate variability, cardiac electrical signals, and even fluid buildup in the lungs through chest-worn patches that measure how easily electrical signals pass through tissue. Machine learning algorithms running on this data can detect the slow deterioration that precedes a heart failure hospitalization.

Patients often show signs of worsening congestion three to seven days before they’re hospitalized. The LINK-HF study, which used a wristband collecting multiple physiological signals, predicted heart failure hospitalizations with 76 to 88% sensitivity and 85% specificity, generating alerts a median of 6.5 days before admission. Another system using a wearable defibrillator vest achieved a prediction window of 32 days, though with lower accuracy (69% sensitivity, 60% specificity). That tradeoff between how far ahead you can predict and how accurate the prediction is remains a core challenge. Still, even a few days’ warning gives clinicians time to adjust medications and potentially keep patients out of the hospital entirely.

Unlocking Data Trapped in Clinical Notes

Up to 80% of the information in electronic medical records is unstructured text: doctors’ notes, discharge summaries, pathology reports, and nursing documentation. This text contains critical details about symptoms, social circumstances, medication history, and disease progression that never make it into the structured, searchable parts of the record.

Natural language processing, a branch of machine learning focused on understanding human language, can read through these notes and extract specific data points automatically. A clinical registry tracking cancer outcomes, for example, might need the exact date a tumor was first identified, what stage it was, and what treatment was chosen. Manually pulling that information from thousands of patient charts takes enormous time and labor. NLP systems can do it at scale, using approaches ranging from rule-based keyword matching to newer large language models that can interpret context and ambiguity in clinical writing. This doesn’t just save time. It makes large-scale research possible by converting mountains of narrative text into analyzable data.
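The rule-based end of that spectrum can be surprisingly simple. Here is a minimal sketch that pulls a diagnosis date and tumor stage out of free text with regular expressions; the note text is invented, and production clinical NLP handles vastly more variation than this.

```python
import re

# Simple patterns for two registry fields. Ordering IV|III|II|I matters:
# regex alternation is first-match, so longer stages must come first.
DATE_RE = re.compile(r"diagnosed on (\d{4}-\d{2}-\d{2})", re.IGNORECASE)
STAGE_RE = re.compile(r"stage\s+(IV|III|II|I)\b", re.IGNORECASE)

def extract_registry_fields(note):
    """Extract structured registry fields from one free-text note."""
    date = DATE_RE.search(note)
    stage = STAGE_RE.search(note)
    return {
        "diagnosis_date": date.group(1) if date else None,
        "stage": stage.group(1).upper() if stage else None,
    }

note = ("Patient diagnosed on 2024-03-11 with stage II invasive ductal "
        "carcinoma; chemotherapy initiated.")
```

Rules like these are transparent and auditable but brittle ("Dx 3/11/24" would slip through), which is why newer systems layer language models on top for the ambiguous cases.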

What’s Holding It Back

Despite the promise, several barriers keep machine learning from reaching its potential in healthcare, particularly in smaller or rural hospital systems that lack the technical infrastructure of major academic medical centers.

The biggest obstacle is interoperability. Patient data is scattered across departments, stored in different formats, and locked in legacy systems that don’t communicate with each other. A machine learning model trained at one hospital often can’t simply be plugged into another because the data looks different. Clinicians may record the same information in inconsistent ways, duplicate patient entries, or use outdated data standards. Without clean, standardized data flowing in, even the best algorithm produces unreliable results.
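A concrete taste of what "the data looks different" means in practice: the same smoking-status field recorded several ways across source systems, harmonized into one vocabulary before a model ever sees it. The local values and mapping below are invented for illustration.

```python
# Invented local-value -> standard-value mapping for one field.
SMOKING_MAP = {
    "current smoker": "current",
    "smoker": "current",
    "y": "current",
    "former smoker": "former",
    "quit": "former",
    "never smoker": "never",
    "n": "never",
}

def harmonize_smoking_status(raw):
    """Map one system's free-form value onto a shared vocabulary."""
    key = raw.strip().lower()
    return SMOKING_MAP.get(key, "unknown")

records = ["Current Smoker", "QUIT", " n ", "pipe occasionally"]
harmonized = [harmonize_smoking_status(r) for r in records]
# -> ["current", "former", "never", "unknown"]
```

Multiply this by hundreds of fields, dozens of source systems, and duplicate patient entries, and the scale of the interoperability problem becomes clear; the "unknown" bucket is exactly where unreliable model inputs come from.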

Scalability is another challenge. Many ML tools that perform well in research settings, where they process a fixed set of patient records retrospectively, struggle when asked to handle real-time data from hundreds or thousands of patients simultaneously. If a prediction arrives too late to influence a clinical decision, it has no value. Cloud-based systems can help with processing power, but they introduce their own security concerns, from traditional cybersecurity risks to the possibility that the model itself could leak patient information if queried by outside parties.

Bias Built Into the Data

Machine learning models learn from historical data, and healthcare data carries decades of inequity. One of the most widely cited examples involved a risk prediction algorithm used across U.S. hospitals that systematically underestimated the health needs of Black patients. The model used prior healthcare spending as a proxy for illness severity, but because Black patients had historically received less care due to systemic barriers, the algorithm interpreted their lower spending as better health. The result was that equally sick Black patients were scored as lower risk and received fewer resources.
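The proxy failure described above can be reproduced in a few lines with synthetic data: two groups with identical true illness burden, one with systematically lower historical spending. All numbers below are invented to demonstrate the mechanism, not drawn from the actual study.

```python
# Both groups contain one sicker (illness 8) and one healthier (illness 5)
# patient, but group B's historical spending is systematically lower.
patients = [
    {"id": 1, "group": "A", "true_illness": 8, "prior_spending": 12000},
    {"id": 2, "group": "B", "true_illness": 8, "prior_spending": 7000},
    {"id": 3, "group": "A", "true_illness": 5, "prior_spending": 8000},
    {"id": 4, "group": "B", "true_illness": 5, "prior_spending": 4500},
]

# A "risk" ranking that uses spending as the proxy for illness severity.
by_proxy = sorted(patients, key=lambda p: p["prior_spending"], reverse=True)
priority_ids = [p["id"] for p in by_proxy]
```

Ranking by the proxy puts patient 2 (illness 8, group B) below patient 3 (illness 5, group A) purely because of lower past spending: the equally sick patient from the under-served group is scored as lower risk, which is the mechanism the real-world algorithm exhibited.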

This isn’t an isolated case. Sepsis prediction models developed in high-income hospital systems have shown significantly reduced accuracy for Hispanic patients due to underrepresentation in the training data. During the COVID-19 pandemic, a contact tracing app in India failed to reach populations without smartphones, leaving rural and low-income communities with less public health protection. These examples illustrate a consistent pattern: when the data used to train a model doesn’t reflect the full diversity of the population it serves, the model’s blind spots fall hardest on the people who already face the greatest barriers to care.