A single subject design is a research method that tracks one person (or a small number of people) over time to test whether an intervention actually works. Instead of comparing a treatment group to a control group the way traditional experiments do, a single subject design uses the same person as their own control, measuring behavior before, during, and sometimes after an intervention. It’s widely used in education, behavioral psychology, speech therapy, and increasingly in personalized medicine.
How Single Subject Designs Work
The core logic is straightforward: measure a behavior repeatedly, introduce an intervention, and see if the behavior changes in a predictable way. Every single subject design begins with a baseline phase, where the researcher collects data on the target behavior before any intervention starts. This baseline serves as the comparison point for everything that follows.
For the baseline to be useful, it needs two qualities. First, the data should be stable, meaning the behavior isn’t bouncing wildly from one measurement to the next. Stable data make it possible to predict where future data points would fall if nothing changed. Second, the baseline shouldn’t already show a clear trend of improvement. If someone’s behavior is already getting better on its own, it becomes difficult to credit any change to the intervention. By convention, researchers collect a minimum of three baseline data points to establish stability, though more points give a clearer picture.
Once the intervention begins, the researcher continues collecting data at regular intervals and compares the new data to the baseline. They look for changes in three things: the overall amount of behavior (level), whether the data are trending upward or downward over time (trend), and how much the data points bounce around (variability). A clear shift in one or more of these parameters suggests the intervention is having an effect.
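As a rough illustration of those three parameters, here is a minimal Python sketch that summarizes a phase by its mean (level), least-squares slope (trend), and standard deviation (variability). The function name, the choice of these particular statistics, and the data are assumptions made for illustration; researchers may operationalize level, trend, and variability differently.

```python
from statistics import mean, stdev


def summarize_phase(values):
    """Summarize one phase: level (mean), trend (least-squares slope per
    session), and variability (sample standard deviation)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points per phase")
    sessions = range(n)
    x_bar = mean(sessions)
    y_bar = mean(values)
    # Ordinary least-squares slope of the measured value on session index.
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(sessions, values))
    denominator = sum((x - x_bar) ** 2 for x in sessions)
    return {
        "level": y_bar,
        "trend": numerator / denominator,
        "variability": stdev(values),
    }


# Hypothetical data: minutes on task per session.
baseline = [4, 5, 4, 5, 4]
treatment = [8, 9, 11, 12, 12]

base = summarize_phase(baseline)
treat = summarize_phase(treatment)
print("baseline: ", base)
print("treatment:", treat)
print("change in level:", treat["level"] - base["level"])
```

In this made-up example the baseline is flat and the treatment phase shows both a higher level and an upward trend, which is the kind of shift the comparison is meant to surface.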
The key to making this convincing is replication within the study itself. Every time the intervention is introduced, withdrawn, or staggered across different conditions, it creates another opportunity to demonstrate that the change wasn’t a coincidence. This built-in replication is what gives single subject designs their scientific credibility.
Common Types of Single Subject Designs
Reversal (ABAB) Design
The reversal design is the most intuitive version. The letters represent alternating phases: A is the baseline (no intervention), and B is the treatment. In a basic ABA design, a researcher measures the baseline, introduces the treatment, then removes it to see if the behavior returns to baseline levels. The stronger version, ABAB, adds the treatment back a second time. If the behavior improves during both B phases and drops during both A phases, that pattern is strong evidence the intervention caused the change. In one classic study using this design, a student’s study time was low during the first baseline, increased during the first treatment phase, dropped again when the treatment was removed, and rose once more when the treatment was reintroduced.
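The ABAB logic can be sketched as a simple check on phase means. This is only a crude illustration under stated assumptions: the function name and data are hypothetical, any change in the mean counts regardless of size, and a comparison of means is no substitute for inspecting the graphed data.

```python
from statistics import mean


def reversal_pattern_holds(phases, higher_is_better=True):
    """Check whether phase means alternate the way a reversal design predicts.

    `phases` is a chronological list of (label, values) pairs, where the
    label is "A" (baseline) or "B" (treatment). Returns True when every
    A-to-B transition moves the mean in the desired direction and every
    B-to-A transition moves it back.
    """
    means = [(label, mean(values)) for label, values in phases]
    for (prev_label, prev_mean), (label, cur_mean) in zip(means, means[1:]):
        if higher_is_better:
            improved = cur_mean > prev_mean
            worsened = cur_mean < prev_mean
        else:
            improved = cur_mean < prev_mean
            worsened = cur_mean > prev_mean
        if prev_label == "A" and label == "B" and not improved:
            return False
        if prev_label == "B" and label == "A" and not worsened:
            return False
    return True


# Hypothetical study-time data (minutes per session).
study_time = [
    ("A", [10, 12, 11]),
    ("B", [25, 27, 30]),
    ("A", [14, 13, 12]),
    ("B", [28, 31, 30]),
]
print(reversal_pattern_holds(study_time))

# Hypothetical case where the behavior does not return toward baseline.
no_reversal = [
    ("A", [10, 12, 11]),
    ("B", [25, 27, 30]),
    ("A", [28, 29, 30]),
    ("B", [28, 31, 30]),
]
print(reversal_pattern_holds(no_reversal))
```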
The obvious limitation is that some interventions can’t be “unlearned.” If you teach someone a new skill, removing the intervention doesn’t erase the skill from their memory. Reversal designs work best when the target behavior is expected to change direction when the treatment stops.
Multiple Baseline Design
When reversing an intervention isn’t practical or ethical, researchers often turn to multiple baseline designs. Instead of removing the treatment, this approach introduces it at staggered time points across different people, different behaviors, or different settings. All baselines are measured simultaneously, but the intervention is applied to each one at a different time.
The logic is elegant: if a behavior only changes after the intervention is introduced for that specific baseline, and the other baselines remain stable until their turn, it’s hard to argue the change was caused by anything other than the treatment. Staggering across more than one dimension (for example, across both participants and settings) provides additional opportunities to demonstrate that the intervention, not some outside factor, is driving the results.
Alternating Treatments Design
This design compares two or more interventions within the same person by rapidly switching between them. Rather than testing one treatment at a time across long phases, alternating treatments designs assign different interventions to different sessions in a random or counterbalanced order. The researcher then compares the data series for each treatment to see which one produces better outcomes. This is especially useful in clinical situations where a traditional no-treatment baseline isn’t feasible or ethical, since the treatments are compared directly with one another and each serves as the comparison condition for the others.
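One way to build such a session order is block randomization, sketched below in Python. This is an assumed scheme chosen for illustration, not the only way to counterbalance; the function name and the treatment labels are hypothetical.

```python
import random


def alternating_schedule(treatments, n_blocks, seed=None):
    """Build a session order in which every treatment appears exactly once
    per block, in a random order within each block."""
    rng = random.Random(seed)  # a seed makes the schedule reproducible
    schedule = []
    for _ in range(n_blocks):
        block = list(treatments)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule


# Hypothetical example: two interventions compared over five blocks.
schedule = alternating_schedule(["prompting", "modeling"], n_blocks=5, seed=1)
print(schedule)
```

Randomizing within blocks keeps the number of sessions per treatment equal while preventing either treatment from being consistently tied to a particular day or time.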
How Researchers Analyze the Data
Unlike large group studies that rely on statistical tests, single subject research depends primarily on visual analysis. Researchers graph the data from each phase and look for visible changes in level, trend, and variability between phases. A change that appears when the intervention is introduced, disappears when it’s withdrawn, and reappears when it’s reintroduced is considered strong evidence of a causal relationship.
Visual analysis isn’t just eyeballing a graph, though. Researchers follow systematic protocols that examine each phase individually and then compare adjacent phases. They look for how quickly the change happens after the intervention starts (immediacy), how large the shift is, and whether the pattern replicates across multiple opportunities within the study. The What Works Clearinghouse, which evaluates education research for the U.S. Department of Education, uses a structured review process where trained analysts classify studies as meeting standards, meeting standards with reservations, or not meeting standards, and then rate the strength of evidence as strong, moderate, or none.
Quantitative measures supplement visual analysis. One common metric is the Percentage of Nonoverlapping Data (PND): the percentage of treatment-phase data points that exceed the most extreme baseline data point in the direction of the expected change (above the highest baseline point when the goal is an increase, below the lowest when the goal is a decrease). A PND of 90% or higher is considered very effective, 70% to 89% is effective, 50% to 69% is questionable, and below 50% suggests the treatment had little effect. Other overlap measures exist as well, but PND remains the most widely cited.
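A minimal Python sketch of the PND calculation follows. It uses the usual definition, counting treatment points beyond the most extreme baseline point in the desired direction, and applies the interpretation bands given above. The function names and data are hypothetical.

```python
def pnd(baseline, treatment, higher_is_better=True):
    """Percentage of treatment points beyond the most extreme baseline
    point in the direction of expected change."""
    if not baseline or not treatment:
        raise ValueError("both phases need at least one data point")
    if higher_is_better:
        threshold = max(baseline)
        non_overlapping = sum(1 for v in treatment if v > threshold)
    else:
        threshold = min(baseline)
        non_overlapping = sum(1 for v in treatment if v < threshold)
    return 100.0 * non_overlapping / len(treatment)


def interpret_pnd(score):
    """Map a PND score to the interpretation bands described in the text."""
    if score >= 90:
        return "very effective"
    if score >= 70:
        return "effective"
    if score >= 50:
        return "questionable"
    return "little effect"


# Hypothetical data: the goal is to increase the behavior.
baseline = [4, 5, 4, 5, 4]
treatment = [5, 8, 9, 11, 12, 12, 10, 9, 11, 12]

score = pnd(baseline, treatment)
print(score, interpret_pnd(score))  # 9 of 10 points exceed 5 -> 90.0 very effective
```

Because the threshold is a single baseline point, one unusually high (or low) baseline measurement can drag PND down sharply, which is one reason it is used alongside visual analysis rather than in place of it.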
Strengths and Limitations
Single subject designs solve a problem that group studies can’t: they reveal what works for a specific individual. In a randomized controlled trial, treatment effects are averaged across dozens or hundreds of participants, which can mask the fact that some people improved dramatically while others didn’t respond at all. Single subject designs make individual variability visible rather than hiding it in a group mean.
They’re also practical for situations where large samples simply aren’t available. Rare conditions, highly specialized populations, or individualized interventions (like a customized speech therapy protocol) are difficult to study with traditional group designs. When you only have a handful of participants, a well-designed single subject study can still produce rigorous evidence.
The main limitation is generalizability. Because the study focuses on one person or a small number, the results don’t automatically apply to a broader population. Researchers address this through replication: repeating the study with different people, in different settings, or with variations of the intervention. Each successful replication builds confidence that the findings extend beyond the original participant. This accumulation of replicated single subject studies is how the field establishes broader evidence for a treatment’s effectiveness.
Another concern is rater bias. Research has shown that when the person delivering an intervention also collects the data, their knowledge of the study’s goals and current phase can influence their ratings. One study found that ratings from interventionists who knew what phase they were in differed from ratings by external observers who were unaware of the study conditions. Adding data collection by independent raters who don’t know when the intervention started helps protect against this.
N-of-1 Trials in Medicine
The medical version of single subject design is the N-of-1 trial, where an individual patient serves as the sole unit of observation. The goal is to determine the best treatment for that specific person using objective, data-driven criteria rather than relying solely on population-level evidence from large clinical trials.
In practice, an N-of-1 trial typically alternates between an active treatment and a placebo (or between two competing treatments) over multiple cycles, with the patient and often the clinician blinded to which is which. This approach has been applied across a striking range of conditions: chronic pain, ADHD, osteoarthritis, migraines, sleep disturbances, and depression, among others. In a study of 64 ADHD trials using stimulant medications, 28 led to a change in the patient’s treatment plan based on the individualized results. A chronic pain study found that 28 out of 34 patients achieved benefit from a treatment that might not have been prescribed based on group-level data alone.
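To make the cycle structure concrete, here is a small Python sketch that compares active and placebo periods within each cycle. The data layout, function name, and scores are all hypothetical, and real N-of-1 analyses typically go further (for example, accounting for carryover between periods).

```python
from statistics import mean


def cycle_differences(cycles):
    """Return placebo mean minus active mean for each cycle.

    Each cycle is a dict with "active" and "placebo" lists of scores
    recorded during that cycle's two treatment periods.
    """
    return [mean(c["placebo"]) - mean(c["active"]) for c in cycles]


# Hypothetical daily pain scores (0-10, lower is better) over three cycles.
cycles = [
    {"active": [4, 3, 4], "placebo": [6, 7, 6]},
    {"active": [3, 4, 3], "placebo": [6, 6, 7]},
    {"active": [4, 4, 3], "placebo": [7, 6, 6]},
]

diffs = cycle_differences(cycles)
print("per-cycle difference:", [round(d, 2) for d in diffs])
print("mean difference:", round(mean(diffs), 2))
print("active better in every cycle:", all(d > 0 for d in diffs))
```

A difference that points the same way in every cycle is the N-of-1 counterpart of the within-study replication that other single subject designs rely on.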
N-of-1 trials address a fundamental tension in medicine: clinical research tells you what works on average, but your doctor needs to know what works for you. These trials explore individual response variability in a structured way, often revealing that the “best” treatment for a given patient isn’t what population data would have predicted.

