What Is the Critical Incident Technique (CIT)?

The critical incident technique (CIT) is a research method for collecting and analyzing specific moments of human behavior that had a significant positive or negative impact on an outcome. Originally developed in the 1940s to improve pilot selection during World War II, it has since become a widely used tool in fields ranging from healthcare and education to UX design and human resources. The core idea is simple: instead of asking people about their general impressions, you ask them to describe specific, memorable events and then look for patterns across those events.

Origins in Military Aviation

The technique grew out of the Aviation Psychology Program, established by the United States Army Air Forces in the summer of 1941 to develop better procedures for selecting and classifying aircrews. Researchers needed to understand why some pilots succeeded and others failed, but vague performance reviews weren’t cutting it. They needed concrete examples of what pilots actually did in real situations.

John Flanagan, the psychologist who led much of this work, formally published the method in 1954 in the journal Psychological Bulletin. He defined CIT as “a set of procedures for collecting direct observations of human behavior in such a way as to facilitate their potential usefulness in solving practical problems and developing broad psychological principles.” The wartime research showed that gathering factual, firsthand accounts of specific behaviors was far more useful than relying on general opinions about someone’s competence.

What Counts as a “Critical Incident”

Not every observation qualifies. For an event to be an “incident,” it must be a complete, observable activity that lets you draw reasonable conclusions about the person performing it. For that incident to be “critical,” two conditions apply: the purpose of the action must be clear to the observer, and the consequences must be definite enough that there’s little doubt about the effect. In other words, a critical incident is a moment where someone’s behavior clearly helped or clearly hurt the outcome.

The technique typically focuses on rare or nonroutine events: emergencies, high-stakes decisions, moments that were especially challenging, or experiences that stood out as unusually positive or negative. Routine, everyday actions are generally not the focus because they don’t reveal as much about what drives success or failure.

The Five Steps of the Process

Flanagan described CIT as “a flexible set of principles that must be modified and adapted to meet the specific situation at hand,” but the process follows five core steps:

  • Define the general aim. Before collecting any data, you need clarity on what you’re studying. What activity or role is being examined, and what does effective performance look like? This step sets the boundaries for everything that follows.
  • Develop a collection plan. Decide who you’ll interview or observe, what questions you’ll ask, and how you’ll record the incidents. The people closest to the action, typically the primary decision-makers during the incident, are the ones you want to talk to.
  • Collect the data. This usually happens through interviews, though questionnaires and direct observation also work. Participants describe the incident from beginning to end, including both physical events (alarms, system failures, visible actions) and cognitive details (what they were thinking and perceiving at the time).
  • Analyze the data. Raw accounts are sorted into categories and themes. Researchers read through the collected incidents, summarize key points, group similar points together, and gradually build a structure of recurring patterns. After every 10 to 15 accounts, the emerging categories are reviewed and refined.
  • Interpret and report. The final step involves identifying which themes recur across incidents, whether those themes can be organized into broader concepts, and how the findings connect to existing knowledge or goals.
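
To make the collection step concrete, a single incident record might capture both the physical events and the cognitive details, tagged with a source identifier for traceability. This is only an illustrative sketch; the structure and field names are hypothetical, not part of any standard CIT tooling:

```python
from dataclasses import dataclass, field

@dataclass
class Incident:
    """One collected critical incident (hypothetical structure)."""
    source_id: str  # identifier so findings can be traced back to a participant
    outcome: str    # "positive" or "negative"
    physical_events: list = field(default_factory=list)    # alarms, failures, visible actions
    cognitive_details: list = field(default_factory=list)  # what the person thought or perceived

# Example: one incident recounted in an interview (details invented for illustration)
incident = Incident(
    source_id="P07",
    outcome="negative",
    physical_events=["infusion pump alarm", "dose double-checked late"],
    cognitive_details=["assumed the alarm was a false positive"],
)
```

Keeping the physical and cognitive strands separate mirrors the interview guidance above: participants describe what happened and what they were thinking at the time.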

How the Data Gets Analyzed

The analysis stage is where CIT moves from storytelling to structured insight. Researchers work through each collected incident, pulling out the key points and tagging each with a source identifier so findings can be traced back. Similar points get grouped together, and after the full set has been processed, the researcher reviews and reorganizes everything, placing the most frequently discussed themes at the top and less common ones at the bottom.

This can be a creative process. Researchers often construct new headings or subheadings that are more abstract or conceptual than the raw data, building higher-order categories that capture patterns across many individual stories. The second stage of analysis looks specifically for associations and explanations: which themes recur across incidents, how they connect to each other, and whether certain patterns apply more strongly to particular subgroups of participants. The goal is to move from individual descriptions to actionable patterns.
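
The grouping-and-ordering step can be sketched in a few lines. This assumes the researcher has already coded each key point with a theme label (the coding itself is the judgment-heavy part); the function and data here are hypothetical:

```python
from collections import Counter, defaultdict

def build_theme_structure(key_points):
    """Group coded key points by theme and order themes by frequency.

    key_points: list of (source_id, theme, text) tuples, where the theme
    label comes from the researcher's manual coding.
    """
    grouped = defaultdict(list)
    for source_id, theme, text in key_points:
        grouped[theme].append((source_id, text))  # keep the source tag for traceability
    counts = Counter(theme for _, theme, _ in key_points)
    # Most frequently discussed themes first, less common ones last
    return [(theme, grouped[theme]) for theme, _ in counts.most_common()]

points = [
    ("P01", "unclear alarms", "didn't know which device was alarming"),
    ("P02", "handoff gaps", "shift change lost the context"),
    ("P03", "unclear alarms", "alarm tone identical for minor and critical"),
]
structure = build_theme_structure(points)  # "unclear alarms" (2 mentions) comes first
```

The source tags carried through `grouped` are what let a final report trace any theme back to the individual accounts that produced it.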

Reliability studies have shown the method produces stable results. In one validation study, by the time two-thirds of the incidents had been classified, 95% of all subcategories had already appeared, meaning additional data largely confirmed what was already there rather than introducing new themes. The category structure also held up regardless of who conducted the interviews or how the data was collected, and repeating the categorization process produced consistent subcategories.
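
The saturation check described above (what fraction of subcategories had already appeared partway through classification) is easy to compute once incidents are labeled. A minimal sketch, with an invented label sequence and a helper name that is not from Flanagan's work:

```python
def saturation_point(classified, fraction=2/3):
    """Fraction of all distinct subcategories that had appeared by the time
    a given fraction of the incidents had been classified.

    classified: list of subcategory labels, in classification order.
    """
    cutoff = int(len(classified) * fraction)
    seen_early = set(classified[:cutoff])
    seen_all = set(classified)
    return len(seen_early) / len(seen_all)

labels = ["A", "B", "A", "C", "B", "A", "C", "B", "D"]
# The first two-thirds (6 labels) cover {A, B, C}; the full set is {A, B, C, D}
coverage = saturation_point(labels)
```

A coverage near 1.0 at the two-thirds mark, as in the validation study, suggests later data is confirming the category structure rather than expanding it.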

How CIT Is Used in Healthcare

Healthcare organizations use CIT to investigate patient safety events, near misses, and challenging clinical decisions. The Agency for Healthcare Research and Quality includes it in its workflow assessment toolkit. In a clinical setting, the process might involve interviewing the nurse or physician who was the primary decision-maker during an adverse event, constructing a detailed timeline that captures both what happened physically and what the clinician was thinking at each stage, and then probing specific decision points to understand why certain choices were made.

This approach is particularly valuable for understanding the human factors behind errors, not just what went wrong mechanically but what the person perceived, assumed, or missed in the moment. Because CIT captures cognitive details alongside observable events, it surfaces problems that standard incident reports often miss.

Applications in UX Research

UX researchers have adopted CIT to identify the moments that most strongly shape how people feel about a product. Rather than asking users to rate their general satisfaction (which tends to produce vague responses like “it was okay” or “I found it user-friendly”), CIT forces participants to recall specific, memorable moments that made a real difference in their experience.

A typical CIT-based UX study asks participants three targeted questions about each standout moment: what happened (including when, where, and who was involved), what made the event particularly positive or negative, and what the short-, medium-, and long-term consequences were. Participants might describe up to three positive and three negative incidents. In a 2025 study using Netflix as a test case, researchers used this approach in a questionnaire followed by 12 interviews, finding that CIT revealed authentic, unfiltered problems and success factors that traditional usability tests and questionnaires would have overlooked.
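
The three-question structure, with the per-valence cap on incidents, can be modeled as a small record plus a gate. All names and example content below are hypothetical, intended only to show the shape of the data such a study collects:

```python
from dataclasses import dataclass

@dataclass
class UXIncidentResponse:
    """One participant's account of a standout moment, following the
    three CIT questions (hypothetical field names)."""
    what_happened: str  # the event: when, where, who was involved
    why_critical: str   # what made it especially positive or negative
    consequences: dict  # short-, medium-, and long-term effects
    valence: str        # "positive" or "negative"

MAX_PER_VALENCE = 3  # up to three positive and three negative incidents each

def accept(responses, new):
    """Accept a new incident only while the per-valence cap allows it."""
    count = sum(1 for r in responses if r.valence == new.valence)
    return count < MAX_PER_VALENCE

example = UXIncidentResponse(
    what_happened="search returned nothing for a show I knew existed",
    why_critical="made me doubt the catalog rather than my spelling",
    consequences={"short": "abandoned the search",
                  "medium": "searched on the web instead",
                  "long": "trust in the app eroded"},
    valence="negative",
)
```

Splitting consequences across time horizons is what lets the later analysis distinguish momentary annoyances from incidents that permanently changed how someone uses the product.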

The technique works well for UX because it’s exploratory. Users themselves decide what mattered most, which means unexpected needs and frustrations surface naturally. The collected incidents then form the basis for identifying broader patterns in what makes or breaks the user experience.

Strengths and Limitations

CIT’s biggest strength is its grounding in specifics. Because participants describe real events rather than general impressions, the data tends to be concrete and actionable. The method is also flexible enough to work across vastly different domains, from aviation safety to streaming service design, with the same basic framework.

The main limitation is that it depends heavily on memory. Participants are recounting past events, and their recall may be incomplete or shaped by how things turned out. The technique also focuses on extremes by design, capturing the best and worst moments while potentially missing the slow, cumulative factors that shape everyday performance. Finally, the analysis stage requires judgment. Categorizing incidents and building themes involves interpretive decisions, which is why validation steps like repeating the categorization process and checking whether new data produces new categories are built into rigorous applications of the method.