What Is an Implicit Association Test? IAT Explained

The Implicit Association Test, or IAT, is a computer-based test that measures the strength of automatic mental associations between concepts, like race or gender, and evaluations like “good” or “bad.” First published in 1998 by researchers Anthony Greenwald, Debbie McGhee, and Jordan Schwartz, the test uses reaction time as a window into associations you may not be consciously aware of. It has since become one of the most widely discussed tools in psychology, taken by millions of people through Harvard’s Project Implicit website.

How the Test Works

The IAT is a sorting task. You sit at a computer and rapidly categorize words and images into groups using two keyboard keys. The test combines two target categories with two attribute categories in different key pairings across multiple rounds. For example, a political attitudes IAT might ask you to sort images of Democrats and Republicans alongside “good” words (like joy, wonderful) and “bad” words (like terrible, awful).

In one round, you press the left key for “Democrats” and “good words,” and the right key for “Republicans” and “bad words.” In the next round, the pairings flip: now “Republicans” and “good words” share a key, while “Democrats” and “bad words” share the other. The test measures how quickly you sort in each pairing, down to the millisecond.

The core logic is simple: if two concepts are strongly linked in your mind, sorting them together is fast and easy. If they aren’t linked, the pairing feels awkward and slows you down. That difference in speed between the two rounds is what produces your score.

What the Score Means

Your result is calculated as a “D-score”: the difference in your average response time between the two sorting conditions, divided by the standard deviation of your response times, so the result is expressed relative to your own overall speed rather than in raw milliseconds. A D-score near zero suggests no strong automatic preference for either group. The further the score moves from zero in either direction, the stronger the measured association.
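The core of that calculation can be sketched in a few lines. This is a deliberately simplified illustration, not the published scoring algorithm (which also handles error trials, practice blocks, and outlier latencies), and the reaction times below are made up for the example:

```python
import statistics

def d_score(compatible_rts, incompatible_rts):
    """Simplified D-score: the difference in mean latency between the two
    pairing conditions, divided by the pooled standard deviation of all
    trials. Illustrative only; the real scoring procedure adds several
    data-cleaning steps."""
    diff = statistics.mean(incompatible_rts) - statistics.mean(compatible_rts)
    pooled_sd = statistics.stdev(compatible_rts + incompatible_rts)
    return diff / pooled_sd

# Hypothetical reaction times in milliseconds for one respondent
compatible = [612, 588, 640, 575, 601]    # pairing that felt "easy"
incompatible = [705, 688, 732, 669, 710]  # pairing that felt "awkward"
print(round(d_score(compatible, incompatible), 2))  # 1.73
```

Dividing by the respondent’s own variability is what makes scores comparable across people: a 100 ms slowdown means something different for a fast, consistent sorter than for a slow, erratic one.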

Results are typically reported in plain language categories: slight, moderate, or strong automatic preference for one group over another. Harvard’s Project Implicit describes these labels as based on scientific conventions for communicating the size of an effect, meant to give you a rough sense of the degree of bias the test detected during that sitting. The race IAT, for instance, might report that you showed a “moderate automatic preference for White people over Black people,” or vice versa. Other available tests measure associations related to gender, age, sexuality, weight, disability, and more.

What It Can and Cannot Tell You

A meta-analysis examining the IAT’s ability to predict discrimination looked across six categories of real-world outcomes: interpersonal behavior, person perception, policy preference, subtle nonverbal behavior, response time tasks, and brain activity. The test does show correlations with these outcomes at a group level, meaning that across large samples, higher bias scores tend to correspond with more biased behavior on average.

But there’s a crucial distinction between group-level patterns and individual diagnosis. The IAT’s test-retest reliability, meaning how consistent your score is if you take it again weeks later, averages around r = .50. That means roughly half of what your score reflects is stable over time, while the other half shifts between sessions. For comparison, a blood pressure cuff that gave you a different reading half the time would raise eyebrows. This level of reliability is adequate for research comparing groups or testing whether an intervention shifted average scores, but it’s shaky ground for telling any single person “this is your level of bias.”

Averaging multiple test sessions dramatically improves consistency. One dataset showed that averaging eight IAT scores collected over two years pushed reliability up to r = .89, a level considered strong enough for individual-level conclusions. But that’s eight tests, not one.
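That jump from .50 to .89 is consistent with the classical Spearman-Brown formula for the reliability of an average of repeated measurements, assuming the sessions behave like parallel tests. A quick check:

```python
def spearman_brown(r_single, k):
    """Reliability of the average of k parallel measurements, each with
    single-session reliability r_single (Spearman-Brown formula)."""
    return k * r_single / (1 + (k - 1) * r_single)

# One IAT session at r = .50 vs. the average of eight sessions
print(round(spearman_brown(0.50, 1), 2))  # 0.5
print(round(spearman_brown(0.50, 8), 2))  # 0.89
```

The formula also shows why the improvement flattens out: going from one session to two gets you to about .67, but each additional session buys less.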

The Scientific Debate

The IAT is one of psychology’s most studied and most contested tools. Its internal consistency, a measure of whether the items within a single test session are measuring the same thing, averages around .80, which is solid. The problem is stability across sessions: your score on a Tuesday afternoon might look quite different from your score on a Thursday morning, influenced by what you recently watched, read, or experienced.

Critics argue this makes the test unreliable as a measure of any fixed “implicit attitude” inside your head. Supporters counter that some of that variability is meaningful, reflecting genuine fluctuations in how activated certain associations are at any given moment. The debate is unresolved, and both sides include prominent researchers in the field.

There’s also disagreement about what the test actually measures. Some researchers view it as a direct readout of unconscious prejudice. Others argue it captures familiarity with cultural stereotypes rather than personal endorsement of those stereotypes. You can be deeply aware that a stereotype exists, even find it repugnant, and still sort faster when that pairing comes up, simply because the association is culturally pervasive.

The IAT in Workplace Training

Many organizations use the IAT as part of diversity or bias-awareness training, but the test itself isn’t designed to reduce bias. It’s a measurement tool. The question is whether training programs built around it actually change anything.

The most rigorously studied program is a “bias habit-breaking” intervention developed at the University of Wisconsin. In a randomized controlled trial, participants who went through the training showed significant decreases in IAT scores compared to a control group, and this effect held at eight weeks. A replication study confirmed the pattern at six weeks. More notably, when a subset of participants was tested two to three years later on a separate task measuring how much effort they put into avoiding stereotypic assumptions, the training group still outperformed controls.

The most striking finding involved real hiring decisions. Academic departments randomly assigned to receive the training hired women as 47% of new tenure-track faculty in the two years following, compared to 32% in departments that didn’t receive it, a 15-percentage-point difference. These results stand out because most brief bias interventions studied in meta-analyses show little lasting effect. The researchers attribute the difference to the program’s focus on building long-term habits rather than simply raising awareness through a single session.

Taking the Test Yourself

Harvard’s Project Implicit offers free IATs covering a range of topics, including race, gender and science, age, weight, sexuality, and disability. Each test takes about 10 minutes. You’ll see your result immediately afterward, along with an explanation of what it means.

Going in, it helps to know what you’re getting: a snapshot of automatic associations as measured in that moment, not a permanent verdict on your character. Your score reflects a mix of personal attitudes, cultural exposure, and the mental state you happened to be in during those few minutes. Treat it as one data point for self-reflection rather than a definitive label.