What Is the Sally-Anne Test? Theory of Mind Explained

The Sally-Anne test is a simple puppet scenario used to measure whether a child can understand that another person holds a belief different from their own. It’s one of the most widely used tools in developmental psychology for assessing what researchers call “theory of mind,” the ability to recognize that other people have their own thoughts, knowledge, and perspectives. The test became famous after a 1985 study found that 80% of children with autism failed it, while most neurotypical children and children with Down syndrome passed.

How the Test Works

The setup involves two dolls or puppets, Sally and Anne, acting out a short scene for a child. Sally has a basket, and Anne has a box. Sally places a marble in her basket, closes it, and leaves the room. While Sally is gone, Anne takes the marble out of Sally’s basket and puts it in her own box. Sally then comes back, and the child is asked: “Where will Sally look for her marble?”

The correct answer is the basket, because that’s where Sally left it and she has no way of knowing it was moved. But to get this right, a child has to do something cognitively demanding: set aside what they know (the marble is in the box) and think about what Sally knows (she put it in the basket and never saw it move). A child who answers “the box” is going by their own knowledge of reality rather than reasoning about Sally’s perspective.

This distinction is what makes the test so useful. It cleanly separates children who can mentally step into someone else’s shoes from those who assume everyone shares the same information they have.
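The logic of the scenario can be sketched as a toy model in which the state of the world and each character’s belief about it are tracked separately. This is only an illustration under simple assumptions, not a tool used by researchers: the Agent class, the move function, and the rule that a character’s belief updates only when that character is present are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A character in the scene (hypothetical helper for this sketch)."""
    name: str
    present: bool = True
    beliefs: dict = field(default_factory=dict)


def move(world, agents, obj, location):
    """Move an object. Only agents who are present see it and update their belief."""
    world[obj] = location
    for agent in agents:
        if agent.present:
            agent.beliefs[obj] = location


world = {}
sally = Agent("Sally")
anne = Agent("Anne")
agents = [sally, anne]

move(world, agents, "marble", "basket")  # Sally puts the marble in her basket
sally.present = False                    # Sally leaves the room
move(world, agents, "marble", "box")     # Anne moves the marble to her box
sally.present = True                     # Sally comes back

# Answering from reality (what the child knows) gives the wrong prediction.
reality_answer = world["marble"]
# Answering from Sally's belief gives the correct prediction.
belief_answer = sally.beliefs["marble"]

print("Where the marble really is:", reality_answer)
print("Where Sally will look:", belief_answer)
```

A child who answers “the box” is, in effect, reading from the world dictionary, while a child who answers “the basket” is reading from Sally’s separate record of beliefs.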

What It Measures

The Sally-Anne test measures what psychologists call “first-order false belief understanding.” That’s the ability to recognize that someone else can hold a belief that is factually wrong, and to predict their behavior based on that wrong belief rather than on reality. This is a foundational piece of social cognition. Without it, you can’t fully understand why people act the way they do when they’re missing information you have.

Theory of mind is broader than just this one skill. It includes understanding emotions, intentions, desires, and more complex social reasoning. But false belief understanding is considered a key milestone because it requires a child to hold two competing representations in mind at once: what is actually true, and what another person thinks is true. The Sally-Anne test isolates that specific ability in a way that’s easy to observe and score.

When Children Typically Pass

Most neurotypical children begin passing the Sally-Anne test around age 4. Before that, children consistently point to the marble’s current location (the box), suggesting they can’t yet separate their own knowledge from Sally’s. This isn’t a lack of intelligence. It reflects a stage of brain development where several abilities, including working memory, language comprehension, and the capacity for abstract reasoning, haven’t yet come together in the right way.

A significant cognitive shift happens around this age. Children younger than 4 perform below chance on the task, meaning they’re not just guessing but are systematically drawn to the wrong answer. After age 4, the pattern flips, and most children reliably point to the basket. Cross-cultural research has found that performance on the Sally-Anne test doesn’t vary significantly by country of origin, suggesting this developmental milestone is relatively universal rather than shaped by cultural context.
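To make “below chance” concrete: with two possible locations, a child who is purely guessing should pick the basket about half the time. The sketch below computes how likely it would be to see a given number of correct answers, or fewer, if every child were guessing. The group size and number of correct answers are made-up figures for illustration, not data from any study, and the function name is an assumption of this example.

```python
from math import comb


def prob_at_most(k_correct, n_children, p_guess=0.5):
    """Probability of k_correct or fewer correct answers if every child guesses."""
    return sum(
        comb(n_children, k) * p_guess**k * (1 - p_guess) ** (n_children - k)
        for k in range(k_correct + 1)
    )


# Hypothetical example: 5 of 20 three-year-olds point to the basket.
n_children = 20
k_correct = 5
p_value = prob_at_most(k_correct, n_children)

print(f"Pass rate: {k_correct / n_children:.0%}")
print(f"Probability of a result this low from guessing alone: {p_value:.4f}")
```

In this made-up case the probability comes out at about 0.02, so a pass rate this low would be hard to explain by random guessing. That is the sense in which younger children are systematically drawn to the wrong answer.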

The Original 1985 Autism Study

The test gained its prominence from a landmark study by Simon Baron-Cohen, Alan Leslie, and Uta Frith. They gave the task to three groups: children with autism, children with Down syndrome, and neurotypical children. The results were striking: 85% of neurotypical children and 86% of children with Down syndrome passed, compared with only 20% of children with autism.

This finding was pivotal because it suggested that the social difficulties seen in autism weren’t simply a product of general intellectual disability. Children with Down syndrome, who had lower IQ scores on average than the autistic group, passed at essentially the same rate as neurotypical children. Something more specific was going on: a particular difficulty with representing other people’s mental states. The study helped launch decades of research into theory of mind as a core feature of autism.

How It Compares to Similar Tests

The Sally-Anne test isn’t the only false belief task. Another well-known version is the “unexpected contents” task, sometimes called the Smarties test. In that version, a child is shown a candy box (like a Smarties tube) that actually contains pencils. After discovering the surprise, the child is asked what another person, who hasn’t looked inside, would think is in the box. A child with false belief understanding says “candy.” A child without it says “pencils.”

The two tasks test the same underlying ability but differ in important ways. In the Sally-Anne test, both possible answers (basket and box) are physically present in the scene, so the child chooses between two visible locations. In the Smarties test, only the wrong answer (pencils) is sitting in front of the child, while the correct answer (candy) exists only as a concept. The Sally-Anne test also requires following a narrative with multiple characters and events, while the Smarties test is more direct and personal. These structural differences mean children sometimes pass one but not the other, which tells researchers something about the role of memory and attention in these tasks.

Limitations Worth Knowing

The Sally-Anne test is elegant in its simplicity, but it’s not a perfect measure. One major concern is that it demands more than just theory of mind. A child needs to understand the verbal question being asked, remember the sequence of events, and hold multiple pieces of information in working memory. A child who fails might lack theory of mind, or they might simply be struggling with the language or memory demands of the task. This is especially relevant for young children and for children with communication disorders, where language ability could easily be the bottleneck rather than social understanding.

There’s also evidence that the test might not capture everything about how children respond in real social situations. Research analyzing video recordings of children taking the test found that kids sometimes demonstrate understanding in ways that don’t fit the formal scoring criteria. A child might give a “wrong” verbal answer while looking at the correct location, or respond to subtle cues from the tester rather than reasoning about Sally’s beliefs. These interactional nuances suggest that a pass-or-fail score can oversimplify what’s actually happening in a child’s mind.

For these reasons, the Sally-Anne test is best understood as one piece of a larger assessment rather than a standalone diagnostic tool. It reveals something meaningful about a child’s social reasoning, but it doesn’t tell the whole story on its own.