How Do They Test Your IQ: Process and Scoring

IQ is tested through a one-on-one session with a licensed psychologist who guides you through a series of timed and untimed tasks designed to measure different aspects of how your brain processes information. The whole process typically takes about 60 to 90 minutes for the core battery, though it can stretch longer depending on the test and purpose. There’s no single “IQ quiz.” Professional IQ testing involves standardized instruments that have been refined over decades and normed against thousands of people in your age group.

What Happens During the Test

You sit across from a psychologist in a quiet room, usually at a desk or table. There’s no written exam you fill out on your own. The examiner reads instructions aloud, presents tasks one at a time, and records your responses. Some tasks are verbal: you’ll be asked to define words, explain how two concepts are similar, or answer general knowledge questions. Others are hands-on: arranging blocks to match a pattern, identifying what’s missing from a visual sequence, or solving puzzles without any words at all.

The tasks get progressively harder. You’ll start with easier items and move forward until you hit a ceiling where you can no longer answer correctly. Some subtests are timed, meaning speed matters. Others care only about whether you get the right answer, regardless of how long it takes. The examiner keeps a neutral tone throughout and won’t tell you whether your answers are right or wrong.
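The ceiling idea can be sketched in a few lines of code. This is only an illustration: the function name, the list of responses, and the rule of stopping after three misses in a row are assumptions made for the example, since each real subtest defines its own stopping rule in its manual.

```python
def administer(responses, discontinue_after=3):
    """Walk through items ordered from easiest to hardest.

    responses: list of booleans, True if the item was answered correctly.
    discontinue_after: hypothetical ceiling rule (consecutive misses before stopping).
    Returns (raw_score, items_administered).
    """
    raw_score = 0
    consecutive_misses = 0
    items_administered = 0
    for correct in responses:
        items_administered += 1
        if correct:
            raw_score += 1
            consecutive_misses = 0  # a correct answer resets the run of misses
        else:
            consecutive_misses += 1
            if consecutive_misses >= discontinue_after:
                break  # ceiling reached; later items are never presented
    return raw_score, items_administered


answers = [True, True, True, False, True, False, False, False, True, True]
print(administer(answers))  # (4, 8)
```

In this made-up run the subtest stops after the eighth item, so the last two items are never shown even though the person would have answered them correctly.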

For the most widely used adult test, the full core battery takes roughly 60 to 65 minutes. Children’s versions and more comprehensive batteries can take longer, sometimes up to two hours with breaks. The psychologist scores everything afterward using standardized tables, and you typically receive results in a follow-up session or written report.

The Major IQ Tests Used Today

The gold standard for adults is the Wechsler Adult Intelligence Scale, now in its fifth edition (WAIS-5), published in 2024. It replaced the previous version after 16 years. For children ages 6 to 16, the equivalent is the Wechsler Intelligence Scale for Children (WISC). Both produce what’s called a Full Scale IQ, or FSIQ, which is the number most people think of as “your IQ.”

The Stanford-Binet Intelligence Scales, now in its fifth edition, is the other major test. It uses 10 subtests to assess five areas of thinking: fluid reasoning, knowledge, quantitative reasoning, visual-spatial processing, and working memory. Each area is tested through both verbal and nonverbal tasks. The nonverbal portions are particularly useful for people with limited English, hearing impairments, or learning disabilities. The Stanford-Binet also offers a brief version using just two subtests (a matrix reasoning task and a vocabulary task) to produce a quick estimate of IQ.

In schools, the Woodcock-Johnson Tests of Cognitive Abilities is commonly used to evaluate students for learning disabilities or giftedness. It separates cognitive ability from academic achievement, helping educators figure out whether a student’s struggles come from how they think or from gaps in what they’ve been taught.

What the Subtests Actually Measure

The WAIS has traditionally broken your score into four main areas, each tested by multiple subtests. The newest edition splits one of them in two, as noted below.

Verbal Comprehension measures your vocabulary, general knowledge, and ability to see how concepts relate. You might be asked what a word means, how two things are alike, or to answer factual questions.
Perceptual Reasoning (called visual-spatial and fluid reasoning in newer versions) tests your ability to analyze visual patterns, solve novel problems, and think logically without relying on language. Tasks include arranging blocks to match a design, completing visual puzzles, and identifying rules in a matrix of shapes.
Working Memory tests how well you hold and manipulate information in your head. A common task is hearing a string of numbers and repeating them back, sometimes in reverse order or rearranged from smallest to largest.
Processing Speed measures how quickly you can scan simple visual information and make decisions. You might rapidly match symbols to numbers or identify which symbols in a row match a target.

Each of these four areas produces its own index score. Together, they combine into the Full Scale IQ. This means your overall number reflects a blend of very different skills. Two people with the same FSIQ can have very different cognitive profiles underneath.
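A toy comparison shows how two profiles can share an overall number while looking nothing alike. The index values are invented, and the plain average is only a stand-in: a real Full Scale IQ is derived from norm tables, not by averaging index scores.

```python
from statistics import mean

# Invented index scores for two hypothetical people.
profile_a = {"verbal": 100, "perceptual": 100, "working_memory": 100, "processing_speed": 100}
profile_b = {"verbal": 125, "perceptual": 110, "working_memory": 90, "processing_speed": 75}


def summarize(profile):
    """Return (average of index scores, gap between highest and lowest index)."""
    scores = list(profile.values())
    return mean(scores), max(scores) - min(scores)


print(summarize(profile_a))  # (100, 0)
print(summarize(profile_b))  # (100, 50)
```

Both profiles average 100, but the second has a 50-point gap between its strongest and weakest areas, which is the kind of pattern a psychologist would want to interpret rather than summarize with a single number.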

How Scores Work

IQ scores are built around an average of 100, with a standard deviation of 15. That means about two-thirds of all people (roughly 68%) score between 85 and 115. The classifications used in clinical settings break down like this: 90 to 109 is considered average, 110 to 119 is high average, 120 to 129 is superior, and 130 or above is very superior. A score of 115 puts you one standard deviation above the mean, meaning you performed better than roughly 84% of the population.
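These percentages follow directly from the bell curve. The sketch below assumes scores are exactly normally distributed with a mean of 100 and a standard deviation of 15; the function names are made up for the example.

```python
from statistics import NormalDist

# Assumption: IQ follows a normal distribution with mean 100 and SD 15.
IQ_DIST = NormalDist(mu=100, sigma=15)


def iq_percentile(iq):
    """Percent of the population expected to score below the given IQ."""
    return 100 * IQ_DIST.cdf(iq)


def share_between(low, high):
    """Percent of the population expected to score between low and high."""
    return 100 * (IQ_DIST.cdf(high) - IQ_DIST.cdf(low))


print(round(iq_percentile(115), 1))      # 84.1
print(round(share_between(85, 115), 1))  # 68.3
print(round(iq_percentile(130), 1))      # 97.7
```

So a score of 130, two standard deviations above the mean, is higher than roughly 98% of the population under this model.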

Your raw score on each subtest (how many you got right, how fast you were) gets converted into a scaled score based on how other people your age performed when the test was developed. This age-norming is important: a 75-year-old isn’t compared against a 25-year-old. The norms come from large standardization samples, often several thousand people carefully selected to represent the broader population by age, education level, and geographic region.
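Here is a rough sketch of what age-norming does to a raw score. Wechsler subtests conventionally report scaled scores with a mean of 10 and a standard deviation of 3, on a range of 1 to 19. Everything else in the example is an assumption: the age bands, the norm values, and the formula itself, since real tests use published lookup tables rather than a calculation like this.

```python
# Hypothetical norms for one subtest: age band -> (mean raw score, SD).
# These numbers are invented for illustration only.
HYPOTHETICAL_NORMS = {
    (20, 34): (30.0, 6.0),
    (70, 79): (22.0, 6.0),
}


def scaled_score(raw, age):
    """Convert a raw score to a scaled score (mean 10, SD 3) for the person's age band."""
    for (low_age, high_age), (norm_mean, norm_sd) in HYPOTHETICAL_NORMS.items():
        if low_age <= age <= high_age:
            scaled = round(10 + 3 * (raw - norm_mean) / norm_sd)
            return max(1, min(19, scaled))  # keep within the 1 to 19 range
    raise ValueError(f"no norms available for age {age}")


print(scaled_score(24, age=25))  # 7
print(scaled_score(24, age=75))  # 11
```

With these invented norms, the same raw score of 24 is below average for a 25-year-old and above average for a 75-year-old, because each person is compared only with their own age group.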

Why Online Tests Don’t Count

Free IQ tests you find online are not standardized instruments. They haven’t been normed against representative populations, they aren’t administered under controlled conditions, and no one is monitoring whether you look up answers or get help. They also tend to measure only one narrow skill, like pattern recognition, rather than the full range of abilities that make up a real IQ score.

One well-known research tool, Raven’s Progressive Matrices, does focus specifically on pattern recognition. It presents a 3×3 grid of shapes with one piece missing, and you have to figure out which option completes the pattern by identifying the rules governing each row and column. This measures fluid intelligence, your ability to reason through novel problems. But even Raven’s is designed to be administered under standardized conditions, and it captures only one dimension of intelligence. It’s not a substitute for a full IQ battery.

Who Can Administer the Test

Professional IQ tests are classified as “Level C” psychological instruments, meaning they can only be purchased and administered by people with advanced training in psychological assessment. In practice, this means a licensed psychologist or a supervised trainee working under one. The American Psychological Association’s guidelines are explicit: testing should not be done by unqualified persons, and psychologists should only provide assessment services within the boundaries of their training and experience.

This matters because proper administration requires more than just reading instructions aloud. The examiner needs to know how to establish rapport without influencing answers, how to score ambiguous responses, and how to interpret the pattern of scores in the context of the person’s background, behavior during testing, and the reason for referral. A psychologist might note, for example, that someone’s processing speed was unusually low compared to their reasoning ability, which could point toward specific neurological or attention-related explanations rather than low overall intelligence.

Why Tests Get Updated

IQ tests are re-normed every 15 to 20 years because population performance shifts over time. For most of the 20th century, average scores rose steadily, a trend known as the Flynn effect. Each new generation scored higher than the last on older tests, likely due to improvements in nutrition, education, and familiarity with abstract thinking.

That trend has largely stalled or reversed in many developed countries in the 21st century. Recent research across 48 countries using international achievement data from 2000 to 2018 found that the most economically advanced nations now show trivial or even negative score changes over time, while developing nations still show gains. Wherever average performance has shifted since a test was normed, the old norms no longer describe today’s population. After decades of rising scores, an older version of a test may produce slightly inflated scores compared to a newer one, which is one reason clinicians are encouraged to use the most current edition available.