Tests are timed for a mix of reasons: to measure how quickly you can apply knowledge under pressure, to keep scoring conditions identical for every test-taker, and to make large-scale testing logistically possible. But the rationale shifts depending on the type of test, and over a century of research shows that time limits don’t always measure what test designers think they measure.
Standardization: The Same Conditions for Everyone
The most common justification for timing a test is standardization. When every person taking the SAT, ACT, or a medical licensing exam sits down under the same clock, the scores become directly comparable. If one student got 90 minutes and another got three hours, you couldn’t meaningfully compare their results. A fixed time limit creates a controlled condition, the same way a scientific experiment controls variables so the results mean something.
This logic applies most forcefully to high-stakes admissions and licensing exams. The ACT, for example, gives you 35 minutes for 50 English questions (about 42 seconds each), 50 minutes for 45 math questions (roughly 67 seconds each), and 40 minutes for 36 reading questions (also about 67 seconds each). The USMLE Step 1 medical licensing exam divides its day into seven 60-minute blocks, each containing up to 40 questions. These limits exist partly so that a score earned in one testing center in one city means the same thing as a score earned anywhere else.
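The per-question arithmetic above is easy to verify with a few lines of Python (section times and question counts are the figures quoted above; the dictionary layout is just for illustration):

```python
# Per-question pacing for the ACT sections described above.
# Format: section name -> (total seconds allowed, number of questions)
sections = {
    "English": (35 * 60, 50),
    "Math":    (50 * 60, 45),
    "Reading": (40 * 60, 36),
}

for name, (seconds, questions) in sections.items():
    print(f"{name}: {seconds / questions:.0f} seconds per question")
# → English: 42, Math: 67, Reading: 67
```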
Speed Tests vs. Power Tests
Not all timed tests work the same way. Psychometricians, the people who design and study tests, draw a sharp line between two types: speed tests and power tests.
A pure speed test gives you very easy questions and measures how many you can complete in a set time. The challenge isn’t figuring out the answer; it’s working fast enough. Think of a basic multiplication drill where a second-grader has to solve as many single-digit problems as possible in two minutes. As the psychometrician H. Gulliksen defined it back in 1950, a pure speed test is “composed of items so easy that the subjects never give the wrong answer to any of them.”
A pure power test is the opposite. Every question gets attempted, there’s no clock pressure, and the only thing that matters is how many you get right. The questions get progressively harder, and your score reflects the difficulty level you can handle, not how fast you work.
Most real-world tests fall somewhere between these two extremes. The SAT isn’t purely a speed test, but its time constraints are tight enough that many students don’t finish every section comfortably. This means the score reflects some blend of knowledge and speed, and separating those two factors is surprisingly difficult.
Measuring Automaticity
In certain contexts, speed genuinely matters. When elementary school teachers give timed math fact quizzes, the goal isn’t to stress kids out. It’s to check for automaticity: whether a student can recall that 7 × 8 = 56 instantly, without counting on fingers or working through a strategy. Automaticity with basic facts frees up mental bandwidth for harder math later. If a student needs 15 seconds to recall a multiplication fact, they’ll struggle with long division or algebra, where that fact is just one small step in a longer problem.
Timed testing is the most common tool for measuring this kind of fluency. That said, it’s a blunt instrument. Researchers have pointed out that timed drills measure speed and accuracy but miss whether students have developed flexible thinking about numbers. An untimed interview, where a teacher simply notes how long a student pauses before answering, can capture automaticity without the pressure of a countdown.
Practical and Logistical Reasons
Beyond the science, there are straightforward practical reasons tests are timed. Testing centers need to schedule multiple sessions per day. Proctors need defined shifts. Room rentals cost money by the hour. And the longer a test session runs, the greater the opportunity for cheating, whether that means sneaking a glance at a phone or simply having more time to receive outside help.
For a test like the USMLE Step 1, which already runs eight hours in a single day, removing time limits would make administration nearly impossible. Even universities giving a midterm in a lecture hall during a 75-minute class period have no practical option for open-ended timing. The clock is, in part, a constraint of the physical world.
The Case Against Timing
Despite how universal timed testing is, the psychometric evidence against it is surprisingly strong. Measurement experts have argued for over a century that putting time limits on knowledge-based tests introduces what the field calls "construct-irrelevant variance," meaning the scores end up reflecting something other than what the test is supposed to measure.
The core problem is that speed and knowledge are genetically and cognitively distinct. A large-scale study published in Nature Communications found that cognitive processing speed and cognitive processing accuracy are fundamentally independent dimensions of brain function, with different genetic architectures and different patterns of brain activity. Accuracy correlated almost perfectly with general intelligence, while speed did not. When a timed test blends the two together into one score, it muddies the picture.
Reliability is another issue. Test reliability means that if you took the same test twice, you’d get a similar score both times. Timed tests appear highly reliable by standard statistical measures, but experts have called this a mirage. The apparent consistency comes from the fact that people tend to work at a steady pace, so the pattern of answered-versus-unanswered questions looks similar across attempts. That consistency reflects work speed, not knowledge. Anne Anastasi, one of the most influential psychologists in testing history, warned that the reliability scores of timed tests “may be completely meaningless” when calculated using standard methods.
There’s also the problem of what happens when time runs short. Research on test “speededness” shows that as the clock winds down, people shift from genuinely working through problems to guessing randomly. This creates a situation where the last portion of a test measures something entirely different from the first portion, biasing both individual scores and the statistical properties of the test items themselves.
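A minimal sketch of that shift, with made-up parameters: simulated examinees answer at an assumed 75% accuracy while genuinely working, then guess randomly among four options once time runs out. Accuracy on the final items collapses toward the 25% chance level, so those items measure luck rather than knowledge:

```python
import random

random.seed(1)
N_ITEMS, OPTIONS = 40, 4
P_KNOW = 0.75  # assumed accuracy while genuinely working through problems

def simulate(run_out_at):
    """One examinee: works until time runs out, then guesses randomly."""
    answers = []
    for i in range(N_ITEMS):
        if i < run_out_at:
            answers.append(random.random() < P_KNOW)        # real attempt
        else:
            answers.append(random.randrange(OPTIONS) == 0)  # 1-in-4 blind guess
    return answers

results = [simulate(run_out_at=30) for _ in range(2000)]
first_half = sum(sum(r[:20]) for r in results) / (2000 * 20)
last_ten   = sum(sum(r[30:]) for r in results) / (2000 * 10)
print(f"while working: {first_half:.2f}, after time ran out: {last_ten:.2f}")
```

The two printed accuracies land near 0.75 and 0.25: the same examinees, the same "test," but the final stretch is measuring something entirely different.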
Who Gets Hurt by Time Limits
Time limits don’t affect everyone equally. Students with ADHD, learning disabilities like dyslexia, or anxiety disorders may know the material just as well as their peers but process it at a different pace. Under the Americans with Disabilities Act, testing entities are required to offer accommodations like extended time. A student with a documented reading disorder, for instance, may receive double time on standardized exams.
The ADA makes clear that a history of academic success doesn’t disqualify someone from needing accommodations. A student with dyslexia who has earned strong grades may have done so precisely because of the extra time and effort they invest in reading, and a timed test can erase that effort by introducing a barrier that has nothing to do with their actual knowledge.
Common accommodations include time-and-a-half or double time, and they apply across contexts from middle school Section 504 plans to professional licensing exams. The fact that these accommodations exist at all is an implicit acknowledgment that time limits measure something beyond the stated content of the test.
Why Timed Tests Persist
If the evidence against timing is so strong, why do virtually all major exams still use a clock? The answer is partly practical (logistics, cost, security) and partly institutional inertia. Testing organizations have decades of score data built on timed conditions, and changing the format would make new scores incomparable to old ones. There’s also a real-world argument that many jobs require quick decision-making under pressure, so timed tests serve as a rough proxy for performance in fast-paced environments like emergency medicine or air traffic control.
Still, the trend in some areas is toward more generous time limits. The ACT recently shortened its test by removing the science section as a requirement, reducing total testing time to about two hours and five minutes for the core sections. The digital SAT, restructured in 2024, similarly adjusted its pacing. These changes reflect a growing awareness that extremely tight time pressure can distort what a test is actually measuring. The debate isn’t really about whether to time tests at all. It’s about how tight the limit should be before the clock starts measuring speed instead of knowledge.

