Preemptive multitasking is a method operating systems use to run multiple programs at the same time by forcibly switching between them. Instead of waiting for a program to voluntarily give up control of the processor, the operating system interrupts it after a tiny window of time (typically 10 to 100 milliseconds) and hands the processor to the next program in line. This is how every modern desktop, laptop, and smartphone keeps dozens of apps running smoothly without any single one freezing the whole system.
How the Operating System Takes Control
At the heart of preemptive multitasking is a component called the scheduler. The scheduler’s job is to decide which program gets to use the processor and for how long. When a program’s time is up, the operating system doesn’t ask politely. It forces the program to pause, saves its current state, and loads a different program to run next. This forced handoff is called a context switch, and it happens so fast (the direct bookkeeping typically takes only a few microseconds, though cache effects can push the true cost higher) that you never notice it.
The mechanism that makes this possible is a hardware timer built into the processor. This timer fires an interrupt signal at regular intervals, like a tiny alarm clock. Each time the alarm goes off, it pauses whatever program is currently running and hands control to the operating system’s interrupt handler. The OS then decides whether to let the current program keep running or swap in a different one. Because these interrupts can arrive at any point in a program’s code, no application can prevent the OS from stepping in.
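This timer-driven handoff can be sketched as a toy model in Python (an illustration, not real OS code): each task is a generator, each yield is one unit of work, and the scheduler "interrupts" the running task after a fixed number of ticks whether or not it is finished.

```python
import collections

def task(name, steps):
    """A task modeled as a generator; each yield is one unit of work."""
    for i in range(steps):
        yield f"{name} step {i}"

def run_preemptive(tasks, time_slice):
    """Round-robin scheduler: preempt each task after `time_slice` ticks."""
    queue = collections.deque(tasks)
    trace = []
    while queue:
        current = queue.popleft()       # context switch: load the next task
        for _ in range(time_slice):     # run until the "timer interrupt"
            try:
                trace.append(next(current))
            except StopIteration:
                break                   # task finished early
        else:
            queue.append(current)       # preempted: back of the line
    return trace

trace = run_preemptive([task("A", 3), task("B", 3)], time_slice=2)
# Tasks interleave: A runs two steps, B runs two, then each finishes.
```

The key property of the preemptive model is visible in the loop structure: the scheduler, not the task, decides when the time slice ends.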
Why It Replaced Cooperative Multitasking
Before preemptive multitasking became the norm, most consumer operating systems used cooperative multitasking. In that model, each program was responsible for periodically yielding control back to the OS on its own. If a program was poorly written, or simply busy with a long calculation, it could hog the processor indefinitely. One misbehaving app could freeze your entire computer.
Preemptive multitasking solved this by putting the operating system in charge instead of trusting individual programs. The OS allocates processor time dynamically, so even if one application is doing heavy work, other apps still get their turn. That’s why your music keeps playing while a browser tab loads a complex page. The difference boils down to control: cooperative multitasking depends on every program playing nice, while preemptive multitasking enforces fairness whether programs cooperate or not.
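The cooperative model's weakness is easy to see in a sketch (hypothetical Python, with tasks again modeled as generators): the scheduler below regains control only at a yield, so a task that never yields would spin forever and starve every other task.

```python
import collections

def polite(name, steps, trace):
    """A well-behaved task: yields control after each unit of work."""
    for i in range(steps):
        trace.append(f"{name}{i}")
        yield

def run_cooperative(tasks):
    """Cooperative scheduler: only regains control when a task yields.
    A task stuck in a loop that never yields would hang the whole system,
    because nothing here can interrupt it."""
    queue = collections.deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)         # runs until the task *chooses* to yield
            queue.append(current)
        except StopIteration:
            pass                  # task finished

trace = []
run_cooperative([polite("A", 2, trace), polite("B", 2, trace)])
```

With well-behaved tasks the interleaving looks identical to preemption; the difference only shows up when a task misbehaves, which is exactly the case the cooperative model cannot handle.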
When It Became the Standard
Preemptive multitasking is older than most people realize. It appeared in mainframe and research systems in the early 1960s: MIT’s Compatible Time-Sharing System, first demonstrated in 1961, used it, as did Multics later that decade. Unix adopted it from its beginnings in 1969, and it has been a core feature of every Unix-like system since, including Linux, macOS, and the BSD family.
For home users, the timeline was slower. Microware’s OS-9, available for the TRS-80 Color Computer in the early 1980s, was likely the first preemptive multitasking OS aimed at consumers. The Commodore Amiga followed in 1985, combining preemptive multitasking with multimedia in a way that felt years ahead of its time. Microsoft brought preemptive multitasking to the mainstream with Windows NT 3.1 in 1993 and then Windows 95 in 1995, which preemptively multitasked 32-bit applications. Apple completed the transition in 2001 when it replaced the cooperative classic Mac OS with Mac OS X, built on technology from NeXTSTEP.
How Linux Decides What Runs Next
For most of the past two decades, Linux used a scheduler called CFS, or Completely Fair Scheduler (succeeded in kernel 6.6 in 2023 by EEVDF, which keeps the same core accounting), which tries to model an ideal processor that could run every program simultaneously. It does this by tracking a value for each running program called “virtual runtime,” which represents how much processor time that program has consumed relative to all other programs waiting to run. The scheduler always picks the program with the lowest virtual runtime, meaning the one that has received the least attention so far.
CFS organizes all runnable programs in a red-black tree (a self-balancing sorted tree), ordered by virtual runtime. Each time a program runs for a bit, its virtual runtime increases. Once it climbs high enough that another program now has the lowest value, the scheduler preempts the current program and switches to the new one. Virtual runtime is tracked at nanosecond precision, so processor access can be distributed very evenly across programs. Programs can also be assigned different weights (through the “nice” priority system), so higher-priority tasks accumulate virtual runtime more slowly and get more total processor time.
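The core idea can be sketched in a few lines of illustrative Python (a toy model, not the kernel's implementation, which uses a red-black tree and per-nanosecond accounting): always run the task with the lowest virtual runtime, and advance that runtime inversely to the task's weight.

```python
import heapq

def schedule_cfs(tasks, total_ticks):
    """Toy CFS: always run the task with the lowest virtual runtime.
    `tasks` maps name -> weight; a heavier weight makes vruntime grow
    more slowly, so the task gets more processor time overall."""
    # (vruntime, name) pairs; a heap gives us the minimum cheaply
    heap = [(0.0, name) for name in tasks]
    heapq.heapify(heap)
    ran = []
    for _ in range(total_ticks):
        vruntime, name = heapq.heappop(heap)   # least-served task so far
        ran.append(name)
        # one tick of real time advances vruntime inversely to weight
        heapq.heappush(heap, (vruntime + 1.0 / tasks[name], name))
    return ran

# A task with weight 2 should get roughly twice as many ticks as weight 1.
ran = schedule_cfs({"hi": 2, "lo": 1}, total_ticks=9)
```

Running this for 9 ticks gives the weight-2 task twice as many turns as the weight-1 task, which is exactly the proportional fairness the weighting is meant to enforce.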
The Cost of Switching
Every context switch has a small cost. The OS has to save the current program’s state (its position in the code, the values it was working with, its place in memory) and then load all of that information for the next program. The direct cost is typically on the order of a few microseconds on modern hardware. That sounds trivial, but when the system is switching thousands of times per second, the overhead adds up.
There’s also an invisible cost: when a new program starts running, the processor’s fast-access memory caches are still full of data from the previous program. The new program has to wait while its own data gets loaded into the cache, which briefly slows it down. This is one reason schedulers don’t switch too aggressively. A typical time slice of 10 to 100 milliseconds is a balance between keeping the system responsive (shorter slices) and minimizing switching overhead (longer slices). Linux even allows time slices up to 3,200 milliseconds for certain workloads where throughput matters more than responsiveness.
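The tradeoff between slice length and switching overhead is simple arithmetic. Assuming exactly one context switch per time slice (a simplification), the fraction of processor time lost is the switch cost divided by the total of slice plus switch cost:

```python
def overhead_fraction(time_slice, switch_cost):
    """Fraction of processor time lost to context switching, assuming
    exactly one switch per time slice (both in the same time unit)."""
    return switch_cost / (time_slice + switch_cost)

# Hypothetical numbers for illustration: the same switch cost consumes
# roughly ten times more of a slice that is ten times shorter.
short = overhead_fraction(10, 1)    # 1/11, about 9% lost to switching
long_ = overhead_fraction(100, 1)   # 1/101, under 1% lost
```

This is why shorter slices buy responsiveness at the price of throughput: the fixed per-switch cost is amortized over less useful work.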
Preemption in Real-Time Systems
General-purpose operating systems like Windows and Linux use preemptive multitasking primarily for fairness: making sure every program gets a reasonable share of processor time so the system feels smooth. Real-time operating systems (RTOS), used in things like medical devices, industrial robots, and aircraft control systems, use preemption for a completely different reason: meeting deadlines.
In a real-time system, fairness is not the goal. Each task is assigned a fixed priority, and the scheduler always runs the highest-priority task that’s ready. A lower-priority task will be preempted the instant a higher-priority task needs the processor, regardless of how long it has been running. This guarantees that time-critical operations (reading a sensor, firing a brake actuator) happen within strict time limits. The tradeoff is that low-priority tasks might get very little processor time if high-priority tasks are busy, which would feel terrible on a desktop but is exactly the right behavior when a missed deadline could mean a system failure.
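This fixed-priority policy is small enough to simulate (a hypothetical sketch; a real RTOS makes this decision inside interrupt handlers, not a loop): at every tick, run the highest-priority ready task, preempting whatever was running.

```python
def run_fixed_priority(tasks, total_ticks):
    """Toy RTOS scheduler: each tick, run the highest-priority ready
    task, preempting anything lower. `tasks` is a list of dicts with
    'name', 'priority' (higher = more urgent), 'arrival' tick, and
    'work' (ticks of processing still needed)."""
    timeline = []
    for tick in range(total_ticks):
        ready = [t for t in tasks if t["arrival"] <= tick and t["work"] > 0]
        if not ready:
            timeline.append("idle")
            continue
        current = max(ready, key=lambda t: t["priority"])
        current["work"] -= 1
        timeline.append(current["name"])
    return timeline

# The low-priority logger is preempted the instant the sensor task
# becomes ready, and only resumes once the sensor work is done.
timeline = run_fixed_priority(
    [{"name": "logger", "priority": 1, "arrival": 0, "work": 4},
     {"name": "sensor", "priority": 9, "arrival": 2, "work": 2}],
    total_ticks=6)
```

Note the contrast with the CFS sketch earlier: there is no fairness accounting at all, only priority, which is precisely the behavior a deadline-driven system wants.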

