Contiguous memory is a block of memory made up of sequentially adjacent addresses, forming a single unbroken region. Think of it like a row of numbered mailboxes sitting side by side with no gaps: mailbox 100, 101, 102, 103, and so on. This arrangement is fundamental to how computers store and retrieve data efficiently, and it shows up everywhere from the arrays you use in programming to the way hardware devices move data around.
Why Contiguous Layout Matters
When data sits in a continuous, unbroken stretch of memory, the computer can find any piece of it almost instantly. Arrays are the classic example. Because every element is stored at a predictable offset from the starting address (base address + index × element size), accessing the 500th element takes the same amount of time as accessing the 1st. That’s constant-time access, or O(1) in computer science shorthand. A linked list, by contrast, scatters its elements across memory and chains them together with pointers. To reach the 500th element, the processor has to hop through all 499 elements before it, making access time proportional to the list’s length.
Contiguous memory also plays well with how modern CPUs cache data. When the processor needs a value that isn’t already in its fast local cache, it doesn’t fetch just that single value. It grabs an entire “cache line,” a chunk of nearby memory. If your data is laid out contiguously, those neighboring bytes are likely to be the exact values you need next. This property is called spatial locality: a reference to one memory location strongly predicts that neighboring locations will be referenced soon. The result is fewer cache misses and noticeably faster execution, especially in loops that iterate over large datasets.
How the Operating System Allocates Contiguous Memory
When a program requests a contiguous block, the operating system needs to find a free region large enough to satisfy the request. Two common strategies handle this search differently.
- First-fit: The allocator walks through its list of free blocks and picks the first one that’s big enough. It’s fast because it stops searching as soon as it finds a match, but it can leave awkward leftover gaps near the beginning of memory.
- Best-fit: The allocator checks every free block and picks the one whose size most closely matches the request, leaving the smallest possible leftover. This sounds ideal, but scanning every block takes longer, and the tiny leftover fragments it creates can be too small to ever use.
Neither strategy is universally better. First-fit tends to be faster in practice, while best-fit minimizes wasted space per allocation at the cost of more fragmentation over time.
The Fragmentation Problem
Contiguous allocation’s biggest weakness is external fragmentation. As programs allocate and free memory over time, the free space gets broken into scattered gaps. You might have 200 MB of free memory total, but if it’s split into dozens of small, non-adjacent chunks, a request for a single 100 MB contiguous block will fail. The memory is there, just not in one piece.
One fix is compaction: the operating system pauses running processes and shuffles their memory blocks together, closing the gaps. This works but is expensive. Stopping processes mid-execution isn’t always feasible, especially on systems that need to respond in real time.
Modern operating systems largely sidestep this problem by using paging. Instead of giving each process one big contiguous region of physical memory, the system divides memory into fixed-size pages (typically 4 KB). A process’s memory can be scattered across physical RAM in any order, and a hardware translation table maps those scattered pages into what looks like a contiguous address space from the process’s perspective. Fixed-size pages eliminate external fragmentation entirely, though they introduce a small amount of internal fragmentation: if a process needs 5.1 KB, it gets two 4 KB pages, wasting nearly 3 KB inside the second page.
When Hardware Still Demands Contiguous Memory
Even in a paged system, certain hardware components need physically contiguous memory. Direct Memory Access (DMA) controllers are the most common example. DMA lets a device transfer data directly to or from RAM without involving the CPU on every byte. Many DMA controllers expect a single unbroken stretch of physical addresses to write into, because they don’t go through the system’s memory management unit (the chip that translates virtual addresses to physical ones).
Multimedia hardware like GPUs and video processing units often need large contiguous buffers, sometimes hundreds of megabytes, to hold frame data. These devices frequently bypass the translation hardware entirely, so the memory genuinely has to be adjacent at the physical level, not just virtually.
Some newer DMA controllers support a technique called scatter-gather, which lets them accept a list of separate memory chunks and treat them as one logical buffer. This relaxes the contiguity requirement, but only when the device actually routes its memory accesses through a translation unit. Many embedded and multimedia devices don’t.
How Linux Handles It With CMA
The Linux kernel includes a feature called the Contiguous Memory Allocator (CMA) specifically to solve the problem of getting large contiguous blocks in a paged system. At boot time, CMA reserves a big physically contiguous region of RAM. When a driver or device needs a contiguous buffer, CMA carves it out of this reserved area. When no device needs it, the kernel’s normal memory allocator can use that same region for regular movable allocations, so the memory isn’t wasted while sitting idle.
System administrators can configure the size of the CMA region through kernel command-line arguments, and developers can create custom CMA heaps for specific devices. On platforms like Qualcomm’s Linux distribution, CMA regions need to be aligned to 4 MB boundaries to support page migration. The system exposes these heaps as device files, making it straightforward for user-space programs and drivers to request contiguous buffers when they need them.
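As a concrete example of that configuration, the standard `cma=` kernel boot parameter sets the reserved region's size, and `/proc/meminfo` reports how much of it is in use. The 128 MB figure below is just an illustration; pick a size that matches your devices' buffer needs.

```shell
# Reserve a 128 MB CMA region at boot by appending to the kernel
# command line (exact bootloader syntax varies by platform):
#   cma=128M

# After boot, inspect the CMA region's total and free size:
grep -i '^Cma' /proc/meminfo
# Typically prints CmaTotal and CmaFree lines, in kB.
```

If the reported `CmaFree` regularly approaches zero, the reserved region is too small for the drivers drawing from it.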
Contiguous vs. Non-Contiguous Memory in Practice
For everyday programming, contiguous memory means choosing the right data structure. Arrays, vectors, and similar flat structures store elements side by side and benefit from cache-friendly access patterns. Linked lists, trees, and hash tables scatter nodes across memory, trading spatial locality for flexibility in insertions and deletions. If your workload involves iterating over large collections, contiguous structures will almost always be faster. If your workload involves frequent insertions in the middle of a collection, the overhead of shifting elements in a contiguous block may make a non-contiguous structure the better choice.
At the operating system level, the trade-off is between simplicity and flexibility. Contiguous allocation is conceptually simple and fast for hardware access, but it fragments over time and struggles with dynamic workloads. Paged, non-contiguous allocation adds a layer of translation complexity but handles real-world memory demands far more gracefully. Modern systems use both: paging for general-purpose memory management, with targeted contiguous allocation (through mechanisms like CMA) for the specific hardware that requires it.