Consistent hashing is a technique for distributing data across multiple servers so that adding or removing a server only requires moving a small fraction of the data, rather than reshuffling everything. It was originally developed in the late 1990s to solve web caching problems and has since become a foundational building block in distributed systems like DynamoDB, Cassandra, and content delivery networks.
The Problem It Solves
The simplest way to spread data across servers is modular hashing: take a key, hash it to a number, and use modulo to pick a server. If you have 4 servers, key hash 17 maps to server 1 (17 mod 4 = 1). This works fine until you add or remove a server. Change from 4 servers to 5, and suddenly almost every key maps to a different server. In a cache, that means nearly all cached data becomes invalid at once. In a database, it means massive data migration.
Consistent hashing fixes this by ensuring that when the number of servers changes, only about 1/n of the keys need to move (where n is the number of servers). The rest stay exactly where they are.
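The resharding problem is easy to demonstrate. A minimal Python sketch (the key count and choice of md5 as the hash are arbitrary assumptions for illustration):

```python
import hashlib

def bucket(key: str, n_servers: int) -> int:
    """Modular hashing: hash the key, then take it modulo the server count."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n_servers

keys = [f"key-{i}" for i in range(10_000)]
before = {k: bucket(k, 4) for k in keys}   # 4 servers
after = {k: bucket(k, 5) for k in keys}    # add a 5th server
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys changed servers")
```

A key keeps its server only when its hash gives the same remainder mod 4 and mod 5, which happens only for residues 0–3 mod 20, so about 80% of keys move. Consistent hashing would move roughly 1/5 of them instead.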
How the Hash Ring Works
Picture a circle (often called a “ring”) representing the full range of possible hash values, with the maximum value wrapping back around to 0. Both servers and data keys get hashed onto positions on this ring using the same hash function. To figure out which server owns a particular key, you start at the key’s position on the ring and walk clockwise until you hit a server. That server is responsible for that key.
When a new server joins, it lands at a position on the ring and takes responsibility for the keys between it and the next server counterclockwise. Only those keys move. Every other key stays with its current server. When a server leaves, its keys simply shift clockwise to the next server on the ring. No global reshuffling happens in either case.
Mapping a key to its server requires searching through the sorted list of server positions on the ring, which takes O(log n) time, where n is the number of servers. That’s fast enough for most real-world systems.
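The lookup described above can be sketched with a sorted list and binary search. A minimal Python version (server names and the md5 hash are illustrative assumptions, not any particular system's choices):

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    # Any stable, well-spread hash works; md5 is used here for illustration.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    """Minimal consistent-hash ring with one position per server."""

    def __init__(self, servers):
        self._ring = sorted((_hash(s), s) for s in servers)

    def add(self, server):
        bisect.insort(self._ring, (_hash(server), server))

    def remove(self, server):
        self._ring.remove((_hash(server), server))

    def lookup(self, key: str) -> str:
        # Walk clockwise: find the first server position at or past the
        # key's hash, wrapping to the start if we run off the end.
        i = bisect.bisect(self._ring, (_hash(key),))
        return self._ring[i % len(self._ring)][1]

ring = HashRing(["a", "b", "c", "d"])
owner_before = {k: ring.lookup(k) for k in map(str, range(1000))}
ring.add("e")
owner_after = {k: ring.lookup(k) for k in map(str, range(1000))}
moved = sum(owner_before[k] != owner_after[k] for k in owner_before)
print(f"{moved} of 1000 keys moved")  # roughly 1/5 of keys, on average
```

Adding server "e" reassigns only the keys in the arc it lands on; every other key keeps its owner, which is the property the ring exists to provide.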
Why Virtual Nodes Matter
Basic consistent hashing has a practical problem: with only a few servers on the ring, the gaps between them are uneven. One server might end up responsible for a huge arc of the ring while another covers a tiny sliver. The result is lopsided load distribution.
Virtual nodes solve this by giving each physical server many positions on the ring instead of just one. Instead of placing “Server A” at one point, you place “Server A, copy 1” through “Server A, copy 100” at 100 different positions spread around the ring. Each physical server owns many small, non-contiguous ranges rather than one large contiguous range. The more virtual nodes per server, the more evenly the load spreads out.
The numbers tell the story clearly. With 100 virtual nodes per server, the standard deviation of load across servers is about 10%, and individual servers can end up handling anywhere from 76% to 128% of their fair share at the 99th percentile. That’s a wide spread for capacity planning. Bump it up to 1,000 virtual nodes per server, and the standard deviation drops to roughly 3.2%, with servers handling between 92% and 109% of their fair share. The tradeoff is memory: more virtual nodes means more positions to store and search through.
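The effect is easy to measure empirically. A sketch that builds rings with different virtual-node counts and compares the relative spread of keys per server (the fleet size, key count, and md5 hash are arbitrary assumptions, so the exact percentages will differ from the figures cited above):

```python
import bisect
import hashlib
import statistics

def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

def build_ring(servers, vnodes):
    # Place each server at many positions: "s0#0", "s0#1", ...
    return sorted((_hash(f"{s}#{i}"), s) for s in servers for i in range(vnodes))

def lookup(ring, key):
    i = bisect.bisect(ring, (_hash(key),))
    return ring[i % len(ring)][1]

servers = [f"s{i}" for i in range(10)]
spreads = {}
for vnodes in (1, 100, 1000):
    ring = build_ring(servers, vnodes)
    counts = dict.fromkeys(servers, 0)
    for k in range(20_000):
        counts[lookup(ring, f"key-{k}")] += 1
    # Relative standard deviation of keys per server: lower is more even.
    spreads[vnodes] = statistics.pstdev(list(counts.values())) / statistics.mean(counts.values())
    print(f"{vnodes:>4} vnodes: load stddev {spreads[vnodes]:.1%}")
```

The imbalance shrinks as virtual nodes are added, at the cost of a ring with ten or a hundred times as many positions to store and search.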
Two-Step Mapping in Practice
Some production systems use a variation where the ring isn’t mapped directly to physical servers. Instead, the hash space is divided into a fixed, large number of logical buckets (tens of thousands, for example) that are then explicitly mapped to a much smaller number of physical servers through a lookup table. Finding a key’s server becomes a two-step process: first, hash the key to find its logical bucket using simple modular arithmetic, then look up which physical server owns that bucket.
This approach makes rebalancing more controllable. When you add a physical server, you reassign specific logical buckets to it by updating the mapping table. You decide exactly which buckets move and when, rather than relying on where a hash function happens to place things on a ring. AWS describes this pattern in its cell-based architecture guidance, and it’s a common approach in systems that need fine-grained control over data placement.
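The two-step scheme can be sketched in a few lines of Python (the bucket count, server names, and round-robin initial assignment are illustrative assumptions, not any particular system's defaults):

```python
import hashlib

NUM_BUCKETS = 16_384  # fixed, large logical bucket count (assumed for the demo)

def bucket_for(key: str) -> int:
    """Step 1: hash the key to a logical bucket with simple modular arithmetic."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % NUM_BUCKETS

# Step 2: an explicit bucket -> server lookup table, here assigning
# buckets round-robin across three physical servers.
servers = ["server-a", "server-b", "server-c"]
bucket_map = {b: servers[b % len(servers)] for b in range(NUM_BUCKETS)}

def server_for(key: str) -> str:
    return bucket_map[bucket_for(key)]

# Rebalancing is an explicit table update: hand chosen buckets to a
# new server, and only the keys in those buckets migrate.
for b in range(0, NUM_BUCKETS, 4):
    bucket_map[b] = "server-d"
```

Because the bucket count never changes, `bucket_for` stays stable for the life of the system; all rebalancing decisions live in the mapping table, where operators can see and control them.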
Where Consistent Hashing Is Used
Amazon’s DynamoDB uses partition key hashing to determine which partition stores a given item. When you write an item, DynamoDB feeds the partition key into an internal hash function, and the output determines the target partition. As tables grow, DynamoDB automatically splits partitions when they fill to capacity or when throughput demands exceed what existing partitions can handle. This partition management happens in the background with no action required from the application.
Apache Cassandra uses virtual nodes as a core part of its architecture. Each node in a Cassandra cluster owns a large number of small partition ranges distributed throughout the ring, and the placement of each row is determined by hashing its partition key. This design simplifies operations because you don’t need to manually calculate and assign token ranges when adding nodes to the cluster.
Content delivery networks were actually the original motivation. The technique grew out of research at MIT in the late 1990s, where Daniel Lewin’s thesis explored consistent hashing and random trees as algorithms for caching in distributed networks. The core insight was that web caches needed a way to agree on which cache server should hold a given URL without requiring every cache to communicate with every other cache, and without breaking down when caches joined or left the network. That same insight applies to virtually every distributed system built since.
Tradeoffs and Limitations
Consistent hashing minimizes data movement, but it doesn’t eliminate it. Adding a server still requires migrating data, and during that migration, the system needs to handle requests for keys that are in transit. Most production systems solve this with temporary forwarding or read-from-both strategies, but it adds complexity.
Load is never perfectly balanced, even with virtual nodes. You’re always choosing a point on the spectrum between “fewer virtual nodes, less memory, more imbalance” and “more virtual nodes, more memory, better balance.” For systems with strict latency requirements, even a 10% load imbalance can be the difference between meeting and missing targets on your busiest servers.
Standard consistent hashing implementations also require updating their internal data structures whenever nodes are added or removed. The ring’s sorted list of positions needs to be modified, which temporarily affects system availability during reconfiguration. Newer algorithms have been proposed that avoid dynamic data structures entirely, trading different properties to eliminate that reconfiguration cost.
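One published example of this approach is jump consistent hashing (Lamping and Veach, 2014), which computes a key's bucket with a short loop and stores no ring at all. A Python sketch of that algorithm:

```python
def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: maps a 64-bit integer key to a bucket in
    [0, num_buckets) with no stored state. When num_buckets grows by one,
    each key either keeps its bucket or moves to the new one."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step (masked to emulate C overflow).
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b
```

The different property it trades away: buckets can only be added or removed at the end of the numbered range, so it suits backends with numbered shards rather than arbitrary named servers joining and leaving.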