What is consistent hashing?

Consistent hashing is a technique that maps both keys and servers onto the same circular hash space, so each key belongs to the first server found clockwise from its position. When a server joins or leaves, only the keys in that one segment of the ring move. With mod-N hashing, almost every key would move instead.

How many keys move when a node is added or removed?

With consistent hashing, adding or removing one node out of N moves about K/N of the K total keys, the theoretical minimum. With hash mod N, growing from 10 to 11 servers remaps roughly 91% of all keys, because nearly every key gets a new modulo result.

What are virtual nodes in consistent hashing?

Virtual nodes place each physical server at many positions on the ring instead of one. This evens out the random gaps between servers and lets bigger machines take proportionally more positions. Libketama uses 160 points per server, and Cassandra defaults to 16 tokens per node since version 4.0.

Does consistent hashing fix hot keys?

No. Consistent hashing balances the key space, not key popularity. If one key gets 100x the traffic, its owner still takes all of that load. You need a different fix for that, such as replicating hot keys across several nodes or the bounded-loads variant of consistent hashing.

Consistent Hashing

Add an 11th server to a 10-server memcached pool that uses hash(key) % N, and roughly 91% of your keys suddenly live on the wrong server. Your cache hit rate falls off a cliff, and the database behind it takes the full read load until the cache refills. Consistent hashing exists to make that number ~9% instead.

This note covers how the ring works, why virtual nodes exist, what the real alternatives are, and the cases where consistent hashing is the wrong tool.

What problem does consistent hashing solve?

Consistent hashing is a way to assign keys to servers so that adding or removing one server out of N only moves about K/N of the K total keys, instead of almost all of them. The technique comes from a 1997 MIT paper by Karger and others, originally built for web caching at what became Akamai.

The naive approach is hash(key) % N. It distributes keys evenly, but N is baked into every placement decision. Change N and nearly every key gets a different modulo result. Going from 10 to 11 servers remaps about 10/11 of all keys, roughly 91%. For a cache, that's close to a full flush. For a sharded datastore, it's a massive rebalancing storm.

Consistent hashing removes N from the placement math. Keys and servers hash into the same fixed space, and membership changes only disturb the neighborhood of the server that changed.

How does the hash ring actually work?

The hash ring works by mapping both servers and keys onto a circle of hash values, where each key belongs to the first server sitting clockwise from the key's position.

Concretely:

Treat the output range of a hash function (say 0 to 2³² - 1) as a circle, so the maximum value wraps around to 0.
Hash each server's identifier (like 10.0.1.7:6379) onto the circle.
Hash each key onto the same circle.
To find a key's owner, walk clockwise from the key's position. The first server you hit owns the key.

When a server leaves, only the keys it owned move, and they move to the next server clockwise. Every other assignment is untouched. When a server joins, it takes over one slice of the ring from its clockwise neighbor and nothing else changes. That locality is the entire trick.

In code, the ring is a sorted map from hash value to node, and lookup is a binary search. Here's the core of it in Java:

public final class HashRing<N> {
  private final TreeMap<Long, N> ring = new TreeMap<>();
  private final int pointsPerNode;
 
  public HashRing(int pointsPerNode) {
    this.pointsPerNode = pointsPerNode;
  }
 
  public void addNode(N node) {
    for (int i = 0; i < pointsPerNode; i++) {
      ring.put(hash(node + "#" + i), node);
    }
  }
 
  public void removeNode(N node) {
    for (int i = 0; i < pointsPerNode; i++) {
      ring.remove(hash(node + "#" + i));
    }
  }
 
  public N nodeFor(String key) {
    Map.Entry<Long, N> entry = ring.ceilingEntry(hash(key));
    // Past the highest point: wrap around to the start of the ring.
    return (entry != null ? entry : ring.firstEntry()).getValue();
  }
}

Use a real hash function like MurmurHash3 or MD5 here, never Object.hashCode(). You need uniform spread across the whole range, and you need every client to compute identical values. Lookup cost is O(log(N x V)) for N nodes with V points each, which is microseconds at any realistic cluster size.

Why do you need virtual nodes?

Virtual nodes exist because hashing each server to a single point leaves randomly sized gaps on the ring, and the unlucky server behind a big gap owns far more than its fair share of keys. With a handful of nodes at one point each, it's normal for one node to carry several times the load of another.

The fix is to place each physical server at many positions. Libketama, the classic memcached client library, uses 160 points per server. Amazon's Dynamo paper built its whole partitioning scheme on virtual nodes. A few hundred points per server brings the load imbalance down to roughly ±5 to 10 percent, which is usually good enough.

Virtual nodes buy you two more things:

Weighted servers. Give a machine with double the RAM double the points, and it owns about double the key space.
Smoother failure recovery. When a server dies, its key ranges are scattered around the ring, so the load it drops spreads across many survivors instead of crushing one clockwise neighbor.

The cost is a bigger ring (more memory, slightly slower rebuilds) and more bookkeeping during rebalancing. Cassandra shipped with 256 tokens per node for years and dropped the default to 16 in version 4.0 because very high token counts made repairs and streaming slower without improving balance much.

When should you not use consistent hashing?

Skip consistent hashing when your node set is fixed, when a coordinator already tracks placement, or when your real problem is key popularity rather than key placement.

Fixed N. A hard-partitioned system that never resizes online (say, 16 Kafka-style partitions you only change with a planned migration) gets nothing from the ring. hash % N is simpler and perfectly even.
A central directory exists. Systems with an explicit placement service or metadata layer (HDFS NameNode, anything with a shard map in etcd or ZooKeeper) can place data wherever they want and just record it. The ring's value is letting every client compute placement with zero coordination.
Hot keys. Consistent hashing balances the key space, not traffic. One viral key still lands on one owner. Fixes live elsewhere: replicate hot keys, add a request-coalescing layer, or use consistent hashing with bounded loads (a 2016 Google research result, available in HAProxy as hash-balance-factor), which caps any node at a configurable multiple of average load and spills overflow to the next node.
You only need bucket numbers. Jump consistent hash (Lamping and Veach, Google, 2014) maps a key to one of N numbered buckets in O(1) time with zero memory, and moves the minimum number of keys on resize. Its limits: buckets are numbers, not arbitrary servers, and you can only add or remove at the end of the range. Great for sharded storage with a directory in front, wrong for ad-hoc server churn.
Few servers per key decision. Rendezvous (highest-random-weight) hashing from the late 1990s scores every node per key and picks the max. It needs no ring state at all and balances slightly better, but lookup is O(N) per key, so it suits small-to-medium N.

Approach	Keys moved on resize	Lookup	Memory	Best fit
Hash mod N	~(N-1)/N of all keys	O(1)	none	Fixed N, offline resharding acceptable
Ring + vnodes	~K/N (minimum)	O(log(N x V))	O(N x V)	Dynamic membership, weighted nodes
Rendezvous (HRW)	~K/N (minimum)	O(N)	O(N)	Small clusters, no ring state wanted
Jump consistent	~K/N (minimum)	O(1)	none	Numbered buckets, grow/shrink at the end

How do real systems use consistent hashing?

Most large key-value and routing layers you've used run some form of it.

Amazon Dynamo / DynamoDB partitions data with consistent hashing and virtual nodes. The 2007 Dynamo paper is still the best production writeup of the trade-offs.
Apache Cassandra hashes partition keys with Murmur3 onto a token ring. Token ranges per node are exactly the virtual-node idea.
Memcached clients (libmemcached, spymemcached) ship ketama consistent hashing so a pool resize doesn't flush the cache.
Envoy offers ring hash and Maglev as load-balancing policies for sticky routing to backends.
Discord routes guild traffic to servers through a consistent hash ring and open-sourced their ex_hash_ring library.

A pattern worth noticing: databases pair the ring with replication (each key written to the owner plus the next R-1 nodes clockwise), and load balancers pair it with health checks (a dead backend just vanishes from the ring). The ring is the placement primitive, and everything else stacks on top.

For the primary sources, see the original Karger et al. consistent hashing paper, the Amazon Dynamo paper, the jump consistent hash paper, and consistent hashing with bounded loads.

Keep Reading

The One System Design Question That Failed 80% of Candidates in 2025. Sharding and placement decisions like this one are exactly where that interview question goes wrong.
How to Implement the Debezium Outbox Pattern in Spring Boot. Once data is partitioned across services, reliable cross-service events become the next problem.