Java ConcurrentHashMap implementation
java.util.concurrent.ConcurrentHashMap is a highly concurrent, thread-safe hash table implementation in the Java Collections Framework. It allows full concurrency of retrievals and high expected concurrency for updates. This document details its internal architecture, node hierarchy, concurrency design, multi-threaded cooperative resizing mechanism, and size-counting algorithms based on the OpenJDK source code.
1. Architecture and Internal Data Structures
ConcurrentHashMap is structured as a binned hash table where each bin contains a singly-linked list of nodes or a balanced Red-Black Tree. It uses volatile variables, Compare-And-Swap (CAS) operations, and fine-grained monitor locking to achieve thread safety.
1.1 Node Class Hierarchy
All entries in the table are subclasses of the base Node<K,V> class. Unlike HashMap which uses a deeply nested inheritance model via LinkedHashMap, ConcurrentHashMap maintains a flat node hierarchy where specialized nodes inherit directly from Node<K,V>.
@startuml
interface "Map.Entry<K, V>" as Entry
class "ConcurrentHashMap.Node<K, V>" as Node implements Entry {
~ int hash
~ K key
~ volatile V val
~ volatile Node<K, V> next
}
class "ConcurrentHashMap.TreeNode<K, V>" as TreeNode extends Node {
~ TreeNode<K, V> parent
~ TreeNode<K, V> left
~ TreeNode<K, V> right
~ TreeNode<K, V> prev
~ boolean red
}
class "ConcurrentHashMap.TreeBin<K, V>" as TreeBin extends Node {
~ TreeNode<K, V> root
~ volatile TreeNode<K, V> first
~ volatile Thread waiter
~ volatile int lockState
}
class "ConcurrentHashMap.ForwardingNode<K, V>" as ForwardingNode extends Node {
~ Node<K, V>[] nextTable
}
class "ConcurrentHashMap.ReservationNode<K, V>" as ReservationNode extends Node {
}
class "ConcurrentHashMap<K, V>" as CHM {
- volatile Node<K,V>[] table
- volatile Node<K,V>[] nextTable
- volatile long baseCount
- volatile int sizeCtl
- volatile int transferIndex
- volatile CounterCell[] counterCells
}
CHM o-- Node : contains table of
@enduml
Node Types and Role Encodings
Specialized nodes are identified by negative values in their hash field, which are reserved for control purposes:
Node<K,V>: The standard node containing user key-value pairs (hash $\ge 0$).ForwardingNode<K,V>: Placed at the head of a bin during resizing (hash =MOVED=-1). It contains a reference to the new table array (nextTable).TreeBin<K,V>: Serves as the bin head and holds the root of a Red-Black Tree ofTreeNodes (hash =TREEBIN=-2).ReservationNode<K,V>: A transient placeholder node used during compute operations likecomputeIfAbsentto reserve the slot before the value is established (hash =RESERVED=-3).
1.2 Memory Layout and Bin Representation
The following diagram illustrates the structural layout of a ConcurrentHashMap with various bin states:
@startuml
skinparam handwritten false
skinparam monochrome false
object "ConcurrentHashMap" as chm {
sizeCtl = 12
baseCount = 5
}
map "table (Node<K,V>[])" as table {
0 => Node 1
1 => ForwardingNode
2 => TreeBin
3 => null
... => ...
}
chm --> table
package "Bin 0: Singly-Linked List" {
object "Node 1" as n1 {
hash = 96
key = "KeyA"
val = "ValA"
}
object "Node 2" as n2 {
hash = 112
key = "KeyB"
val = "ValB"
next = null
}
n1 --> n2 : next
}
package "Bin 1: Forwarded Slot (Resizing)" {
object "ForwardingNode" as fwd {
hash = -1 (MOVED)
key = null
val = null
nextTable = Node[] (New Table)
}
}
package "Bin 2: Treeified Slot (TreeBin & TreeNodes)" {
object "TreeBin (Bin Head)" as tb {
hash = -2 (TREEBIN)
root = TreeNode (Root)
first = TreeNode (Head)
}
object "TreeNode (Root)" as tr {
red = false
key = "KeyC"
}
object "TreeNode (Left)" as tl {
red = true
key = "KeyD"
}
tb --> tr : root
tr --> tl : left
}
table::0 --> n1
table::1 --> fwd
table::2 --> tb
@enduml
2. Concurrency Design
ConcurrentHashMap achieves high concurrency by avoiding table-wide locks. It separates read operations from write operations and applies locking at the individual bin level.
2.1 Lock-Free Read Operations (get)
Retrieval operations do not block and run completely lock-free.
- The
tablearray reference and thenextandvalfields of everyNodeare declaredvolatile. This guarantees that updates made by one thread are immediately visible to reader threads. - If a reader thread encounters a
ForwardingNode(hash =-1), the read is redirected to the new table array referenced by the node, ensuring seamless reads even during active resizing. - If a reader thread accesses a treeified bin, it can traverse the bin using the sequential
nextpointers ofTreeNodewithout needing to acquire a tree-lock.
2.2 Fine-Grained Write Operations (putVal)
Write operations (put, remove, replace) use a combination of CAS and localized monitor synchronization:
- CAS for Empty Bins:
- If the target bin index is empty (
tabAt(tab, i) == null), the thread attempts to insert the new node using a Compare-And-Swap (CAS) operation:if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value))) break; // Node inserted without acquiring a lock - If the CAS succeeds, no lock is acquired, eliminating synchronization overhead for empty slots.
- If the target bin index is empty (
- Synchronized Bin Heads:
- If the bin is not empty, the thread locks the first node of the bin (
f) using Java’s built-in monitor synchronization:synchronized (f) { if (tabAt(tab, i) == f) { // Double-check to ensure head node hasn't changed // Perform insertion, update, or treeification } } - Only threads writing to the same bucket will contend for the lock. Threads writing to different buckets execute in parallel.
- The double-check
tabAt(tab, i) == fensures that the head node has not been removed, treeified, or relocated by a concurrent resize before the lock was acquired.
- If the bin is not empty, the thread locks the first node of the bin (
2.3 Prohibition of Null Keys and Values
Unlike HashMap, ConcurrentHashMap does not permit null keys or values.
In a concurrent map, allowing null values introduces an ambiguity: map.get(key) -> null could mean that the key maps to null, or that the key is not present in the map. In a single-threaded map, this is resolved using map.containsKey(key). In a concurrent map, the state of the map can change between the calls to get() and containsKey(), creating race conditions. To prevent this ambiguity, null keys and values are strictly prohibited.
2.4 TreeBin LockState and Non-Blocking Reads
When a bucket is treeified, the bin head is replaced by a TreeBin node, which acts as the root of a balanced Red-Black Tree. To coordinate concurrent reads and writes on the tree without blocking, TreeBin maintains an internal lockState field:
WRITER = 1: Set when a thread is writing to or restructuring the tree (holding a write lock).WAITER = 2: Set when a thread is waiting to acquire the write lock.READER = 4: Used as a unit of increment/decrement to track active reader threads.
1. Reader Counter Tracking Mechanics
When a thread calls find(h, k) on a TreeBin:
- Acquisition: It checks if there are active writers or waiters (
(lockState & (WAITER|WRITER)) != 0). If the tree is stable, the thread attempts to increment the reader count byREADERusing CAS:If successful, it traverses the Red-Black Tree usingU.compareAndSetInt(this, LOCKSTATE, s, s + READER)findTreeNode(h, k). - Release: Once the read is complete, the thread decrements the reader count in the
finallyblock using an atomic add:If this thread was the last reader and a writer thread is waiting (U.getAndAddInt(this, LOCKSTATE, -READER)lockStatebecomesWAITER), it callsLockSupport.unpark(waiter)to wake up the waiting writer.
2. Rationale for Thread-Safe Concurrent Reads
This atomic increment and decrement of the reader count does not cause problems or bottlenecks in concurrent multiple reads:
- No Mutual Reader Blockage: Since
READERis represented by bit-fields aboveWRITERandWAITER(value 4, or0b100), multiple reader threads can increment the lock state concurrently without interfering with each other. If five threads read at the same time,lockStatesimply becomes20(5 * READER). They all traverse the tree in parallel. - Optimistic CAS Loop: The acquisition uses a CAS spin-loop. If multiple threads attempt to increment
lockStateconcurrently, the ones that fail the CAS will simply loop, read the updated state, and retry immediately. Since read registration is a simple, fast register update, spin contention is extremely short-lived. - Atomic Release: The release phase uses
U.getAndAddInt, which translates to a hardware-level atomic instruction (such asLOCK XADDon x86). This instruction never fails and is guaranteed by the hardware to update the count thread-safely without locks. - Zero-Blocking Fallback (Singly-Linked List Traversal): If a reader thread detects that a writer is active or waiting (
(lockState & (WAITER|WRITER)) != 0), or if it repeatedly fails to acquire the read lock due to extreme contention, it does not block or spin. Instead, it immediately bypasses the tree structure and traverses the bin using the sequential singly-linked list (nextpointers) maintained by theTreeNodeinstances. This fallback guarantees that reader threads never block under any circumstance, preserving the non-blocking read contract ofConcurrentHashMap.
2.5 The Single-Writer Constraint and the Waiter Field
A key observation of the TreeBin class is that the waiter field is a single volatile Thread reference, rather than a queue or list of waiting threads. This implies that at most one thread can ever be waiting to acquire the tree write lock at any given time.
This design is correct because of a fundamental architectural constraint in ConcurrentHashMap: the Single-Writer Constraint.
1. Rationale for a Single Waiter
Although ConcurrentHashMap allows lock-free reads, all write operations (such as insertion, deletion, and treeification) are serialized at the bin level using monitor synchronization.
Before any thread can call putTreeVal or removeTreeNode to write to a TreeBin, it must first acquire the Java monitor lock on the TreeBin instance (f) itself:
synchronized (f) {
if (tabAt(tab, i) == f) {
// Delegate to putTreeVal or removeTreeNode
}
}
Because of this outer synchronized block:
- At most one writer thread can ever execute code inside a specific
TreeBinat any time. - Consequently, there is never any contention between multiple writers trying to write to the same
TreeBinsimultaneously. - The only contention a writer thread faces is with active reader threads that are currently traversing the tree.
2. The Locking Flow for the Single Writer
When the single writer thread enters putTreeVal or removeTreeNode, it must restructure the tree. To ensure reader threads do not read an unstable, partially restructured tree, the writer must wait for all active readers to exit.
- The writer calls
lockRoot():- It attempts to CAS
lockStatefrom0toWRITER(1). - If there are active readers,
lockStateis greater than 0 (a multiple ofREADER= 4). The CAS fails.
- It attempts to CAS
- The writer delegates to
contendedLock():- Since this is the only writer thread allowed in the
TreeBinblock, it is the only thread that will ever executecontendedLock(). - It sets the
WAITERbit (2) inlockStateand stores its own thread reference in thewaiterfield:U.compareAndSetReference(this, WAITERTHREAD, null, current); - It then parks itself using
LockSupport.park(this).
- Since this is the only writer thread allowed in the
- Waking the Writer:
- As reader threads finish their traversal, they decrement the reader count in
lockState. - The last reader thread to exit detects that the
WAITERbit is set and wakes up the single waiting writer:if (U.getAndAddInt(this, LOCKSTATE, -READER) == (READER|WAITER) && (w = waiter) != null) LockSupport.unpark(w);
- As reader threads finish their traversal, they decrement the reader count in
Because the outer monitor lock (synchronized(f)) guarantees that there is at most one writer thread per bin, it is physically impossible for multiple threads to wait in contendedLock() at the same time. A single waiter reference is therefore completely sufficient, and no waiting queue is required.
3. Multi-Threaded Cooperative Resizing
Resizing in ConcurrentHashMap is a cooperative, multi-threaded operation. When a thread detects that the map is resizing (by encountering a ForwardingNode), it assists in the transfer of elements rather than stalling.
3.1 Coordination Constants and Fields
sizeCtl: A multi-use control field:-1: Represents active table initialization.- Less than
-1: Represents active resizing. The value encodes a generation stamp in the higher 16 bits and the number of active helper threads in the lower 16 bits. - Positive value: Represents the next resize threshold (capacity $\times$ load factor).
transferIndex: Tracks the next block of bins to be migrated, starting from the old table capacityndown to0.
3.2 Cooperative Migration (Stride)
To avoid contention among threads helping with a resize, the migration work is divided into chunks called strides.
Old Table:
[ Bin N-1 ] ... [ Bin 0 ]
◄─────────────────────── (transferIndex moves from N down to 0)
Thread A: Claims [Stride A] | Thread B: Claims [Stride B]
- Claiming a Stride:
- A thread claims a block of bins (with a minimum size of
MIN_TRANSFER_STRIDE = 16) by performing a CAS ontransferIndex:U.compareAndSetInt(this, TRANSFERINDEX, nextIndex, nextBound = (nextIndex > stride ? nextIndex - stride : 0)) - Once claimed, the thread is solely responsible for migrating bins in the range
[nextBound, nextIndex - 1].
- A thread claims a block of bins (with a minimum size of
- Transfer Mechanics:
- For each index
iin its stride:- If the bin is empty, the thread CASes a
ForwardingNode(fwd) into the slot. This blocks new writes at this index; any thread attempting to write will seefwdand join the resizing effort. - If the bin contains elements, the thread locks the bin head (
synchronized(f)) and splits it into low-index (ln) and high-index (hn) lists based on the bitwise check(hash & n) == 0.
- If the bin is empty, the thread CASes a
- For each index
- The “Last Run” Optimization:
- In a linked list bin, many nodes at the tail of the list may share the same destination index.
ConcurrentHashMaptraverses the list to find thelastRunnode—the node after which all subsequent nodes have the identical destination index.- The thread reuses the
lastRunnode and its descendants directly, avoiding node recreation. On average, only one-sixth of the nodes in a bin require allocation during doubling.
- Completion:
- Once a bin’s elements are written to the new table, the old slot is marked with the
ForwardingNode. - When a thread finishes its stride, it decrements the helper count in
sizeCtl. The last thread to finish the migration performs a final sweep and commits the new table.
- Once a bin’s elements are written to the new table, the old slot is marked with the
4. High-Throughput Concurrent Size Counter
In a highly concurrent environment, a single global counter (like a volatile variable updated via CAS) becomes a performance bottleneck due to thread contention. ConcurrentHashMap implements a high-throughput counter based on the design of java.util.concurrent.atomic.LongAdder.
4.1 Counter Representation
The counter consists of two components:
baseCount: A volatile long updated via CAS when there is no contention.counterCells: A volatile array ofCounterCellobjects, where each cell contains a volatilevaluefield. This array is lazily allocated and grown as contention increases.
[ Thread A ] [ Thread B ] [ Thread C ]
│ │ │
(CAS Success) (CAS Failure) (CAS Failure)
▼ │ │
[ baseCount ] ▼ ▼
[ Cell 0 ] [ Cell 1 ] (inside counterCells)
4.2 Size Upate Algorithm (addCount)
- Uncontended Update:
- The thread attempts to update the global
baseCountusing CAS:U.compareAndSetLong(this, BASECOUNT, b = baseCount, s = b + x) - If this succeeds, the operation completes.
- The thread attempts to update the global
- Contended Update (Fallback):
- If
counterCellsis already allocated, or if thebaseCountCAS fails due to contention:- The thread maps to a specific index in the
counterCellsarray using a thread-local random probe:index = ThreadLocalRandom.getProbe() & (counterCells.length - 1) - It attempts to CAS the value of the cell at that index.
- If the cell is null, or if the cell CAS fails, the thread enters
fullAddCount()to initialize the array, allocate new cells, or resize the array.
- The thread maps to a specific index in the
- If
- Summation (
size()):- To calculate the total size,
ConcurrentHashMapsums thebaseCountand the values of all activeCounterCellinstances:final long sumCount() { CounterCell[] cs = counterCells; long sum = baseCount; if (cs != null) { for (CounterCell c : cs) { if (c != null) sum += c.value; } } return sum; } - This design localizes write contention to separate memory locations, ensuring that counting does not bottleneck map updates.
- To calculate the total size,
Summary of State and Concurrency Controls
The concurrent state and locking strategies of ConcurrentHashMap can be summarized by the following operational modes:
| Operation | Target State | Concurrency Control | Complexity |
|---|---|---|---|
get | Any | Lock-Free volatile read / ForwardingNode redirection | $O(1)$ / $O(\log n)$ |
put | Empty slot | Lock-Free CAS on bin index | $O(1)$ |
put | Non-empty bin | Fine-grained synchronized lock on bin head node | $O(1)$ / $O(\log n)$ |
put | Resizing bin | Thread assists with transfer via helpTransfer | $O(1)$ / $O(\log n)$ |
size | N/A | Non-blocking sum of baseCount and counterCells | $O(\text{cells})$ |
Through this multi-layered approach—combining CAS, lock-free reads, synchronized bucket heads, cooperative resizing, and striping-based size counters—ConcurrentHashMap delivers high throughput and thread-safe operations under extreme multi-threaded workloads.