CS205

Load Factor & Rehashing

Watch what happens when a HashMap gets too full and needs to resize.

What is Load Factor?

The load factor measures how full the hash table is:

loadFactor = size / capacity
  • Java's default: 0.75 (75% full triggers resize)
  • Higher load factor: More space efficient, more collisions
  • Lower load factor: Fewer collisions, wastes space
When Does Rehashing Occur?

Rehashing is triggered when:

size > capacity × loadFactor

Example: With capacity=8 and loadFactor=0.75:

threshold = 8 × 0.75 = 6
When size > 6, rehashing occurs
Rehashing Demo
Next keys:applebananacherrydateelderberry...

Click "Add Entry" to start adding items

Size: 0
Capacity: 8
Load Factor: 0.00
Threshold (6)
0 / 8 entries (0.0% full)
6 until rehash
threshold = capacity × loadFactor = 8 × 0.75 = 6
New TableCapacity: 8
IndexEntries
0(empty)
1(empty)
2(empty)
3(empty)
4(empty)
5(empty)
6(empty)
7(empty)
The Rehashing Process
  1. 1

    Create New Array

    A new bucket array is created with double the capacity.

  2. 2

    Recalculate All Indices

    For each entry, compute new index: hashCode % newCapacity. Indices will change because capacity changed!

  3. 3

    Move All Entries

    Each entry is inserted into its new bucket in the larger array.

  4. 4

    Update Threshold

    New threshold = newCapacity × loadFactor. The map can now hold more entries.

Why Do Indices Change?

The bucket index depends on the capacity through the modulo operation. When capacity changes, the index changes too:

Old Capacity = 8

"apple" (hash=93029210) % 8 = 2
"banana" (hash=98715246) % 8 = 6
"cherry" (hash=94615849) % 8 = 1

New Capacity = 16

"apple" (hash=93029210) % 16 = 10
"banana" (hash=98715246) % 16 = 14
"cherry" (hash=94615849) % 16 = 9

Notice how all indices changed! This is why we must recalculate positions for every entry.

Performance Impact of Rehashing

Cost of Rehashing

  • • O(n) time to move all entries
  • • Temporarily doubles memory usage
  • • Can cause latency spikes

Why It's Worth It

  • • Maintains O(1) average operations
  • • Reduces collision chains
  • • Amortized cost is still O(1) per insert

Tip: If you know the approximate size ahead of time, initialize the HashMap with that capacity to avoid rehashing:

Map<String, Integer> map = new HashMap<>(1000);