It’s necessary to learn with questions. Otherwise, you may distract your focus when something you don’t know appears, whether should I go deeper or not. With preliminary doubts in your mind, this would not be a problem. Learning can be direct, clear and fruitful, which brings you more sense of achievement.

For this post about the source code of HashMap, few questions need answers:

  • data structure underneath
  • why HashMap permits nulls and unsynchronized
  • does it make guarantees as to the order of the map
  • what happens when I put and get a key with value, and time complexity
  • how the map shrink or expand when size changes

A class diagrams is helpful and lists firstly.

Storage Structure

As the following code shows, key-values are stored in an array, which can be random access with index.

// HashMap.java

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

    transient Node<K,V>[] table;
    transient int size;
    final float loadFactor;
    int threshold;
}

Other member variables mean:

  • size: the number of key-value mappings contained in this map.
  • loadFactor: the load factor for the hash table.
  • threshold: the next size value at which to resize (same as capacity * load factor).

The Node is a static and nested class in HashMap. It’s basic hash bin node, used for most entries. For briefness, I assume Node is used for all entries, but remember TreeNode is the other option.

// HashMap.java

static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;
}

key, hash,value is easy to understand, and the next? It is used for handling key conflict.

Let’s see its equals method with compare order:

  • address
  • instance identity
  • equals from Objects
// HashMap.java

static class Node<K,V> implements Map.Entry<K,V> {
    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

put

put associates the specified value with the specified key in this map. The old value is replaced if the key is already contained in the map.

// HashMap.java

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

Pseudocode that summarizes steps of put lists here:

  1. resize the map if table is null or size is 0
  2. new a node if the key does not exist, and insert it into table
  3. if the hash exists, append to the end of list
  4. resize if threshold is exceeded

resize

As we can see from putting a new key-value into HashMap, it is possible to trigger resize, which would double table size. The elements from each bin must either stay at same index, or move with a power of two offset in the new table.

// HashMap.java

Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;

for (int j = 0; j < oldCap; ++j) {
    Node<K,V> e;
    if ((e = oldTab[j]) != null) {
        oldTab[j] = null;
        if (e.next == null)
            newTab[e.hash & (newCap - 1)] = e;
        else if (e instanceof TreeNode)
            ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
        else { // preserve order
            Node<K,V> loHead = null, loTail = null;
            Node<K,V> hiHead = null, hiTail = null;
            Node<K,V> next;
            do {
                next = e.next;
                if ((e.hash & oldCap) == 0) {
                    if (loTail == null)
                        loHead = e;
                    else
                        loTail.next = e;
                    loTail = e;
                }
                else {
                    if (hiTail == null)
                        hiHead = e;
                    else
                        hiTail.next = e;
                    hiTail = e;
                }
            } while ((e = next) != null);
            if (loTail != null) {
                loTail.next = null;
                newTab[j] = loHead;
            }
            if (hiTail != null) {
                hiTail.next = null;
                newTab[j + oldCap] = hiHead;
            }
        }
    }
}

Two points should be highlighted:

  • The elements from each bin stay at same index if only (e.hash & oldCap) == 0
  • The order in new table is same as the one in old table.

get

get returns the value to which the specified key is mapped, or null if this map contains no mapping for the key.

// HashMap.java

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

The iteration is comprehensive,

  1. locate the index of table based on hash value
  2. check if first node is eligible
  3. traverse the list and check one by one

Conclusion

Now, we can answer the questions mentioned at the beginning of this post.

data structure underneath

Beneath the HashMap, an array with its elements Node is used for storage, and it is friendly to random access.

why HashMap permits nulls and unsynchronized

When putting a new key-value, newNode is called, which does not check nulls.

After scanning the source code of HashMap, there is no code to deal with synchronization. So it is unsynchronized.

does it make guarantees as to the order of the map

No, resize of the table may distort the order.

what happens when I put and get a key with value, and time complexity

put may lead to resize of the map, so it’s time complexity is O(n). While the element can be located with index, so get costs O(1) for the optimal situation. If key conflicts, it needs to iterate the whole list sharing the same key.

how the map shrink or expand when size changes

The question has already been answered. expand? yes! and shrink? no!