For those who has no patience to look through this post, I conclude it in a word: HashSet is backed by HashMap.

Set is a collection in which every element is unique. So does the keys in HashMap. It is reasonable to infer that the implementation of HashSet is based on HashMap, so it is. Remember resize of HashMap? it makes no guarantees as to the iteration order of the set.

With the help of HashMap, HashSet offers constant time performance for the basic operations, like add, remove, contains and size. Iterating over this set requires time proportional to the sum of the HashSet instance’s size (the number of elements) plus the “capacity” of the backing HashMap instance (the number of buckets).

HashMap Underneath

HashSet involves a map variable referring to a HashMap, in which the elements are stored. All operations on HashSet will read or write this map accordingly.

// HashSet.java

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{
    private transient HashMap<E,Object> map;
    public HashSet() {
        map = new HashMap<>();
    }

     public HashSet(int initialCapacity, float loadFactor) {
        map = new HashMap<>(initialCapacity, loadFactor);
    }
}

add

add adds the specified element to this set if not already present.

// HashSet.java

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

For PRESENT, it is a dummy value to associate with an Object in the backing Map.

The details of put method in HashMap lists here: put.

remove

remove removes the specified element from this set if present.

// HashSet.java

public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}

contains

contains inquire whether or not the given element exist in this set.

// HashSet.java

public boolean contains(Object o) {
    return map.containsKey(o);
}

Just like get in HashMap, containsKey also call getNode for further. See getNode.

size

If you want to get the size of HashSet, you actually query how many keys in HashMap.

// HashSet.java

public int size() {
    return map.size();
}

Note that this implementation is not synchronized. If multiple threads access a hash set concurrently, and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the set.