HashMap HashSet

Imagine you are looking for a specific book in a massive library with millions of volumes. If the books were scattered randomly, you’d have to check every single one (an $O(N)$ operation). But what if the librarian had a magical index card catalog? You give them a book title, and they instantly point you to the exact shelf and position ($O(1)$).

That “magical catalog” is a Hash Table.

If Arrays are the “bread” of programming, HashMaps are the “butter”. They are the default choice for storing data when you need incredibly fast lookups.

The Two Flavors

  1. HashMap (Dictionary in Python): Stores Key → Value pairs.
    • Keys must be unique.
    • Values can be duplicated.
    • Think: A phone book (Name → Number).
  2. HashSet (Set in Python): Stores Key only.
    • Keys must be unique.
    • Think: A guest list (Names only, no duplicates allowed).

Interactive Playground

Experiment with how a Map handles data. Notice what happens when you add a duplicate key!

HashMap Simulator

Ready.
KEY
VALUE
Map is empty

Implementation Details

Under the hood, both Python’s dict and Java’s HashMap use a Hash Table.

Map Example

Maps are highly optimized hash tables.

Java
Go
```java // 1. Create Map<String, Integer> grades = new HashMap<>(); // 2. Add (Put) - O(1) grades.put("Alice", 95); grades.put("Bob", 82); // 3. Access (Get) - O(1) System.out.println(grades.get("Alice")); // 95 // 4. Check Key - O(1) if (!grades.containsKey("Charlie")) { System.out.println("Charlie is missing"); } // 5. Remove - O(1) grades.remove("Bob"); ```
```go // 1. Create grades := make(map[string]int) // 2. Add (Put) - O(1) grades["Alice"] = 95 grades["Bob"] = 82 // 3. Access (Get) - O(1) fmt.Println(grades["Alice"]) // 95 // 4. Check Key - O(1) if _, exists := grades["Charlie"]; !exists { fmt.Println("Charlie is missing") } // 5. Remove - O(1) delete(grades, "Bob") ```

Set Example

Use a set when you only care about uniqueness, not the associated value.

Java
Go
```java // 1. Create Set uniqueIds = new HashSet<>(); // 2. Add - O(1) uniqueIds.add(101); uniqueIds.add(102); uniqueIds.add(101); // Duplicate ignored! System.out.println(uniqueIds.size()); // Output: 2 // 3. Check Membership - O(1) if (uniqueIds.contains(101)) { System.out.println("ID 101 exists"); } ``` </div>
```go // 1. Create (Go uses map with empty struct for sets) uniqueIds := make(map[int]struct{}) // 2. Add - O(1) uniqueIds[101] = struct{}{} uniqueIds[102] = struct{}{} uniqueIds[101] = struct{}{} // Duplicate ignored! fmt.Println(len(uniqueIds)) // Output: 2 // 3. Check Membership - O(1) if _, exists := uniqueIds[101]; exists { fmt.Println("ID 101 exists") } ```
</div> ## Amortized Complexity & Hash Collisions Under the hood, a Hash Table works by passing a key through a **Hash Function**, which spits out an integer index. ### Hash Collisions What happens if two different keys (like "Alice" and "Bob") hash to the exact same index? This is called a **Hash Collision**. There are two main ways to handle this: 1. **Chaining (Most Common)**: Instead of storing just one value at an index, we store a Linked List of values. If a collision occurs, we just append the new value to the list. When looking up, we hash the key, go to the index, and traverse the short list to find the exact key. 2. **Open Addressing**: If the target index is full, we simply look for the next available empty slot (e.g., index + 1, index + 2). ### Amortized $O(1)$ Why do we say Maps have **Amortized $O(1)$** lookup? Occasionally, the hash table fills up. When the **Load Factor** (items / capacity) exceeds a threshold (usually 0.75), the table must **resize** (usually double in size). * **Resizing**: Takes **$O(N)$** time because every item must be re-hashed to a new index. * **Amortized**: Since resizing happens rarely (exponentially less often), the average cost per operation remains **$O(1)$**. ### Common Pitfall: Mutable Keys > [!WARNING] > Never use **mutable objects** (like Arrays or Lists) as keys in a HashMap or HashSet. If you change the object after putting it in the Map, its hash code changes. The Map will try to look for it in the wrong bucket, and the object will be "lost" forever inside the data structure! Always use immutable types (Strings, Integers, Tuples) as keys. > [!IMPORTANT] > Always prefer `HashMap`/`HashSet` over `List` for lookups. Searching a list is **$O(N)$**. Searching a map is **$O(1)$**. This is the single most common optimization pattern in algorithmic interviews. ---