Join Algorithms
[!NOTE] This module explores Postgres's core join algorithms from first principles, showing how Big-O complexity, available memory (work_mem), and existing indexes dictate the planner's choices.
1. Introduction: The Mathematics of Matching
Imagine you’re managing a massive international library. Someone hands you a list of 10,000 author IDs and asks you to find their corresponding names from a master catalog of 10 million authors. How do you do it? Do you check the master catalog 10,000 times? Do you sort both lists first? Do you group them by country?
This is exactly the problem Postgres faces. When a query requests data from multiple tables, Postgres must determine the most efficient way to match the rows. This is the job of the Join Algorithm. The database essentially runs an algorithm to compute the intersection or combination of two sets.
Postgres implements three primary join algorithms:
- Nested Loop Join
- Hash Join
- Merge Join
The planner chooses the algorithm based on the estimated cost, which depends on table sizes, available memory (work_mem), and existing indexes.
2. Nested Loop Join
The simplest and most fundamental join algorithm.
How it Works
Analogy: Imagine matching your shopping list (Outer) against a grocery store’s aisles (Inner). For every single item on your list, you walk through the entire store looking for it.
It takes two tables (or relations): an Outer relation and an Inner relation.
- For each row in the Outer relation…
- Scan the Inner relation to find matching rows.
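The double loop above can be sketched in a few lines of Python (rows modeled as tuples; all names and data are illustrative, not Postgres internals):

```python
def nested_loop_join(outer, inner, key_outer, key_inner):
    """Naive nested loop join: O(N * M) comparisons."""
    for o in outer:                      # for each outer row...
        for i in inner:                  # ...scan the entire inner relation
            if o[key_outer] == i[key_inner]:
                yield o + i              # emit the combined row

authors = [(1, "Ada"), (2, "Bob")]                           # (id, name)
books = [(10, 1, "Engines"), (11, 2, "Logic"), (12, 1, "Notes")]  # (id, author_id, title)
# join authors.id (column 0) to books.author_id (column 1)
result = list(nested_loop_join(authors, books, 0, 1))
```

Note that the cost is driven entirely by the loop structure: every outer row pays the full cost of scanning the inner relation.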
Complexity
- Time: O(N × M), where N is the number of outer rows and M is the number of inner rows.
- Space: O(1)
Variations
- Nested Loop with Seq Scan: Scans the entire inner table for every outer row. Extremely slow for large tables.
- Index Nested Loop Join: Uses an index on the inner table. Time becomes O(N × log M). This is very efficient when N is small and M is large (but indexed).
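To see why the indexed variant drops to O(N × log M), here is a minimal Python sketch in which a sorted list stands in for the B-Tree index (names and data are illustrative):

```python
import bisect

def index_nested_loop_join(outer, inner_sorted, key_outer):
    """Inner relation pre-sorted on its join key (column 0),
    mimicking a B-Tree index: each probe is O(log M)."""
    keys = [row[0] for row in inner_sorted]
    for o in outer:
        # binary-search for the first inner row with a matching key
        pos = bisect.bisect_left(keys, o[key_outer])
        while pos < len(keys) and keys[pos] == o[key_outer]:
            yield o + inner_sorted[pos]
            pos += 1

orders = [(101, 2), (102, 1)]                    # (order_id, user_id)
users = [(1, "Ada"), (2, "Bob"), (3, "Cy")]      # sorted on user_id
rows = list(index_nested_loop_join(orders, users, 1))
```

With a small outer side, the total work is just a handful of logarithmic probes, which is why the planner loves this shape.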
Best For
- Joining a small dataset (Outer) to a large, indexed dataset (Inner).
- `CROSS JOIN` (Cartesian products).
3. Hash Join
Used when joining two large datasets where no useful index exists.
How it Works
- Build Phase: Postgres takes the smaller of the two relations (Inner) and builds an in-memory Hash Table using the join key.
- Probe Phase: It scans the larger relation (Outer) row by row. For each row, it hashes the join key and checks the Hash Table for matches.
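The two phases can be sketched in Python (illustrative data and names; Postgres's implementation is in C and considerably more elaborate):

```python
from collections import defaultdict

def hash_join(outer, inner, key_outer, key_inner):
    """Build on the (smaller) inner relation, then probe with the outer."""
    # Build phase: in-memory hash table keyed on the inner join key
    table = defaultdict(list)
    for i in inner:
        table[i[key_inner]].append(i)
    # Probe phase: one O(1) hash lookup per outer row
    for o in outer:
        for i in table.get(o[key_outer], []):
            yield o + i

orders = [(101, 2), (102, 1), (103, 2)]   # (order_id, user_id) -- large side
users = [(1, "Ada"), (2, "Bob")]          # (user_id, name)     -- small side
joined = list(hash_join(orders, users, 1, 0))
```

Each relation is scanned exactly once, which is where the O(N + M) time comes from; the price is O(M) memory for the hash table.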
Complexity
- Time: O(N + M) (linear time!).
- Space: O(M) (must store the inner relation in memory).
Hardware Reality & Memory (work_mem)
War Story: The “Out of Memory” Death Spiral. A startup ran a nightly reporting query joining a massive `orders` table to a `users` table. They bumped `work_mem` globally to 2GB to make the Hash Join lightning fast. It worked perfectly until Black Friday, when a sudden spike of 50 concurrent reporting queries executed simultaneously. 50 queries × 2GB = 100GB of RAM requested instantly. The OS triggered the OOM (Out of Memory) Killer, crashing Postgres and taking the entire platform down. Lesson: always tune `work_mem` carefully, and use `SET LOCAL work_mem` for specific heavy queries rather than a massive global default.
The hash table must fit in the RAM allocated by work_mem.
- In-Memory: If it fits, it’s extremely fast (pure RAM access, CPU cache friendly).
- Disk Spill: If it doesn’t fit, Postgres splits the join into batches and writes temporary files to disk. This is called a Hybrid Hash Join and is significantly slower due to disk I/O latency. Disk seeks and writes destroy performance.
- Tuning: Increasing `work_mem` can prevent disk spills for Hash Joins, but setting it too high globally can lead to Out of Memory (OOM) kills if many concurrent connections run hash joins simultaneously.
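The batching idea behind a disk spill can be sketched with a hypothetical `partition` helper: hash each relation on the join key so that matching rows always land in the same batch pair, letting each pair be hash-joined independently. (Postgres's real Hybrid Hash Join is more sophisticated, e.g. it keeps the first batch in memory and spills the rest to temp files.)

```python
def partition(rows, key_idx, n_batches):
    """Split a relation into batches by hashing the join key.
    Rows with equal keys land in the same batch index, so
    batch i of the inner only ever needs to meet batch i of
    the outer."""
    batches = [[] for _ in range(n_batches)]
    for row in rows:
        batches[hash(row[key_idx]) % n_batches].append(row)
    return batches

inner_batches = partition([(1, "Ada"), (2, "Bob"), (3, "Cy")], 0, 2)
outer_batches = partition([(101, 2), (102, 1)], 1, 2)
# user_id 2 hashes to the same batch index in both relations,
# so the join can proceed one batch pair at a time
```

Each batch's hash table is roughly 1/n_batches the size of the full inner relation, which is how the join fits inside `work_mem` at the cost of extra disk I/O.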
Best For
- Large, unindexed joins.
- Equality joins (`=`) only.
4. Merge Join
Used when both inputs are already sorted on the join key.
How it Works
- Sort both relations on the join key (unless they are already sorted by an index).
- Iterate through both lists simultaneously using two pointers (like the merge step in Merge Sort).
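A Python sketch of the two-pointer merge, including the run-pairing needed when join keys repeat (data and names are illustrative):

```python
def merge_join(left, right, kl, kr):
    """Two-pointer merge of inputs already sorted on their join keys.
    Duplicate keys are handled by pairing runs of equal keys."""
    i = j = 0
    while i < len(left) and j < len(right):
        a, b = left[i][kl], right[j][kr]
        if a < b:
            i += 1                       # advance the smaller side
        elif a > b:
            j += 1
        else:
            # collect the run of equal keys on each side...
            i2 = i
            while i2 < len(left) and left[i2][kl] == a:
                i2 += 1
            j2 = j
            while j2 < len(right) and right[j2][kr] == b:
                j2 += 1
            # ...and emit their cross product
            for l in left[i:i2]:
                for r in right[j:j2]:
                    yield l + r
            i, j = i2, j2

users = [(1, "Ada"), (2, "Bob")]          # sorted on user_id
orders = [(101, 1), (102, 2), (103, 2)]   # sorted on user_id
merged = list(merge_join(users, orders, 0, 1))
```

Because both pointers only ever move forward, the merge itself touches each input row once: O(N + M).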
Complexity
- Time: O(N + M) (if already sorted) or O(N log N + M log M) (if sorting is required).
- Space: O(1) for the merge itself (very memory efficient); any required sort uses work_mem.
Best For
- Large datasets that are already sorted (e.g., clustered by primary key or using a B-Tree index).
- Range-style comparisons (e.g., `BETWEEN`, `>`, `<`) benefit from sorted input, though Postgres's Merge Join itself requires an equality (`=`) join clause; pure inequality joins typically fall back to Nested Loop.
- When `work_mem` is too small for a Hash Join.
5. Comparison Cheat Sheet
| Algorithm | Setup Cost | Per-Row Cost | Memory Usage | Pre-requisites | Best Use Case |
|---|---|---|---|---|---|
| Nested Loop | Low | High | Very Low | None | Small Outer table, Indexed Inner table |
| Hash Join | Medium (Hash build) | Low | High | Fits in work_mem | Large tables, Equality joins |
| Merge Join | High (Sort) | Very Low | Low | Sorted Inputs | Sorted large tables, Range joins |
6. Interactive: Join Algorithm Race
Visualize how different algorithms process data. Note how Hash Join is heavily front-loaded (Build phase), while Nested Loop starts immediately but runs slower per row.
[!NOTE] Forcing Algorithms: You can temporarily steer Postgres toward a specific method for testing by disabling the alternatives:

```sql
SET enable_hashjoin = off;
SET enable_mergejoin = off;
-- Now likely to use Nested Loop
```

Use this only for debugging, never in production code!
7. Next Steps
How does Postgres know how many rows are in a table to make these cost decisions? It uses statistics. Learn about the Statistics Collector.