Scaling the Zipper
NOTE This module explores the core principles of merging multiple sorted streams, deriving optimized solutions from first principles and hardware constraints (Priority Queues vs. Recursive Merging).
1. Problem Definition
You are given an array of k linked-lists lists, each linked-list is sorted in ascending order. Merge all the linked-lists into one sorted linked-list and return it.
Example:
- Input:
[[1,4,5], [1,3,4], [2,6]] - Output:
[1,1,2,3,4,4,5,6]
Constraints:
- k == lists.length
- 0 ≤ k ≤ 104
- 0 ≤ lists[i].length ≤ 500
- Sum of all lengths ≤ 104.
Edge Cases & Clarifying Questions:
- What if
listsis empty? Returnnull(or empty list). - What if all linked lists in
listsare empty? Returnnull. - What if
kis very large but most lists are empty? The algorithm should handle empty lists efficiently without adding them to the heap.
2. Interactive Analysis
Merging k lists is like merging two lists, but with k candidates at every step.
Input Lists
Min-Heap (Size k)
Result List
3. Solutions
Intuition: The Bottleneck
A naive approach scans all k heads to find the minimum. In a hardware context, this is a linear scan O(k) for every element. If we have 104 lists, we are doing 104 comparisons for every node.
Derivation: We only need the global minimum among current candidates. A Min-Heap (Priority Queue) allows us to find and remove the minimum in O(log k) and insert a replacement in O(log k).
Common Pitfalls
- Null Pointer Exceptions: Failing to check if a list is empty before adding its head to the priority queue.
- Custom Comparator Errors: Forgetting to define a proper comparator for the min-heap to sort by node values.
- Modifying Original Lists: Accidentally losing references or creating cycles if not careful with pointer assignments.
Approach 1: Min-Heap (Optimal Time)
Java
public ListNode mergeKLists(ListNode[] lists) {
if (lists == null || lists.length == 0) return null;
PriorityQueue<ListNode> pq = new PriorityQueue<>((a, b) -> a.val - b.val);
// Initial k nodes
for (ListNode node : lists) {
if (node != null) pq.add(node);
}
ListNode dummy = new ListNode(-1);
ListNode tail = dummy;
while (!pq.isEmpty()) {
ListNode current = pq.poll();
tail.next = current;
tail = tail.next;
if (current.next != null) {
pq.add(current.next);
}
}
return dummy.next;
}
Go
func mergeKLists(lists []*ListNode) *ListNode {
if len(lists) == 0 { return nil }
h := &NodeHeap{}
heap.Init(h)
for _, node := range lists {
if node != nil {
heap.Push(h, node)
}
}
dummy := &ListNode{}
tail := dummy
for h.Len() > 0 {
node := heap.Pop(h).(*ListNode)
tail.Next = node
tail = tail.Next
if node.Next != nil {
heap.Push(h, node.Next)
}
}
return dummy.Next
}
// Boilerplate for Go Heap
type NodeHeap []*ListNode
func (h NodeHeap) Len() int { return len(h) }
func (h NodeHeap) Less(i, j int) bool { return h[i].Val < h[j].Val }
func (h NodeHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *NodeHeap) Push(x interface{}) { *h = append(*h, x.(*ListNode)) }
func (h *NodeHeap) Pop() interface{} {
old := *h
n := len(old)
x := old[n-1]
*h = old[0 : n-1]
return x
}
Approach 2: Divide and Conquer (Optimal Space)
Instead of merging all at once with a heap, merge pairs of lists. This reduces k lists to k/2, then k/4, until 1 list remains.
Java
public ListNode mergeKLists(ListNode[] lists) {
if (lists == null || lists.length == 0) return null;
return divide(lists, 0, lists.length - 1);
}
private ListNode divide(ListNode[] lists, int i, int j) {
if (i == j) return lists[i];
int mid = i + (j - i) / 2;
ListNode left = divide(lists, i, mid);
ListNode right = divide(lists, mid + 1, j);
return mergeTwo(left, right);
}
private ListNode mergeTwo(ListNode l1, ListNode l2) {
ListNode dummy = new ListNode(0);
ListNode curr = dummy;
while (l1 != null && l2 != null) {
if (l1.val < l2.val) { curr.next = l1; l1 = l1.next; }
else { curr.next = l2; l2 = l2.next; }
curr = curr.next;
}
curr.next = (l1 != null) ? l1 : l2;
return dummy.next;
}
Go
func mergeKLists(lists []*ListNode) *ListNode {
if len(lists) == 0 { return nil }
return divide(lists, 0, len(lists)-1)
}
func divide(lists []*ListNode, i, j int) *ListNode {
if i == j { return lists[i] }
mid := i + (j-i)/2
left := divide(lists, i, mid)
right := divide(lists, mid+1, j)
return mergeTwo(left, right)
}
func mergeTwo(l1, l2 *ListNode) *ListNode {
dummy := &ListNode{}
curr := dummy
for l1 != nil && l2 != nil {
if l1.Val < l2.Val {
curr.Next = l1
l1 = l1.Next
} else {
curr.Next = l2
l2 = l2.Next
}
curr = curr.Next
}
if l1 != nil { curr.Next = l1
} else { curr.Next = l2 }
return dummy.Next
}
4. Complexity Analysis
| Strategy | Time Complexity | Space Complexity | Hardware Context |
|---|---|---|---|
| Naive (Linear Scan) | O(N × k) | O(1) | Cache misses due to multiple pointer chasing. |
| Min-Heap | O(N log k) | O(k) | Efficient insertion in heap (logarithmic). |
| Divide & Conquer | O(N log k) | O(1) | Stack depth log k, no extra data structures. |
IMPORTANT N is the total number of nodes across all lists. k is the number of lists.