Scaling the Zipper
[!NOTE] This module explores the core principles of merging multiple sorted streams, deriving optimized solutions from first principles and hardware constraints (Priority Queues vs. Recursive Merging).
1. Problem Definition
You are given an array of k linked-lists lists, each linked-list is sorted in ascending order. Merge all the linked-lists into one sorted linked-list and return it.
Example:
- Input:
[[1,4,5], [1,3,4], [2,6]] - Output:
[1,1,2,3,4,4,5,6]
Constraints:
- k == lists.length
- 0 ≤ k ≤ 10^4
- 0 ≤ lists[i].length ≤ 500
- Sum of all lengths ≤ 10^4.
2. Interactive Analysis
Merging k lists is like merging two lists, but with k candidates at every step.
Input Lists
Min-Heap (Size k)
Result List
3. Solutions
Intuition: The Bottleneck
A naive approach scans all k heads to find the minimum. In a hardware context, this is a linear scan O(k) for every element. If we have 10^4 lists, we are doing 10^4 comparisons for every node.
Derivation: We only need the global minimum among current candidates. A Min-Heap (Priority Queue) allows us to find and remove the minimum in O(log k) and insert a replacement in O(log k).
Approach 1: Min-Heap (Optimal Time)
public ListNode mergeKLists(ListNode[] lists) {
if (lists == null || lists.length == 0) return null;
PriorityQueue<ListNode> pq = new PriorityQueue<>((a, b) -> a.val - b.val);
// Initial k nodes
for (ListNode node : lists) {
if (node != null) pq.add(node);
}
ListNode dummy = new ListNode(-1);
ListNode tail = dummy;
while (!pq.isEmpty()) {
ListNode current = pq.poll();
tail.next = current;
tail = tail.next;
if (current.next != null) {
pq.add(current.next);
}
}
return dummy.next;
}
func mergeKLists(lists []*ListNode) *ListNode {
if len(lists) == 0 { return nil }
h := &NodeHeap{}
heap.Init(h)
for _, node := range lists {
if node != nil {
heap.Push(h, node)
}
}
dummy := &ListNode{}
tail := dummy
for h.Len() > 0 {
node := heap.Pop(h).(*ListNode)
tail.Next = node
tail = tail.Next
if node.Next != nil {
heap.Push(h, node.Next)
}
}
return dummy.Next
}
// Boilerplate for Go Heap
type NodeHeap []*ListNode
func (h NodeHeap) Len() int { return len(h) }
func (h NodeHeap) Less(i, j int) bool { return h[i].Val < h[j].Val }
func (h NodeHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *NodeHeap) Push(x interface{}) { *h = append(*h, x.(*ListNode)) }
func (h *NodeHeap) Pop() interface{} {
old := *h
n := len(old)
x := old[n-1]
*h = old[0 : n-1]
return x
}
Approach 2: Divide and Conquer (Optimal Space)
Instead of merging all at once with a heap, merge pairs of lists. This reduces k lists to k/2, then k/4, until 1 list remains.
public ListNode mergeKLists(ListNode[] lists) {
if (lists == null || lists.length == 0) return null;
return divide(lists, 0, lists.length - 1);
}
private ListNode divide(ListNode[] lists, int i, int j) {
if (i == j) return lists[i];
int mid = i + (j - i) / 2;
ListNode left = divide(lists, i, mid);
ListNode right = divide(lists, mid + 1, j);
return mergeTwo(left, right);
}
private ListNode mergeTwo(ListNode l1, ListNode l2) {
ListNode dummy = new ListNode(0);
ListNode curr = dummy;
while (l1 != null && l2 != null) {
if (l1.val < l2.val) { curr.next = l1; l1 = l1.next; }
else { curr.next = l2; l2 = l2.next; }
curr = curr.next;
}
curr.next = (l1 != null) ? l1 : l2;
return dummy.next;
}
func mergeKLists(lists []*ListNode) *ListNode {
if len(lists) == 0 { return nil }
return divide(lists, 0, len(lists)-1)
}
func divide(lists []*ListNode, i, j int) *ListNode {
if i == j { return lists[i] }
mid := i + (j-i)/2
left := divide(lists, i, mid)
right := divide(lists, mid+1, j)
return mergeTwo(left, right)
}
func mergeTwo(l1, l2 *ListNode) *ListNode {
dummy := &ListNode{}
curr := dummy
for l1 != nil && l2 != nil {
if l1.Val < l2.Val {
curr.Next = l1
l1 = l1.Next
} else {
curr.Next = l2
l2 = l2.Next
}
curr = curr.Next
}
if l1 != nil { curr.Next = l1
} else { curr.Next = l2 }
return dummy.Next
}
4. Complexity Analysis
| Strategy | Time Complexity | Space Complexity | Hardware Context |
|---|---|---|---|
| Naive (Linear Scan) | O(N ⋅ k) | O(1) | Cache misses due to multiple pointer chasing. |
| Min-Heap | O(N log k) | O(k) | Efficient insertion in heap (logarithmic). |
| Divide & Conquer | O(N log k) | O(1) | Stack depth log k, no extra data structures. |
[!IMPORTANT] N is the total number of nodes across all lists. k is the number of lists.