Scaling the Zipper

NOTE This module explores the core principles of merging multiple sorted streams, deriving optimized solutions from first principles and hardware constraints (Priority Queues vs. Recursive Merging).

1. Problem Definition

You are given an array of k linked-lists lists, each linked-list is sorted in ascending order. Merge all the linked-lists into one sorted linked-list and return it.

Example:

  • Input: [[1,4,5], [1,3,4], [2,6]]
  • Output: [1,1,2,3,4,4,5,6]

Constraints:

  • k == lists.length
  • 0 ≤ k ≤ 104
  • 0 ≤ lists[i].length ≤ 500
  • Sum of all lengths ≤ 104.

Edge Cases & Clarifying Questions:

  • What if lists is empty? Return null (or empty list).
  • What if all linked lists in lists are empty? Return null.
  • What if k is very large but most lists are empty? The algorithm should handle empty lists efficiently without adding them to the heap.

2. Interactive Analysis

Merging k lists is like merging two lists, but with k candidates at every step.

Input Lists

Min-Heap (Size k)

Result List

Ready.

3. Solutions

Intuition: The Bottleneck

A naive approach scans all k heads to find the minimum. In a hardware context, this is a linear scan O(k) for every element. If we have 104 lists, we are doing 104 comparisons for every node.

Derivation: We only need the global minimum among current candidates. A Min-Heap (Priority Queue) allows us to find and remove the minimum in O(log k) and insert a replacement in O(log k).

Common Pitfalls

  • Null Pointer Exceptions: Failing to check if a list is empty before adding its head to the priority queue.
  • Custom Comparator Errors: Forgetting to define a proper comparator for the min-heap to sort by node values.
  • Modifying Original Lists: Accidentally losing references or creating cycles if not careful with pointer assignments.

Approach 1: Min-Heap (Optimal Time)

Java

public ListNode mergeKLists(ListNode[] lists) {
  if (lists == null || lists.length == 0) return null;

  PriorityQueue<ListNode> pq = new PriorityQueue<>((a, b) -> a.val - b.val);

  // Initial k nodes
  for (ListNode node : lists) {
    if (node != null) pq.add(node);
  }

  ListNode dummy = new ListNode(-1);
  ListNode tail = dummy;

  while (!pq.isEmpty()) {
    ListNode current = pq.poll();
    tail.next = current;
    tail = tail.next;

    if (current.next != null) {
      pq.add(current.next);
    }
  }

  return dummy.next;
}

Go

func mergeKLists(lists []*ListNode) *ListNode {
  if len(lists) == 0 { return nil }

  h := &NodeHeap{}
  heap.Init(h)

  for _, node := range lists {
    if node != nil {
      heap.Push(h, node)
    }
  }

  dummy := &ListNode{}
  tail := dummy

  for h.Len() > 0 {
    node := heap.Pop(h).(*ListNode)
    tail.Next = node
    tail = tail.Next

    if node.Next != nil {
      heap.Push(h, node.Next)
    }
  }

  return dummy.Next
}

// Boilerplate for Go Heap
type NodeHeap []*ListNode
func (h NodeHeap) Len() int           { return len(h) }
func (h NodeHeap) Less(i, j int) bool { return h[i].Val < h[j].Val }
func (h NodeHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *NodeHeap) Push(x interface{}) { *h = append(*h, x.(*ListNode)) }
func (h *NodeHeap) Pop() interface{} {
  old := *h
  n := len(old)
  x := old[n-1]
  *h = old[0 : n-1]
  return x
}

Approach 2: Divide and Conquer (Optimal Space)

Instead of merging all at once with a heap, merge pairs of lists. This reduces k lists to k/2, then k/4, until 1 list remains.

Java

public ListNode mergeKLists(ListNode[] lists) {
  if (lists == null || lists.length == 0) return null;
  return divide(lists, 0, lists.length - 1);
}

private ListNode divide(ListNode[] lists, int i, int j) {
  if (i == j) return lists[i];
  int mid = i + (j - i) / 2;
  ListNode left = divide(lists, i, mid);
  ListNode right = divide(lists, mid + 1, j);
  return mergeTwo(left, right);
}

private ListNode mergeTwo(ListNode l1, ListNode l2) {
  ListNode dummy = new ListNode(0);
  ListNode curr = dummy;
  while (l1 != null && l2 != null) {
    if (l1.val < l2.val) { curr.next = l1; l1 = l1.next; }
    else { curr.next = l2; l2 = l2.next; }
    curr = curr.next;
  }
  curr.next = (l1 != null) ? l1 : l2;
  return dummy.next;
}

Go

func mergeKLists(lists []*ListNode) *ListNode {
  if len(lists) == 0 { return nil }
  return divide(lists, 0, len(lists)-1)
}

func divide(lists []*ListNode, i, j int) *ListNode {
  if i == j { return lists[i] }
  mid := i + (j-i)/2
  left := divide(lists, i, mid)
  right := divide(lists, mid+1, j)
  return mergeTwo(left, right)
}

func mergeTwo(l1, l2 *ListNode) *ListNode {
  dummy := &ListNode{}
  curr := dummy
  for l1 != nil && l2 != nil {
    if l1.Val < l2.Val {
      curr.Next = l1
      l1 = l1.Next
    } else {
      curr.Next = l2
      l2 = l2.Next
    }
    curr = curr.Next
  }
  if l1 != nil { curr.Next = l1
  } else { curr.Next = l2 }
  return dummy.Next
}

4. Complexity Analysis

Strategy Time Complexity Space Complexity Hardware Context
Naive (Linear Scan) O(N × k) O(1) Cache misses due to multiple pointer chasing.
Min-Heap O(N log k) O(k) Efficient insertion in heap (logarithmic).
Divide & Conquer O(N log k) O(1) Stack depth log k, no extra data structures.

IMPORTANT N is the total number of nodes across all lists. k is the number of lists.