Ranking & Ordering
[!NOTE] This module explores the core principles of Heaps Practice Problems, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.
1. Problem Definition
We are tasked with solving the Top K Frequent Elements problem (which is representative of the broader “Top K” pattern).
Problem Statement: Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
Constraints:
1 <= nums.length <= 10^5-10^4 <= nums[i] <= 10^4kis in the range[1, the number of unique elements in the array].- It is guaranteed that the answer is unique.
Examples:
- Input:
nums = [1,1,1,2,2,3],k = 2Output:[1, 2] - Input:
nums = [1],k = 1Output:[1]
Edge Cases & Clarifying Questions
- Q: Can
kbe larger than the number of unique elements? A: No,kis guaranteed to be valid according to the constraints. - Q: What if multiple elements have the same frequency?
A: The problem guarantees a unique answer, so there will not be a tie for the
k-th element that would require a tie-breaker. - Edge Cases: Single element arrays, all elements identical, negative numbers.
2. Interactive Analysis: Kth Largest Pattern
The underlying pattern for “Top K” problems is maintaining a Heap of size K. Let’s visualize this with a simpler variation: finding the Kth Largest element in a stream. We keep a Min-Heap of size K (in this demo, K=3).
If a new number is larger than the Heap’s minimum (root), we replace it. The root is always the Kth largest.
3. Intuition (The “Genesis”)
How do we efficiently track the most frequent items?
The Brute Force Approach
The simplest way is to count the frequencies using a Hash Map, then put all unique elements into an array and sort them by their frequency in descending order.
- Counting frequencies takes O(N) time.
- Sorting the unique elements takes O(U log U) time, where U is the number of unique elements. In the worst case (all elements unique), this is O(N log N).
While O(N log N) is acceptable, it’s not optimal when k is small compared to N. We are doing unnecessary work sorting the entire array when we only care about the top k.
Transition to Optimized: The Top-K Pattern
We can optimize this using a Priority Queue (Heap). This pattern is explicitly called the Top-K Pattern.
Instead of sorting everything, we maintain a Min-Heap of size k.
- Count Frequencies: Hash Map O(N).
- Keep Top K: Iterate through the unique elements. Insert each element into the Min-Heap, ordered by its frequency.
- Evict the Weak: If the heap size exceeds
k, we pop the minimum element (the one with the lowest frequency currently in the heap). The survivors are thekmost frequent.
Common Pitfalls
- Using a Max-Heap instead of a Min-Heap: It’s counter-intuitive, but to find the most frequent, you use a Min-Heap of size
K. Why? Because you want to efficiently find and evict the least frequent elements among the top candidates. A Max-Heap would require storing allNelements and poppingKtimes (O(N + K log N)), which takes O(N) space. A Min-Heap of sizeKtakes O(K) space and O(N log K) time. - Storing the wrong data in the Heap: Ensure you compare by frequency, not by the value of the element itself.
4. Solutions
public int[] topKFrequent(int[] nums, int k) {
Map<Integer, Integer> count = new HashMap<>();
for (int n : nums) count.put(n, count.getOrDefault(n, 0) + 1);
// Min-Heap based on frequency
PriorityQueue<Integer> heap = new PriorityQueue<>((a, b) -> count.get(a) - count.get(b));
for (int n : count.keySet()) {
heap.add(n);
if (heap.size() > k) heap.poll(); // Remove least frequent
}
int[] res = new int[k];
for (int i = 0; i < k; i++) res[i] = heap.poll();
return res;
}
func topKFrequent(nums []int, k int) []int {
count := make(map[int]int)
for _, n := range nums { count[n]++ }
h := &MinHeap{}
heap.Init(h)
for num, freq := range count {
heap.Push(h, Element{num, freq})
if h.Len() > k {
heap.Pop(h)
}
}
res := make([]int, k)
for i := 0; i < k; i++ {
res[i] = heap.Pop(h).(Element).val
}
return res
}
// Boilerplate Go Heap omitted for brevity
5. Complexity Analysis
| Strategy | Time | Space | Notes |
|---|---|---|---|
| Sorting | O(N log N) | O(N) | Simple, but slower for large N. |
| Min-Heap | O(N log K) | O(K) | Optimal for K « N (Streaming compatible). |
| QuickSelect | O(N) Avg | O(1) | Modifies array, not stable. |