Arrays & Hashing · LC #347
Top K Frequent Elements
Return the k most frequent integers from an array. Teaches bucket-sort as an O(n) alternative to a min-heap (O(n log k)) for frequency-rank problems.
Statement
Problem & Approach
Problem statement, sample I/O, and key reasoning.
Given an integer array `nums` and an integer `k`, return the `k` most frequent elements. The answer can be in any order.
Sample I/O
Input: nums = [1,1,1,2,2,3], k = 2 Output: [1,2] Input: nums = [1], k = 1 Output: [1]
Intuition
How to Think About It
Intuition
After counting frequencies, we know each number's frequency is an integer in [1, n]. We can use that bounded range as an index into a "bucket" array. The highest-frequency elements sit in the last buckets, so scanning from the back gives us the top-k in O(n) total.
Core Concept
Build a frequency map counting occurrences of each number (O(n)). Create a buckets array of length n+1 where buckets[f] holds all numbers with frequency f. Scan from index n down to 1, collecting elements until k elements are gathered. Time: O(n). Space: O(n). Compare with min-heap: O(n log k) time, O(k) extra space - simpler but slower.
When to use
Finding top-k by integer frequency when the range of possible frequencies is bounded by n. Anywhere you need O(n) over O(n log n) and frequencies are discrete integers. Streaming top-k with a sliding window.
When NOT to use
When k is very small and n is huge - a min-heap of size k uses O(k) space and may be preferred. When frequencies are not integers or are floating-point weights. When you need the k-th largest value, not frequency (use quickselect).
Key Insights
What to Know Cold
- Maximum possible frequency is n (all elements same), so the bucket array only needs n+1 slots.
- Bucket sort achieves O(n) by exploiting the bounded integer domain of frequencies - standard comparison sorts can't do better than O(n log n).
- A min-heap approach (Python heapq.nlargest) is O(n log k) - acceptable for interviews but sub-optimal.
- You can use Counter.most_common(k) in Python for a concise one-liner - still O(n log k) internally.
- The bucket scan must continue past partially-filled buckets to guarantee exactly k elements when multiple numbers share a frequency.
1 example
Worked Examples
Bucket Sort - O(n)
nums = [1,1,1,2,2,3], k=2. Return the 2 most frequent: [1, 2].
freq={1:3, 2:2, 3:1}. Bucket[3]=[1], bucket[2]=[2], bucket[1]=[3]. Scan from end: collect 1, then 2. Done at k=2.
from collections import Counter
nums, k = [1,1,1,2,2,3], 2
count = Counter(nums)
buckets = [[] for _ in range(len(nums) + 1)]
for num, freq in count.items():
buckets[freq].append(num)
res = []
for i in range(len(buckets)-1, 0, -1):
res.extend(buckets[i])
if len(res) >= k:
print(res[:k]) # [1, 2]
breakThis problem appears constantly in Amazon and Google phone screens. The bucket-sort O(n) trick over the naive heap O(n log k) is a concrete demonstration of "know your constraints" - a key senior-engineer signal.