Solution For Find Median From Data Stream
Problem:
The problem “Find Median From Data Stream” asks us to design a class that takes a stream of integers as input and returns the median of all the integers in the stream at any given time. The class should be able to handle dynamic insertion of new integers in the stream. The median is defined as the middle element when the integers are sorted in ascending order. If there are an even number of integers in the stream, the median is the average of the two middle elements.
Solution:
The solution to this problem involves finding an efficient way to insert new integers into the stream and retrieve the median at any given time. One efficient approach is to use two heaps to track the median.
We can use a max-heap and a min-heap to keep track of the larger half and smaller half of the stream, respectively. The max-heap holds the smaller half of the stream, and the min-heap holds the larger half. We maintain the property that the size of the max-heap is either equal to or one greater than the size of the min-heap. This ensures that the median can be easily computed as the maximum value in the max-heap when their sizes are equal, and as the average of the maximum value in the max-heap and the minimum value in the min-heap when their sizes differ by one.
To insert a new integer into the stream, we first compare it with the maximum value in the max-heap. If the new integer is less than or equal to the maximum value, we insert it into the max-heap. Otherwise, we insert it into the min-heap. We then balance the size of the heaps so that their sizes differ by at most one.
To retrieve the median at any given time, we check the sizes of the max-heap and the min-heap. If they are equal, we return the maximum value in the max-heap. If their sizes differ by one, we return the average of the maximum value in the max-heap and the minimum value in the min-heap.
Here is the implementation of the class using two heaps in Python:
“`
import heapq
class MedianFinder:
def init(self):
“””
Initialize your data structure here.
“””
self.max_heap = [] # contains smaller half of the stream
self.min_heap = [] # contains larger half of the stream
def addNum(self, num: int) -> None:
if len(self.max_heap) == len(self.min_heap):
if not self.min_heap or num <= self.min_heap[0]:
heapq.heappush(self.max_heap, -num)
else:
heapq.heappush(self.max_heap, -heapq.heappushpop(self.min_heap, num))
else:
if num >= -self.max_heap[0]:
heapq.heappush(self.min_heap, num)
else:
heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))
heapq.heappush(self.max_heap, -num)
def findMedian(self) -> float:
if len(self.max_heap) == len(self.min_heap):
return (-self.max_heap[0] + self.min_heap[0]) / 2
else:
return -self.max_heap[0]
“`
In the addNum
method, we first check if the size of the two heaps is equal. If so, we compare the new number with the minimum value in the min-heap. If the new number is less than or equal to the minimum value, we insert it into the max-heap. Otherwise, we insert it into the min-heap after first popping the minimum value from the min-heap and pushing it into the max-heap. If the size of the two heaps is not equal, we compare the new number with the maximum value in the max-heap. If the new number is greater than or equal to the maximum value, we insert it into the min-heap. Otherwise, we swap the maximum value of the max-heap with the new number and insert the swapped value into the min-heap. We then balance the size of the two heaps so that their sizes differ by at most one.
In the findMedian
method, we first check if the size of the two heaps is equal. If so, we return the average of the maximum value in the max-heap and the minimum value in the min-heap. If the size of the two heaps is not equal, we return the maximum value in the max-heap.
Step by Step Implementation For Find Median From Data Stream
/** * Your MedianFinder object will be instantiated and called as such: * MedianFinder obj = new MedianFinder(); * obj.addNum(num); * double param_2 = obj.findMedian(); */ class MedianFinder { private PriorityQueueminHeap; private PriorityQueue maxHeap; /** initialize your data structure here. */ public MedianFinder() { minHeap = new PriorityQueue<>(); maxHeap = new PriorityQueue<>(Comparator.reverseOrder()); } public void addNum(int num) { maxHeap.add(num); minHeap.add(maxHeap.poll()); if (maxHeap.size() < minHeap.size()) { maxHeap.add(minHeap.poll()); } } public double findMedian() { if (maxHeap.size() == minHeap.size()) { return (maxHeap.peek() + minHeap.peek()) / 2.0; } else { return maxHeap.peek(); } } }
class MedianFinder: def __init__(self): """ Initialize your data structure here. """ self.arr = [] def addNum(self, num: int) -> None: """ Adds a num into the data structure. """ self.arr.append(num) def findMedian(self) -> float: """ Returns the median of current data stream """ self.arr.sort() n = len(self.arr) if n%2 == 0: return (self.arr[n//2] + self.arr[n//2-1])/2 else: return self.arr[n//2] # Your MedianFinder object will be instantiated and called as such: # obj = MedianFinder() # obj.addNum(num) # param_2 = obj.findMedian()
var findMedianFromDataStream = function(nums) { // create a min heap and a max heap let minHeap = new Heap(function(a, b) { return a - b }); let maxHeap = new Heap(function(a, b) { return b - a }); // insert the first element into the max heap maxHeap.insert(nums[0]); // iterate through the remaining elements for (let i = 1; i < nums.length; i++) { let num = nums[i]; // if the number is less than the max of the max heap, insert into the max heap if (num < maxHeap.getMax()) { maxHeap.insert(num); } else { // otherwise, insert into the min heap minHeap.insert(num); } // balance the heaps if necessary if (maxHeap.size() > minHeap.size() + 1) { minHeap.insert(maxHeap.extractMax()); } else if (minHeap.size() > maxHeap.size()) { maxHeap.insert(minHeap.extractMin()); } } // return the median if (maxHeap.size() === minHeap.size()) { return (maxHeap.getMax() + minHeap.getMin()) / 2; } else { return maxHeap.getMax(); } };
class MedianFinder { public: //Adds a number into the data structure. void addNum(int num) { } //Returns the median of current data stream double findMedian() { } };
using System; using System.Collections.Generic; using System.Linq; public class MedianFinder { // max heap to store the smaller half of the numbers private readonly SortedSet_smaller = new SortedSet (); // min heap to store the larger half of the numbers private readonly SortedSet _larger = new SortedSet (); // Adds a num into the data structure. public void AddNum(int num) { // if the heaps are unbalanced, add the new number to the heap // with fewer elements if (_smaller.Count == _larger.Count) { if (_larger.Count > 0 && num > _larger.Min) { _smaller.Add(_larger.Min); _larger.Remove(_larger.Min); _larger.Add(num); } else { _smaller.Add(num); } } else { if (num < _smaller.Max) { _larger.Add(_smaller.Max); _smaller.Remove(_smaller.Max); _smaller.Add(num); } else { _larger.Add(num); } } } // Returns the median of current data stream public double FindMedian() { if (_smaller.Count == _larger.Count) { return ((double)_smaller.Max + _larger.Min) / 2; } else { return _smaller.Max; } } };