Given a sorted array of n
elements which can contain duplicates. Find the frequency of each distinct element in the array.
Example Test Case
Given Array: 2 2 2 4 5 6 6 7 7 7
Expected Output:2: 3
4: 1
5: 1
6: 2
7: 3
Solution
If the array was unsorted, then we could have solved this problem using a simple HashMap, but since the given array is sorted, we can use this information and use modified binary search to solve this problem even more efficiently.
Since the array is sorted, elements which are same will occur together. To count the frequency of each distinct element, we just need to find the starting and ending position for each distinct element. for example, in the example test case above, the starting and ending position of element 2
is 0
and 2
, therefore its frequency is 2 - 0 + 1 = 3
.
Starting position of each element will be :
0
for the first element in the array.endingPosition (of previous element)
+ 1: for subsequent elements in the array. For example, the starting position of4
in the array above will beendingPosition (of 2) + 1 = 3
For calculating the ending positions, we can use modified binary search as shown in the code below:
Implementation
#include <iostream> #include <vector> using namespace std; // Modified binary search method which returns the last position of the element b/w [first, vec.size() - 1] for element key int findLastPosition(vector<int>& vec, int first) { int key = vec[first]; int lo = first, hi = vec.size() - 1; int mid = lo + (hi - lo)/2; while (lo <= hi) { mid = lo + (hi - lo)/2; if (vec[mid] > key) { hi = mid - 1; } else if (vec[mid] <= key) { lo = mid + 1; mid = lo; } } return mid - 1; } // Function which loops over every distinct element and finds its last position void printFrequency(vector<int>& vec) { int firstPosition = 0; while (firstPosition < vec.size()) { int lastPosition = findLastPosition(vec, firstPosition); cout << vec[firstPosition] << ": " << lastPosition - firstPosition + 1 << "\n"; firstPosition = lastPosition + 1; } } int main() { vector<int> arr = { 2, 2, 2, 4, 5, 6, 6, 7, 7, 7 }; printFrequency(arr); return (0); }
Time Complexity
Total time complexity of function findLastPosition
is O(log(n))
(since it is just modified binary search). The number of times it will be called is m
(m
is the no of distinct elements in the array). Therefore, the total time complexity of this will be O(mlog(n))