Finding count of duplicate numbers in a sorted array using binary search


Algorithm for finding duplicates makes use of binary search to find the first occurrence and the last occurrence of the element to be searched.

  • Finding the first occurrence
    If the searched element located at index mid and its previous element (i.e at index mid-1) match, binary search continues in the sorted space to the left side of index mid i.e from index beg till index mid-1. This way the search continues until the previous element is the same as the element to be searched. When the search terminates we get the index of the first occurrence.

  • Finding the last occurrence
    If the searched element located at index mid and its next element (i.e at index mid+1) matches the searched value, the search continues in the sorted space to the right side of mid i.e from index mid+1 till index end. When the search terminates we get the index of the last occurrence.

      Thus, count of duplicate elements = Index of the last occurrence - Index of the first occurrence + 1


Example of finding the count of duplicate numbers using binary search Binary_Search_Duplicates

Time complexity of finding count of duplicate numbers using binary search in a sorted array : Log ( N )

Why is mid calculated as mid = beg + (end-beg)/2 ?

Java

Java program for finding the count of duplicate numbers within a sorted array using binary search.


Python

Python program for finding the count of duplicate numbers within a sorted array using binary search.


C++ program for finding the count of duplicate numbers within a sorted array using binary search.

#include<iostream>
#include<vector>

using namespace std;

int FirstOccurrence (vector<int> &, int);
int LastOccurrence (vector<int> &, int);

int FirstOccurrence (vector<int> &vec, int n) {

    int beg = 0;
    int end = vec.size() - 1;

    while (beg <= end) {

        int mid = beg + (end-beg)/2;

        if (vec[mid] == n) {
            if (mid-1 >= 0 and vec[mid-1] == n) {
               end = mid-1;
               continue;
            }
            return mid;
        }

        else if (vec[mid] < n)
            beg = mid + 1;
        else
            end = mid - 1;
    }
    return -1;
}

int LastOccurrence (vector<int> &vec, int n) {

    int beg = 0;
    int end = vec.size()-1;

    while (beg <= end) {

        int mid = beg + (end-beg)/2;

        if (vec[mid] == n) {
            if (mid+1 < vec.size() and  vec[mid+1] == n) {
               beg = mid + 1;
               continue;
            }
            return mid;
        }

        else if (vec[mid] < n)
            beg = mid + 1;
        else
            end = mid - 1;
    }
    return -1;
}

int main()
{
   vector<int> vec = { 1, 2, 3, 9, 9, 9, 9, 10, 10, 12, 13 };

   int first_index, last_index, n;
   cout << "Enter the number : ";
   cin >> n;

   first_index = FirstOccurrence (vec, n);
   last_index =  LastOccurrence (vec, n);

   if (first_index == -1 or last_index == - 1) {
       cout << "Element does not exist" << endl;
   } else {
       cout << "First Occourance of " << n << " is at index : " << first_index << endl;
       cout << "Last Occourance of " << n << " is at index : " << last_index << endl;
       cout << "Total Count : " << last_index - first_index + 1;
   }
   return 0;
}

Output of Binary Search implementation to find the count of duplicate elements in an array

Enter the number : 13
First Occourance of 13 is at index : 10
Last Occourance of 13 is at index : 10
Total Count : 1

Enter the number : 8
Element does not exist

Enter the number : 5
Element does not exist

Enter the number : 3
First Occourance of 3 is at index : 2
Last Occourance of 3 is at index : 2
Total Count : 1

Enter the number : 9
First Occourance of 9 is at index : 3
Last Occourance of 9 is at index : 6
Total Count : 4

Copyright (c) 2019-2020, Algotree.org.
All rights reserved.