Last index of multiple keys using binary-search?

1

2

i have multiple occurrences of a key in a sorted array, and i want to perform binary search on them, a normal binary search returns some random index for the key having multiple occurrences, where as i want the index of the last occurrence of that key.

int data[] = [1,2,3,4,4,4,4,5,5,6,6];
int key = 4;
int index = upperBoundBinarySearch(data, 0, data.length-1, key);

Index Returned = 6

rykhan

Posted 2013-01-19T14:35:25.493

Reputation: 119

6Java and C++ are different languages, which are you interested in? – Oliver Charlesworth – 2013-01-19T14:35:56.687

Is int data[] = [1,2,3,4,4,4,4,5,5,6,6]; correct in Java? I don't think so. – Nawaz – 2013-01-19T14:36:54.290

1

For C++, there are many standard algorithms you could try.

– Some programmer dude – 2013-01-19T14:38:27.400

In general the same question: first-occurrence-in-a-binary-search

– MrSmith42 – 2013-01-19T15:11:47.780

1yes, its java syntax, its better to be in java – rykhan – 2013-01-19T19:47:41.200

Answers

1

Well, thanks to all especially @Mattias, that algo sounds good. anyway i have done with my own, that seem me to give better result, but if some one can help me to measure out the complexity of both algos mine and @Mattias, or any one has some better solution, it welcome..... anyhow here is the solution i found for the problem,

int upperBound(int[] array,int lo, int hi, int key)
{
    int low = lo-1, high = hi;
    while (low+1 != high)
    {
        int mid = (low+high)>>>1;
        if (array[mid]> key) high=mid;
        else low=mid;
    }
    int p = low;
    if ( p >= hi || array[p] != key )
        p=-1;//no key found
    return p;
}

this is for first occurrence, i also update the same with one other similar post First occurrence in a binary search

int lowerBound(int[] array,int lo, int hi, int key)
{
    int low = lo-1, high = hi;
    while (low+1 != high)
    {
        int mid = (low+high)>>>1;
        if (array[mid]< key) low=mid;
        else high=mid;
    }
    int p = high;
    if ( p >= hi || array[p] != key )
        p=-1;//no key found
    return p;
}

rykhan

Posted 2013-01-19T14:35:25.493

Reputation: 119

1

So basically, you took Bentley's method from that article and slightly rewrote it to find the last occurrence instead of the first? Nice work, although I have to agree with the author of that article: I also find that algorithm to be trickier and harder to understand. For example, I don't see when and how the final check for 'no key found' is needed.

– Mattias Buelens – 2013-01-19T20:11:36.623

yes, i found the algo for lower index from bentley's, and i just slightly modify it to suite my needs. yes its little trickier, in case of lowere index, it store the last matched in high variable in loop, but at the end it ensures that the last store value was actually a match, as it store in both cases >= – rykhan – 2013-01-19T20:25:01.837

@MattiasBuelens that final check is needed. Think of cases when array is empty. – nawfal – 2014-06-16T04:53:24.090

1

Seems that upperBound(x, 0, 1, 3) fails, when x = [2, 3], no? See: https://ideone.com/qoYsfB

– Dominic Farolino – 2018-10-23T01:21:53.133

@DominicFarolino actually the name of variables used here confuses, its not actually the real indexes of array here, instead these are the lower bounds which starts from 1 and upper bound which is equalant to the number of elements in array. see https://ideone.com/Y9zoxd anyways i am updating the names here to avoid misundersating. thanks.

– rykhan – 2018-11-23T16:39:17.117

6

The Java implementation in this answer finds the first occurrence of a key. There's a comment about how this could be changed to find the last occurrence, but the suggestion results in an infinite loop. The idea seems sound, though.

EDIT: After some research, I found a neat solution on The Algo Blog. Since the first found match is not necessarily the needed one, you need to keep track of the "best" match so far. When you do get a match, you store it and continue with the binary search on the right of that match (low = mid + 1).

public static int binarySearch(int[] a, int key) {
    return binarySearch(a, 0, a.length, key);
}

private static int binarySearch(int[] a, int fromIndex, int toIndex,
        int key) {
    int low = fromIndex;
    int high = toIndex - 1;
    int found = -1;

    while (low <= high) {
        int mid = (low + high) >>> 1;
        int midVal = a[mid];

        if (midVal < key) {
            low = mid + 1;
        } else if (midVal > key) {
            high = mid - 1;
        } else {
            found = mid;
            // For last occurrence:
            low = mid + 1;
            // For first occurrence:
            // high = mid - 1;
        }
    }
    return found;
}

This change keeps the O(log n) complexity. Still, the actual performance depends on the application. When the length of the array is much larger than the amount of duplications of the sought key, a linear search for the last occurrence may be faster. When there are a lot of duplications though, this modified binary search is probably preferable.

Mattias Buelens

Posted 2013-01-19T14:35:25.493

Reputation: 15 203

That is a clever idea. – MrSmith42 – 2013-01-19T15:04:26.283

I found a working solution, updated my answer. – Mattias Buelens – 2013-01-19T15:17:57.080

2+1 for the idea, slight modification to the last line of code could return the so called insertion point : return (found != - 1) ? found : -(low + 1); – Shmil The Cat – 2014-04-20T11:43:44.440

2

Presumably you want an O(log N) solution? (Otherwise you could just do a linear search.)

In C++, one possibility (out of several), is to use std::upper_bound. This will give you an iterator to the first element greater than what you asked for, so then you need to check the previous element. This is indeed O(log N).

I don't know if Java offers this a standard library method. However, the pseudocode for upper_bound is given in the link above, and should be easy enough to reimplement.

Oliver Charlesworth

Posted 2013-01-19T14:35:25.493

Reputation: 224 537

yes, i tried to convert that upper_bound in java, but it always gives me the upper_bound_index+1, and have some issue with my conversion, any how i have 10K entries in array, and i want the upper and lower index of same key having multiple occurrences, with some better solution than linear, binary will work fine with me as it provides me atleast O(log2N) solution – rykhan – 2013-01-19T19:52:05.767

0

When you find the key. instead of returning it do sequential search on the array to get the last one. This will be O(N) solution.

blackmath

Posted 2013-01-19T14:35:25.493

Reputation: 171

that will not fit for me, anyways thanks – rykhan – 2013-01-19T19:54:57.670

what is your desired complexity? You can do some play inside. let me know your constraints – blackmath – 2013-01-20T15:09:15.783

@RizwanYasin This is a decent solution. The complexity is not O(n), but O(log n + k) – nawfal – 2014-06-16T04:55:24.953

-2

In the binary search you compare your key to elements of the array data[i]. To get the last matching index you should change your compare function so that it gives inequality even if key is equal to data[i] and also to data[i+1].

int upperBoundBinarySearch(int data[],int start, int end, int key) {
  while(start < end) {
    int middle = start + (end-start)/2;
    if (data[middle] == key && (middle == end || data[middle+1] != key))
      return middle;
    if (data[middle] > key)
      end = middle;
    else {
      if (start == middle)
        return start;
      start = middle;
    }
  }
  return start;
}

Emanuele Paolini

Posted 2013-01-19T14:35:25.493

Reputation: 7 229

This won't work, and isn't even possible with the majority of search frameworks. – Oliver Charlesworth – 2013-01-19T14:54:12.437

added working snippet... – Emanuele Paolini – 2013-01-20T14:49:46.867