Calculating mid in binary search

24

13

I was reading an algorithms book which had the following algorithm for binary search:

public class BinSearch {
  static int search ( int [ ] A, int K ) {
    int l = 0 ;
    int u = A. length −1;
    int m;
    while (l <= u ) {
      m = (l+u) /2;
      if (A[m] < K) {
        l = m + 1 ;
      } else if (A[m] == K) {
        return m;
        } else {
          u = m−1;
        }
       }
       return −1;
      }
 }

The author says "The error is in the assignment m = (l+u)/2; it can lead to overflow and should be replaced by m = l + (u-l)/2."

I can't see how that would cause an overflow. When I run the algorithm in my mind for a few different inputs, I don't see the mid's value going out of the array index.

So, in which cases would the overflow occur?

Bharat

Posted 2011-07-18T15:22:17.430

Reputation: 1 360

2adding, subtracting, multiplying 2 numbers all produce more bits, so obviously there's a chance of overflow – phuclv – 2016-01-28T05:30:56.443

Possible duplicate of binary search middle value calculation

– Jingguo Yao – 2017-01-03T06:12:47.383

Answers

34

This post covers this famous bug in a lot of detail. As others have said it's an overflow issue. The fix recommended on the link is as follows:

int mid = low + ((high - low) / 2);

// Alternatively
int mid = (low + high) >>> 1;

It is also probably worth mentioning that in case negative indices are allowed, or perhaps it's not even an array that's being searched (for example, searching for a value in some integer range satisfying some condition), the code above may not be correct as well. In this case, something as ugly as

(low < 0 && high > 0) ? (low + high) / 2 : low + (high - low) / 2

may be necessary. One good example is searching for the median in an unsorted array without modifying it or using additional space by simply performing a binary search on the whole Integer.MIN_VALUEInteger.MAX_VALUE range.

Jeff Foster

Posted 2011-07-18T15:22:17.430

Reputation: 34 129

2+1 for the interesting link. – Randy Howard – 2014-08-30T14:07:54.297

The link you provided has a clear explanation of the issue. Thanks! – Bharat – 2011-07-18T16:20:47.713

3

The problem is that (l+u) is evaluated first, and could overflow int, so (l+u)/2 would return the wrong value.

murgatroid99

Posted 2011-07-18T15:22:17.430

Reputation: 9 225

2

Jeff suggested really good post to read about this bug, here is summary if you want quick overview.

In Programming Pearls Bentley says that the analogous line "sets m to the average of l and u, truncated down to the nearest integer." On the face of it, this assertion might appear correct, but it fails for large values of the int variables low and high. Specifically, it fails if the sum of low and high is greater than the maximum positive int value (2^31 - 1). The sum overflows to a negative value, and the value stays negative when divided by two. In C this causes an array index out of bounds with unpredictable results. In Java, it throws ArrayIndexOutOfBoundsException.

Vipin

Posted 2011-07-18T15:22:17.430

Reputation: 2 182

2

The potential overflow is in the l+u addition itself.

This was actually a bug in early versions of binary search in the JDK.

Nemo

Posted 2011-07-18T15:22:17.430

Reputation: 55 666

the link is broken – jdhao – 2017-08-27T12:11:16.677

@jdhao - It was working at the time. Accepted answer has a link to a full account by the author of the buggy code. I have updated my link anyway. – Nemo – 2017-08-27T22:36:07.163