binary search middle value calculation

18

5

The following is the pseudocode I got from a TopCoder tutorial about binary search:

binary_search(A, target):
   lo = 1, hi = size(A)
   while lo <= hi:
      mid = lo + (hi-lo)/2
      if A[mid] == target:
         return mid            
      else if A[mid] < target: 
         lo = mid+1
      else:
         hi = mid-1

   // target was not found

Why do we calculate the middle value as mid = lo + (hi - lo) / 2? What's wrong with (hi + lo) / 2?

I have a vague idea that it might be to prevent overflow, but I'm not sure. Could someone explain it, and say whether there are other reasons behind it?

binsearch

Posted 2010-12-26T15:29:16.933

Reputation: 91

Answers

18

Yes, (hi + lo) / 2 may overflow. This was an actual bug in the Java binary search implementation.

No, there are no other reasons for this.
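The overflow is easy to reproduce with 32-bit ints. The following is only a minimal sketch; the class and method names (MidpointOverflow, naiveMid, safeMid) are made up for illustration and are not taken from the JDK or the tutorial.

```java
final class MidpointOverflow {
    // Naive midpoint: lo + hi can exceed Integer.MAX_VALUE and wrap around.
    static int naiveMid(int lo, int hi) {
        return (lo + hi) / 2;
    }

    // For 0 <= lo <= hi, hi - lo is at most hi, so no step overflows.
    static int safeMid(int lo, int hi) {
        return lo + (hi - lo) / 2;
    }

    public static void main(String[] args) {
        int lo = 1_500_000_000;
        int hi = 2_000_000_000;
        System.out.println(naiveMid(lo, hi)); // -397483648 (the sum wrapped around)
        System.out.println(safeMid(lo, hi));  // 1750000000
    }
}
```

With array indices the problem only appears once lo + hi exceeds Integer.MAX_VALUE, which requires an array of more than about a billion elements.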

zeuxcg

Posted 2010-12-26T15:29:16.933

Reputation: 7 326

13

Although this question is 5 years old, there is a great article on googleblog that explains the problem and the solution in detail and is worth sharing.

It is worth mentioning that the current implementation of binary search in Java does not use the mid = lo + (hi - lo) / 2 calculation. Instead it uses a faster and clearer alternative based on the unsigned (zero-fill) right shift operator:

int mid = (low + high) >>> 1;
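Why the shift works: when low and high are both non-negative ints, their true sum always fits in 32 bits if read as an unsigned value, and >>> 1 halves it as such. A small sketch follows; the class and method names are hypothetical, and it assumes non-negative indices.

```java
final class UnsignedShiftMid {
    // Assumes 0 <= low <= high. The sum may wrap to a negative int,
    // but >>> shifts in a zero bit, which treats the sum as unsigned.
    static int shiftMid(int low, int high) {
        return (low + high) >>> 1;
    }

    public static void main(String[] args) {
        System.out.println(shiftMid(1_500_000_000, 2_000_000_000)); // 1750000000
        // Caveat: with negative bounds the trick breaks down.
        System.out.println(shiftMid(-4, -2)); // 2147483645, not -3
    }
}
```

So this form suits array indices, while a search over a range that includes negative numbers still needs lo + (hi - lo) / 2.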

vtor

Posted 2010-12-26T15:29:16.933

Reputation: 4 865

11

From later on in the same tutorial:

"You may also wonder as to why mid is calculated using mid = lo + (hi-lo)/2 instead of the usual mid = (lo+hi)/2. This is to avoid another potential rounding bug: in the first case, we want the division to always round down, towards the lower bound. But division truncates, so when lo+hi would be negative, it would start rounding towards the higher bound. Coding the calculation this way ensures that the number divided is always positive and hence always rounds as we want it to. Although the bug doesn't surface when the search space consists only of positive integers or real numbers, I've decided to code it this way throughout the article for consistency."
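The rounding difference described in the quote can be seen in Java, where integer division truncates toward zero. This is just an illustrative sketch with made-up names, not code from the tutorial.

```java
final class MidpointRounding {
    // Truncating division rounds toward zero, so a negative sum rounds up.
    static int naiveMid(int lo, int hi) {
        return (lo + hi) / 2;
    }

    // hi - lo is non-negative when lo <= hi, so the division rounds down.
    static int safeMid(int lo, int hi) {
        return lo + (hi - lo) / 2;
    }

    public static void main(String[] args) {
        System.out.println(naiveMid(-1, 0));  // 0: rounded toward the upper bound
        System.out.println(safeMid(-1, 0));   // -1: rounded toward the lower bound
        System.out.println(naiveMid(-5, -2)); // -3
        System.out.println(safeMid(-5, -2));  // -4
    }
}
```

In a search variant that narrows with hi = mid and lo = mid + 1, a midpoint equal to hi (as naiveMid(-1, 0) gives) would leave the range unchanged, so the loop could fail to terminate.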

Abhijeet Kashnia

Posted 2010-12-26T15:29:16.933

Reputation: 4 825

6

It is indeed possible for (hi+lo) to overflow an integer. In the improved version, it may seem pointless to subtract lo from hi and then add it back, but there is a reason: for 0 <= lo <= hi the subtraction cannot overflow, and hi-lo has the same parity as hi+lo, so (hi-lo)/2 leaves the same remainder as (hi+lo)/2. lo can then be safely added after the division to reach the same result.
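The parity argument rests on the identity hi + lo = (hi - lo) + 2*lo: halving both sides shows the two formulas agree whenever nothing overflows. A brute-force check over a small non-negative range, written as a hypothetical helper for illustration only:

```java
final class MidpointEquivalence {
    // Compares both formulas for every pair 0 <= lo <= hi <= limit.
    // limit should be small enough that lo + hi cannot overflow.
    static boolean formulasAgree(int limit) {
        for (int lo = 0; lo <= limit; lo++) {
            for (int hi = lo; hi <= limit; hi++) {
                if ((lo + hi) / 2 != lo + (hi - lo) / 2) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(formulasAgree(500)); // true
    }
}
```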

Jeremy Elbourn

Posted 2010-12-26T15:29:16.933

Reputation: 2 053