Binary Search

February 28, 2023 • 20 min read

The main idea behind binary search is the repeated halving of the search space. We'll elaborate on this below.

A Small Classical Example

Let's start with the classical example of binary search. Suppose you have an array $[1, 7, 9, 12, 19]$, and you're asked to check if $7$ is a member of this array or not (and also return its 0-indexed position in this array). A naive approach would be to iterate over all elements, and check if any element is $7$ or not. This would take 5 iterations.

Now note that this array is sorted. So if I check some element at position $p$ in this array, we can have one of three possible outcomes:

  • The element is equal to $7$: this means that we are done, and we can terminate our search!

  • The element is less than $7$: since the array is sorted, all elements at positions $< p$ are less than $7$ as well. So it is useless to look at any element at a position $\le p$ now.

  • The element is more than $7$: in a similar manner to the previous case, it is useless to look at any element at a position $\ge p$ now.

So we could do something like this: initially our search space is the range of indices $[0, 4]$. Let's try looking at the middle element of our search space. If we do that, we end up in one of two scenarios:

  • We terminate if the element at that position is $7$.

  • We reduce the search space to at most half of its current size each time.

For a concrete example, let's look at the middle element of the search space, which is $9$. Since this is more than $7$, it is useless to look at indices in the range $[2, 4]$. So our search space reduces to $[0, 1]$. This range has two midpoints; let's just use the left one for the sake of uniformity. The element we are now looking at is $1$, which is less than $7$, so it is useless to look at indices in the range $[0, 0]$. So our search space becomes $[1, 1]$. The middle position is just $1$, and since the element there is $7$, we can return the position $1$ as the answer.

What if there were a $4$ instead of a $7$ in the original array? In the last step of the above dry run of the algorithm, we would note that it is useless to look at indices in the range $[0, 1]$, so the search space becomes empty! An empty search space means that the element is not present, so we should return some special value indicating that the search resulted in a failure. We'll use the size of the array as this return value (we will see why we made this choice later on).

An implementation might look like the following:


```cpp
int find_position(const vector<int>& a, int x) {
    int l = 0;
    int r = (int)a.size() - 1;       // [l, r] is our search space
    while (l <= r) {                 // search space is non-empty
        int m = l + (r - l) / 2;     // "middle" position in the range
        if (a[m] == x) return m;     // found!
        else if (a[m] < x) {
            l = m + 1;               // remove all indices <= m from the search space
        } else {
            r = m - 1;               // remove all indices >= m from the search space
        }
    }
    return (int)a.size();            // failure
}
```

Is this more efficient than a linear search? Note that at each step, we reduce the search space by at least half, so within $1 + \log_2 n$ iterations (where $n$ is the size of $a$), the search space size drops below $1$; since the size is a non-negative integer, it becomes $0$, and the algorithm terminates. For instance, for $n = 5$ as above, at most $4$ iterations are needed, compared to $5$ for a linear scan.

A More General Idea

More often than not, binary search is not used in such a simple use-case in competitive programming (or in real life).

Let's have a closer look at what we were doing in the above algorithm. We were using some kind of ordering (the ordering of integers, in our example) to discard a large part of the search space (in fact, we discarded about half the search space each time).

How do we generalize this further? More importantly, what will be the statement of the generalized problem?

Firstly, we will do something that will make our original problem easier to generalize. Rather than checking if an element is present in the array, we can find the position of the first element that is greater than or equal to the element (if no such element exists, we'll return the size of the array).

Clearly, this is even stronger than our original problem. If the answer to this problem is the index $i$, we can check if this is the right answer by checking if $i < n$ and $a[i] = x$. If this condition isn't true, we don't have any occurrence of $x$ in $a$.

Let's try to do away with the ordering, and see how far we go.

Let's mentally (not in code) construct an array $b$, where the value of $b[i]$ is true if and only if $a[i] < x$.

What does $b$ look like? Clearly, some (possibly empty) prefix of it is filled with "true", and the remaining (possibly empty) suffix is filled with "false".

Then our problem reduces to finding the position of the first false (or $n$ if there is none) in this kind of an array $b$. For now, forget all about $a$ and $x$, and focus only on $b$.
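
For example, for $a = [1, 7, 9, 12, 19]$ and $x = 7$, we get $b = [\text{true}, \text{false}, \text{false}, \text{false}, \text{false}]$: the first false is at position $1$, which is exactly where $7$ sits in $a$.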

It turns out that for any such array $b$, we can easily find the first false using the same idea of discarding parts of the search space.

Let's start with some notation. $[l_0, r_0]$ will be the interval we need to search (for our problem it is $[0, n - 1]$). $l$ and $r$ will be indices we maintain at each step such that the range of indices $[l_0, l]$ in $b$ consists only of "true" values, and the range of indices $[r, r_0]$ in $b$ consists only of "false" values.

Initially, we have zero information about any element of $b$, so it is better to force these ranges to be empty. We can do so by setting $l = l_0 - 1$ and $r = r_0 + 1$ initially. We will try to increase $l$ and decrease $r$ so that these two ranges together cover the whole search space. So, in the end, $l$ will correspond to the location of the last true, and $r$ will correspond to the location of the first false, and we can simply return $r$!

Let's suppose that at some point, $r - l > 1$. This means that there is at least one element between the indices $l$ and $r$ that we haven't put into either of these ranges yet. Let's take an index roughly in the middle of the open range $(l, r)$, say $m = \lfloor (l + r) / 2 \rfloor$.

  • If $b[m]$ is true, then we know that everything to the left of $b[m]$ is true as well, so we can extend the range $[l_0, l]$ to $[l_0, m]$ (which is equivalent to replacing $l$ by $m$).

  • Otherwise, it is false, so everything to the right of $b[m]$ is false as well. So we can extend the range $[r, r_0]$ to $[m, r_0]$.

Each time, we reduce the size of the unexplored range from $r - l - 1$ to either $r - m - 1$ or $m - l - 1$, which corresponds to a reduction by at least half. So the number of iterations is at most $\log_2(r_0 - l_0 + 1) + 1$.

An implementation of this would look something like this:


```cpp
int find_first_false(const vector<bool>& b) {
    int l = -1;                      // position of the last known true
    int r = (int)b.size();           // position of the first known false
    while (r - l > 1) {              // something is still unexplored
        int m = l + (r - l) / 2;
        if (b[m]) {
            l = m;
        } else {
            r = m;
        }
    }
    return r;
}
```

Note that this terminates by our previous observation.

But, we said earlier that we won't really construct $b$ in our code, right? How do we use this same algorithm for solving our own problem?

Note that, by what we said earlier, $b[i]$ is just defined as the truth value of $a[i] < x$, so computing it on the fly is no big deal. So, if we replace $b[m]$ with $a[m] < x$, that would solve our problem.

That was quite a handful, so let's see a brief summary of what we did:

  • Rather than explicitly reasoning using the order $<$ and the value $x$, we constructed an array $b$ of a very specific form (some prefix of it being true and the rest being false), and found the location of the first false in $b$.

This very clearly points to our desired generalization:

The Generalization

Suppose we are given the following:

  • $[l, r]$: a range of integers (in our example, $[0, n - 1]$)

  • $f$: a function from integers to booleans, which satisfies the following property: there exists some integer $t$ such that for all $l \le x < t$, $f(x)$ is true, and for all $t \le x \le r$, $f(x)$ is false.

Then, if the time taken to compute $f$ is upper bounded by $T(l, r)$, we can find the value of $t$ (i.e., the first false index) in $O(T(l, r) \log_2 (r - l + 1)) + O(1)$ time.

Such an $f$ is called a predicate; in the problem we discussed above, the predicate was the truth value of $a[i] < x$.

An implementation of this function will be something like the following (ignoring overflow errors):


```cpp
template <class Integer, class F>
Integer find_first_false(Integer l, Integer r, F&& f) {
    --l;                             // l is now the position of a virtual true
    ++r;                             // r is now the position of a virtual false
    while (r - l > 1) {
        Integer m = l + (r - l) / 2; // prefer std::midpoint in C++20
        if (f(m)) {
            l = m;
        } else {
            r = m;
        }
    }
    return r;
}
```

Note that this implementation also has the nice property that, at the end, $l$ is the position of the last true in the corresponding array $b$, so you can define a similar function that returns $l$ instead.
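
For instance, a minimal sketch of such a companion function (the name find_last_true is mine, not from the text above) could simply reuse find_first_false:

```cpp
// Hypothetical helper: returns the position of the last true on [l, r]
// (or l - 1 if the predicate is false everywhere), using the fact that
// the last true sits immediately before the first false.
template <class Integer, class F>
Integer find_last_true(Integer l, Integer r, F&& f) {
    return find_first_false(l, r, f) - 1;
}
```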

To use this function to implement our original binary search, we can do something like the following:


```cpp
int find_position(const vector<int>& a, int x) {
    auto f = [&](int i) {
        return a[i] < x;             // the predicate from before
    };
    int n = (int)a.size();
    int pos = find_first_false(0, n - 1, f);
    if (pos == n || a[pos] != x) return n;
    return pos;
}
```

Note that this abstraction also gives us the following result: we don't really need $a$ to be sorted at all. The only things we need are that everything less than $x$ in $a$ is in a prefix, that everything not less than $x$ is in the remaining suffix, and that, if $x$ is in the array at all, the suffix begins with $x$. This definition also handles possible duplicates of $x$ in the array.
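
For example, find_position works even on the unsorted array $a = [3, 1, 7, 42, 10]$ with $x = 7$: the prefix $[3, 1]$ contains exactly the elements less than $7$, and the suffix $[7, 42, 10]$ begins with $7$, so the function correctly returns position $2$.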

As we'll see later on, this kind of an abstraction allows us to solve a whole variety of problems.

Exercise: What would you do if, in $b$, the first few elements were false, and the rest were true?

Binary Searching on the Answer

Sometimes, it is much easier to deal with bounds on the answer rather than the exact answer itself.

In other words, sometimes it is much easier to construct a function $f$ that returns true iff the input is $\ge$ the answer to the problem (or $\le$, depending on the convention), by running some algorithm that produces this boolean not by computing the answer, but by considering some properties of the problem at hand and the input.

To make this more concrete, let's consider the following problem.

In short, you're given an $n \times m$ multiplication table, and you want to find the $k^\mathrm{th}$ smallest entry in this table (i.e., if you were to sort all the entries, find the entry at position $k$).

It is not clear how to directly find this value. But given a "guess" $x$, we can check whether it is at most the answer or not:

Let's find the number of entries smaller than $x$ in each row, and sum these counts up. This can be done with a single division per row: row $i$ contains the multiples $i, 2i, \ldots, mi$, so it has $\min(m, \lfloor (x - 1) / i \rfloor)$ entries smaller than $x$.

If the total number of entries less than $x$ is $< k$, we will return true; otherwise, we will return false. This predicate works since the number of entries smaller than $x$ is a non-decreasing function of $x$ (more commonly, we compare $f(x)$ and $f(x + 1)$ and argue that if $f(x)$ is false, then $f(x + 1)$ is also false), and hence the last element that makes the function return true is in fact the $k^\mathrm{th}$ smallest entry.

This can be found using binary search as we have discussed above; a sketch follows.
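
Here is a minimal sketch under the assumptions above (a 1-indexed table whose entry at $(i, j)$ is $i \cdot j$). Note that it counts entries $\le x$ rather than $< x$, so that the answer becomes the first false of the predicate and we can reuse find_first_false from earlier:

```cpp
// The k-th smallest entry is the first x for which
// "fewer than k entries are <= x" becomes false.
long long kth_smallest(long long n, long long m, long long k) {
    auto count_le = [&](long long x) {   // number of table entries <= x
        long long cnt = 0;
        for (long long i = 1; i <= n; ++i)
            cnt += min(m, x / i);        // row i contains i, 2i, ..., mi
        return cnt;
    };
    auto f = [&](long long x) { return count_le(x) < k; }; // true on a prefix
    return find_first_false(1LL, n * m, f);
}
```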

Other Ways of Implementing Binary Search

We'll discuss two other common ways of implementing the usual binary search, and why they seem intuitive to the people who use them.

We'll refer to the previous implementation as the $(l, r)$ way, since its invariant was equivalent to saying that the remaining search space is always the open interval $(l, r)$.

This section is mainly meant to help you read others' binary search implementations, and to let you choose the style you like best. I prefer the implementation used above, so I'll explain the other two in terms of it, with some intuition as to why people choose to implement things in those ways.

The $[l, r]$ Way

In this type of implementation, we maintain $l + 1$ and $r - 1$ rather than $l$ and $r$. The reasoning behind it is usually something like the following (independent of the reasoning above):

$[l, r]$ is the current (unexplored) search space. Let's maintain an extra integer $ans$, which stores the "best" answer found so far, i.e., the leftmost false found so far.

This interval is non-empty iff $r \ge l$. The midpoint is the same as before, i.e., $m = \lfloor (l + r) / 2 \rfloor$. If $f(m)$ evaluates to false, then the best answer found so far becomes $m$, and we reduce the search space from $[l, r]$ to $[l, m - 1]$. Otherwise, we reduce it to $[m + 1, r]$, and do not update the answer.

Note that here we need to maintain a separate variable $ans$ and initialize it appropriately, depending on whether you want the first false (as done above) or the last true (which can be done by moving the update of $ans$ to the other if-branch); this is also why I haven't used this style in a very long time. However, people who prefer closed intervals and this explanation seem to gravitate towards this implementation.

The invariant here is that if $l', r'$ denote the $l, r$ of the old implementation, and $l, r$ denote the $l, r$ of this implementation, then $l' = l - 1$ and $ans = r' = r + 1$. In the case where you need to find the last true, the invariant becomes $ans = l' = l - 1$ and $r' = r + 1$.

An implementation is as follows:


```cpp
template <class Integer, class F>
Integer find_first_false(Integer l, Integer r, F&& f) {
    Integer ans = r + 1;             // nothing found yet
    while (l <= r) {
        Integer m = l + (r - l) / 2; // prefer std::midpoint in C++20
        if (f(m)) {
            l = m + 1;
        } else {
            ans = m;                 // m is the leftmost false found so far
            r = m - 1;
        }
    }
    return ans;
}
```

The $[l, r)$ Way

In this type of implementation, we maintain $l + 1$ and $r$ instead of $l$ and $r$. The interpretation is that the search space is $[l, r)$, and people who like half-open intervals tend to prefer this implementation. The rationale is that when the search space becomes empty, it corresponds to the range $[ans, ans)$ (if you're trying to find the first false, of course). The intervals here don't always correspond exactly to those in the other implementations, since the value of $m$ used is usually slightly different (usually $\lfloor (l + r) / 2 \rfloor$ where $[l, r)$ is the current search space; when the range has two midpoints, this is the one to the right, not to the left).

It's much more illuminating to think of it this way: suppose you have a range $[l, r)$, and you need to insert a false just after the last true in the range. What will be the position of the new false?

Another way of looking at it is: everything in the range $[l, ans)$ corresponds to true, and everything in the range $[ans, r)$ corresponds to false. So $ans$ is a sort of "partition point" for the array. We will see later on how this exact interpretation is used in an implementation of this function in C++'s STL.

Let's check the midpoint $m$ of the range $[l, r)$. If $f(m)$ is true, we will never put the new false at a location $\le m$, so $l$ needs to be updated to $m + 1$. Otherwise, we have to put it at a position $\le m$, so $r$ needs to be updated to $m$.


```cpp
template <class Integer, class F>
Integer find_first_false(Integer l, Integer r, F&& f) {
    ++r;                             // search space is now the half-open [l, r)
    while (l < r) {                  // search space is non-empty
        Integer m = l + (r - l) / 2; // prefer std::midpoint in C++20
        if (f(m)) {
            l = m + 1;
        } else {
            r = m;
        }
    }
    return r;
}
```

Binary Lifting for Binary Search

This and the next sections will be much more terse than the previous ones, since we will be doing more of an overview of methods than explaining something from scratch.

Consider the following implementation:


```cpp
template <class Integer, class F>
Integer find_first_false(Integer l, Integer r, F&& f) {
    if (l > r) return r + 1;                // empty search space
    ++r;
    Integer w = Integer(1) << __lg(r - l);  // largest power of 2 that fits (__lg is GCC-specific)
    --l;                                    // l is the position of a virtual true
    if (f(l + w)) l = r - w;                // align so the remaining range size is a power of 2
    for (w >>= 1; w >= Integer(1); w >>= 1)
        if (f(l + w)) l += w;               // accept the jump if we stay within the trues
    return l + 1;
}
```

Here, we try to ensure that the interval sizes are always powers of 2. Why we do so will be explained in the section on optimizing binary search.

After we ensure that the interval sizes are always powers of 2, we can simply reconstruct the binary representation of $ans - 1 - l$, where $ans$ is what we need to return. For that, we go from the highest bit to the lowest bit, and a greedy approach works here for the same reason that binary representations are unique: adding a jump that overshoots makes the predicate false.

Language Specific Ways for Binary Searching

Some of the most used languages in competitive programming fortunately have some inbuilt functions (albeit limited) to do binary search for you.

C++

  • binary_search: This function returns a bool that tells you if there is an element equal to the queried element between two iterators (with an optional comparator).

  • lower_bound, upper_bound: If the range occupied by elements comparing equal to $x$ in a given iterator range is $[it_l, it_r)$, then lower_bound and upper_bound return $it_l$ and $it_r$ respectively. In other words, lower_bound(it1, it2, x, comp) returns the iterator to the first element in the range of iterators $[it1, it2)$ which is $\ge x$ according to comp (which is optional and defaults to the usual comparison), and upper_bound does the same for $> x$ instead.

  • equal_range: This returns both lower_bound and upper_bound.

  • partition_point: This works exactly like the binary search function we wrote in the $[l, r)$ way. In C++20, combined with ranges::views::iota, we can use it to binary search on the answer as well (see the sketch after this list).
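
As a small illustration, here is a hedged sketch (the toy predicate and values are made up for this example) of lower_bound on a sorted vector, and of partition_point over a views::iota range to binary search on the answer in C++20:

```cpp
#include <algorithm>
#include <cassert>
#include <ranges>
#include <vector>

int main() {
    std::vector<int> a{1, 7, 9, 12, 19};
    // lower_bound: iterator to the first element >= 7, here at index 1
    assert(std::lower_bound(a.begin(), a.end(), 7) - a.begin() == 1);

    // binary search on the answer: first x in [0, 100] with x * x >= 50
    auto f = [](long long x) { return x * x < 50; }; // true on a prefix
    auto xs = std::views::iota(0LL, 101LL);
    assert(*std::ranges::partition_point(xs, f) == 8); // 7*7 = 49, 8*8 = 64
    return 0;
}
```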

Python

  • The bisect module has useful functions like bisect, bisect_left, and bisect_right (bisect is an alias for bisect_right) that do similar things to lower_bound and upper_bound.

Java

  • Collections.binarySearch and Arrays.binarySearch find the index of some occurrence of an element (not necessarily the first or the last equal one) using binary search.

Optimizing Binary Search

In our first implementation, we returned early when we found the element. Stopping the binary search early like this can be a constant factor optimization.

The binary lifting implemented for binary search above is also a constant factor optimization on some architectures if unrolled manually (which is possible due to the simple nature of the increments and the possibility of hardcoding the constants), according to John Bentley's Programming Pearls. It is an improvement in some cases where the locations you binary search over are few in number and repeated across successive binary searches, leading to better cache usage. An extreme version of this optimization is to implement multiple unrolled functions (each for a different power of 2), store the function pointers in an array, and use computed jumps (for instance, by computing the power of 2 needed) to get to the correct function at runtime; this still leads to some speedup on certain architectures.
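
For concreteness, here is a hedged sketch (my own illustration, not code from Programming Pearls) of one such unrolled function for the fixed jump sequence $8, 4, 2, 1$, with the same sentinel semantics as the binary lifting version above:

```cpp
// Hypothetical unrolled search over the 15 candidates l + 1, ..., l + 15,
// where f(l) is assumed (virtually) true; returns the position of the
// first false, or l + 16 if the predicate is true on all candidates.
template <class F>
int find_first_false_16(int l, F&& f) {
    if (f(l + 8)) l += 8;
    if (f(l + 4)) l += 4;
    if (f(l + 2)) l += 2;
    if (f(l + 1)) l += 1;
    return l + 1;
}
```

An array of such functions, one per power of 2, is what the computed-jump version described above would dispatch into.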

For instance, for certain types of queries (where either $l$ or $r$ stays more or less the same across queries), the speedup is pretty significant. However, for other types of queries, the simpler version is more efficient. To show the kind of speedups you can get, I benchmarked queries where the left bound is always the same; the results are as follows:

Results:


```
-----------------------------
Simple binary search
744175
Time: 2184 ms
-----------------------------
Binary lifting
744175
Time: 1504 ms
-----------------------------
Binary lifting with unrolling
744175
Time: 1407 ms
-----------------------------
```

https://algorithmica.org/en/eytzinger explains quite a nice method of exploiting the cache and speeding up binary search in a certain setup by a pretty impressive factor.

https://codeforces.com/blog/entry/76182 explains a variation of binary search that divides the ranges unevenly, which can lead to a change at the complexity level as well.

Other Resources

As promised, here's a collection of resources on topics related to binary search that I feel are pretty high-quality.

A great resource is the Codeforces EDU tutorial (and problems) on binary search.

A great video that also explains binary search is linked in the following blog: https://codeforces.com/blog/entry/67509.

There is also an extension to vanilla binary search, called the parallel binary search, and https://codeforces.com/blog/entry/89694 links to a great video tutorial for this extension.

Binary lifting is a pretty general idea. A tutorial on binary lifting on Fenwick trees can be found at https://codeforces.com/blog/entry/61364, and a great one on binary lifting on trees at https://usaco.guide/plat/binary-jump?lang=cpp (which also has links to other resources).

Binary search on segment trees is a quite useful technique too, and a working implementation (with some intuition on how it works) for a recursive segment tree can be found at https://codeforces.com/blog/entry/83883. An implementation for the iterative (lazy) segment tree can be found at https://github.com/atcoder/ac-library/blob/master/atcoder/lazysegtree.hpp.

A rough idea of how it works (based off a conversation where I was explaining this code to someone) is as follows: walk down from the root while maintaining the aggregate of everything to the left of the current node; at each node, if combining this accumulator with the left child still keeps the predicate true, absorb the left child into the accumulator and descend into the right child, and otherwise descend into the left child. The leaf you end up at is the partition point. Some parts of this might not make sense at first, so please let me know if there's something unclear.
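
Here is a minimal sketch of that idea on a recursive sum segment tree (my own toy code, not the implementations linked above), assuming the array is non-empty and the predicate $g$ on prefix sums is true for a prefix of positions and false afterwards (e.g., non-negative values with $g(s) = (s \le \text{limit})$):

```cpp
#include <vector>
using namespace std;

struct SegTree {
    int n;
    vector<long long> t;
    SegTree(const vector<long long>& a) : n((int)a.size()), t(4 * a.size()) {
        build(1, 0, n - 1, a);
    }
    void build(int node, int l, int r, const vector<long long>& a) {
        if (l == r) { t[node] = a[l]; return; }
        int m = (l + r) / 2;
        build(2 * node, l, m, a);
        build(2 * node + 1, m + 1, r, a);
        t[node] = t[2 * node] + t[2 * node + 1];
    }
    // Smallest index i such that g(a[0] + ... + a[i]) is false, or n if none.
    // acc carries the sum of everything to the left of the current node.
    template <class G>
    int first_false(int node, int l, int r, long long& acc, G&& g) {
        if (g(acc + t[node])) { acc += t[node]; return n; } // whole node stays true
        if (l == r) return l;                               // g first fails here
        int m = (l + r) / 2;
        int res = first_false(2 * node, l, m, acc, g);
        if (res != n) return res;
        return first_false(2 * node + 1, m + 1, r, acc, g);
    }
    template <class G>
    int first_false(G&& g) {
        long long acc = 0;
        return first_false(1, 0, n - 1, acc, g);
    }
};
```

Since a subtree is only entered when the predicate is known to fail somewhere inside it, this walks a single root-to-leaf path and runs in $O(\log n)$ time.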

Please let me know if you find any bugs or typos in this blog!

