Binary Searches
Binary Search
First: What is a 'binary search'?
The task: search through a sorted list to find the specified value.
A linear search will start at the beginning and iterate through the list until it finds it. Depending on the size of the list and where in the list the item is, this isn't most efficient use of time.
A binary search can save time. The binary search seeks the midpoint of the list - (last + first)//2. And then checks if that is the item.
- If it is, great. The search is done.
- If it is greater than the search item, then we search through the upper half of the list.
- If it is less than the search item, then we search through the lower half of the list.
In other words, it searches through half the list, and then again through half of that list, and again... until it finds what it is looking for. Or, there is nothing left to search through and the item is not found. In other words, the list it searches through gets smaller and smaller.
def binary_search(alist, search_for):
first = 0
last = len(alist) -1
found = False
while found == False and first <= last:
midpoint = (last + first)//2
if alist[midpoint] == search_for:
found = True
return 'found'
elif search_for > alist[midpoint]:
first = midpoint +1
else:
last = midpoint -1
return 'not found'
Important note:
midpoint = (last + first)//2
. This keeps track of the correct index from the original alist
, not the segment of the list that it is now searching through. In this way, when we have indeed found what we were searching for, we know its index in the original alist
.
(the //2
is floor division - where the answer is rounded down. Read more about floor division at PEP 238)
So say the list is as below:
alist = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
and we are searching for 7
print binary_search(alist, 7)
This is what is happening as we run through the binary search
round # | first | last | search list * | midpoint * | alist[midpoint] | search_for |
---|---|---|---|---|---|---|
* The search list is referring to a segment of the original alist
- beginning from the index of first
and ending at the index of last
. The midpoint
is referring to the index of the original list, not the segment it is searching.
When we first begin the search,
- The segment of
alist
it will search through is:[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
first = 0
last = len(alist) -1
-->last = 10
midpoint = (last + first)//2
-->(10 + 0)//2
-->midpoint = 5
alist[midpoint] = 11
Since the 7
that we are searching for is less than the 11
at the midpoint, the next run of the search will be on the lower half of the list. So we have to change the last
to reflect that. The last
will be 1 less than the midpoint (since we already know that the midpoint is not what we are searching for).
In the next round of the search,
- The segment of
alist
it will search through is:[1, 3, 5, 7]
first = 0
. We are still searching from the beginning of the list.last = midpoint -1
-->last = 4
midpoint = (last + first)//2
-->(4 + 0)//2
-->midpoint = 2
alist[midpoint] = 5
Since the 7
that we are searching for is greater than than the 5
at the midpoint, the next run of the search will be on the upper half of the previous list. So we have to change the first
to reflect that. The first
will be 1 more than the midpoint (since we already know that the midpoint is not what we are searching for).
In the next round of the search,
- The segment of
alist
it will search through is:[7]
- first = midpoint +1 -->
first = 3
. last = 4
. * It is still4
from previous round.midpoint = (last + first)//2
-->(4 + 3)//2
-->midpoint = 3
alist[midpoint] = 7
And we have found the 7
when the midpoint = 3
. In other words, when the midpoint was representing the index at position 3
. So if we asked for alist[3]
, we get our 7
that we were searching for.