Binary Search: How to Reduce a Huge Problem to Half in Just a Few Steps

The Real Problem: Finding Something Fast

Have you ever stopped to think about how your database finds a specific ID among millions of records in milliseconds? Or how a paper dictionary works? You don’t start on page “A” looking for the word “Zebra”. You open the book in the middle, realize you’ve gone too far, and adjust your search.

The Production Scenario: Imagine you’re maintaining a microservice that validates access tokens in a blacklist. If that list has 1 million entries and you use a linear search (checking one by one), the system can take seconds to respond under load. With Binary Search, the same process takes the blink of an eye.

What Is Binary Search?

Binary Search is a “Divide and Conquer” algorithm. The idea is simple but powerful: in a sorted list, you always test the middle element.

Why Is Simple Search Slow?

In linear search (checking item by item), if you have 1 million elements, in the worst case, you’ll make 1 million checks. If the target is the last item in a list of 4 billion, the processor will execute 4 billion comparisons.

The Efficiency of Binary Search

Binary Search reduces the search space by half with each iteration. This means you need far fewer attempts:

100 items: Maximum of 7 attempts
1,000,000 items: Maximum of 20 attempts
4,000,000,000 items: Only 32 attempts!

It’s like guessing a number between 1 and 100: instead of trying 1, 2, 3, 4… you always guess the middle (50), and if it’s higher, you try 75, and so on.

Where to Use Binary Search?

Applying this algorithm requires an important criterion, but offers massive performance gains.

Real-World Use Cases

Database Systems: Essential for the operation of indexes in databases, allowing records to be found quickly.

Autocomplete: Fast search in sorted word dictionaries to suggest words as you type.

Git Bisect: Finding which commit introduced a bug in a history of hundreds of versions.

Validation APIs: Checking if a token is in a blacklist of millions of entries.

When NOT to Use

Unsorted Lists: Never use Binary Search on unordered data. The cost of sorting the list just to do a single search is greater than doing a linear search.

Linked Lists: Binary Search requires fast access to any position (like in an Array). In linked lists, accessing the “middle” is slow, invalidating the algorithm’s gain.

Practical Implementation in Kotlin

Let’s implement a generic binary search that works with any type of sorted data:

/**
 * Generic Binary Search Implementation.
 * Requirement: The list must be sorted according to the natural order of T.
 */
fun <T : Comparable<T>> binarySearch(list: List<T>, target: T): Int? {
    var low = 0
    var high = list.size - 1

    while (low <= high) {
        // Safe way to calculate mid to avoid overflow in large lists
        val mid = low + (high - low) / 2
        val guess = list[mid]

        when {
            guess == target -> return mid // Element found
            guess > target -> high = mid - 1 // Target is in lower half
            else -> low = mid + 1 // Target is in upper half
        }
    }

    return null // Element doesn't exist in the list
}

fun main() {
    // Sorted list of numbers
    val sortedNumbers = (1..100).toList()
    val target = 75
    
    val result = binarySearch(sortedNumbers, target)
    
    when (result) {
        is Int -> println("Success: The number $target is at index $result.")
        null -> println("Error: The number was not found.")
    }
    
    // Example with strings (sorted dictionary)
    val words = listOf("banana", "orange", "apple", "grape", "pineapple").sorted()
    val targetWord = "apple"
    
    val wordIndex = binarySearch(words, targetWord)
    println("The word '$targetWord' is at index: $wordIndex")
}

How the Code Works

We define the limits: We start with low = 0 (start of list) and high = size - 1 (end of list).
We calculate the middle: We always take the middle element of the area we’re searching.
We compare:
- If the middle element equals the target, we found it!
- If it’s greater, the target is in the lower half (we reduce high)
- If it’s smaller, the target is in the upper half (we increase low)
We repeat: We continue until we find it or until low becomes greater than high (doesn’t exist).

Important detail: We use low + (high - low) / 2 instead of (low + high) / 2 to avoid overflow problems in very large lists.

Comparison: Linear Search vs. Binary Search

Characteristic	Linear Search	Binary Search
Speed	Slow (O(n))	Fast (O(log n))
Requirement	Works with any list	List must be sorted
1 million items	Up to 1 million comparisons	Only 20 comparisons
4 billion items	Up to 4 billion comparisons	Only 32 comparisons

Why This Matters

Performance: The difference is huge in large-scale systems. For 1 million items, binary search is 50,000 times faster!

Maintenance Cost: The only “cost” is keeping the list sorted. Insertions and deletions in sorted lists are slower (O(n)), but if you do many searches, it’s very worth it.

Scalability: It’s the “gold standard” algorithm for search scalability. Response time increases almost imperceptibly while data grows exponentially.

Important Lessons

The list needs to be sorted: Binary Search only works if your data is sorted. If not, sort first or use linear search.
Radical efficiency: You can find any item in a list of billions of elements in less than 35 steps.
Details matter: Small implementation details, like safely calculating the middle index, separate junior code from resilient senior code.
Use libraries when possible: In Kotlin/JVM, prefer using Collections.binarySearch() for production code, as it’s already highly optimized.

Conclusion

Binary Search is one of the most powerful tools a developer can have in their arsenal. If you have data that changes little and need extreme read speed, Binary Search combined with prior sorting is your best option.

Remember: the simplest solution isn’t always the best. Sometimes, investing a little time to sort your data can result in massive performance gains!

This article is part of the “Understanding Algorithms” series, based on the book by Aditya Y. Bhargava.

Série: Understanding Algorithms