Big O Notation

Algorithms and Data Structures From Zero to Hero

Kemil Beltre
9 min readSep 4, 2022
Sea turtle black and white illustration

Big O notation is a special notation that tells you how fast an algorithm is. Okay, but, Who cares? The point is that you’ll be using other people’s algorithms a lot — and when you do, it’s nice to understand how fast or slow they are.

In this section, I’ll explain in depth what Big O notation is and give you a list of the most common running times.

Terminology

The letter O is used because it describes the run time as being on the Order of something, which is a mathematician’s way of saying that two things grow similarly.

What is “Notation” mean?

A notation is a collection of related symbols that are each given an arbitrary meaning, created to facilitate structured communication within a domain of knowledge or field of study.

When talking about the big-O run time of a piece of code, the size of the input data is given a letter name, usually N or n. Then the big-O is an expression in N that says roughly how many steps the code will take to execute.

What is NOT Big O?

With Big O, we don’t measure the speed of an algorithm in seconds (or minutes). We measure the rate of growth of an algorithm in the number of operations it takes to complete.

Big O does NOT represent the best scenario. Big O establishes a worst-case time. So you can say that, in the worst case, you’ll have run time in order of “number of operations”.

Time and Space Complexity

Time complexity: This is what the concept of asymptotic runtime, or big O time, means.

Space Complexity: Time is not the only thing that matters in an algorithm, We might also care about the amount of memory — or space — required by an algorithm.

Space complexity is a parallel concept to time complexity. If we need to create an array of size n, this will require O(n) space. If we need a two-dimensional array of size n by n, this will require O(n²) space.

Binary search

Suppose you’re searching for a word in a dictionary. The word starts with Ka. You could start at the beginning and keep flipping pages until you get to the Ka, But you’re more likely to start at a page in the middle because you know the Ka is going to be near the middle of the dictionary.

Now suppose you log on to Email. When you do, the Email provider has to verify that you have an account on the site. So, it needs to search for your email address in its database. Suppose your email address is k-user@mail.com. The email provider could start from the A, and search for your name — but it makes more sense for it to begin somewhere in the middle.

This is a search problem. And all these cases use the same algorithm to solve the problem: binary search.

Binary search is an algorithm, its input is a sorted list. If an element you’re looking for in that list, the binary search returns the position where it’s located. Otherwise, the binary search returns null.

Let’s see an example of how binary search works. I’m thinking of a number between 1 and 100.

array of 100 elements from 1 to 100

You have to guess my number in the fewest tries possible. With every guess, I’ll tell you if you guess. is too low, too high, or correct.

Suppose you start guessing from 1 to 100, one at a time like this: 1,2,3…

This is a simple search. With each guess, you’re eliminating only numbers. If my number was 00, it could take you 99 guess to get there.

A better way to search

Here’s a better technique, Start with 50.

Too low, but just eliminated half the numbers! Now you know that 1–50 is too low. Next guess: 75.

Too high, but again you cut down half the remaining numbers! With binary search, you guess the middle number and eliminate half the remaining numbers every time. Next 63 (halfway between 50 and 75), etc. until getting to the right number.

This is a binary search. Here’s how many numbers you can eliminate every time.

Steps that takes the binary search in an array of 1 to 100. (7)

Whatever number I’m thinking of, you can guess in a maximum of seven guesses — because you eliminate so many numbers with every guess.

In general, for any list of n, binary search will take log2 n steps to run in the worst case, where simple search will take n steps

Logarithms

In mathematics, the logarithm is the inverse function to exponentiation. That means the logarithm of a given number x is the exponent to which another fixed number, the base b, must be raised, to produce that number x. In the simplest case, the logarithm counts the number of occurrences of the same factor in repeated multiplication; e.g. since 1000 = 10 × 10 × 10 = 103, the “logarithm base 10” of 1000 is 3, or log10 (1000) = 3. The logarithm of x to base b is denoted as logb (x), or without parentheses, logb x, or even without the explicit base, log x, when no confusion is possible, or when the base does not matter such as in big O notation.

In the series of articles, when I talk about running time in Big O notation, log always means log in base 2 (log2). When you search for an element using a simple search, in the worst case you might have to look at every single element. So for a list of 16 numbers, you’d have to check 16 numbers at most. For a binary search, you have to check log n elements in the worst case. For a list of 8 elements, log 16 == 24 == 16. So for a list of 16 numbers, you would have to check 5 numbers at most:

16 -> 9, 10, 11, 12, 13, 14, 15, 16 -> 13, 14, 15, 16 -> 15, 16 -> 16

Note: Binary search only works when your list is in sorted order. For example the words in a dictionary are sorted in alphabetical order, so you can use binary search for to look a word.

Algorithm running times grow at different rates

Ada is writing a search algorithm for NASA. Her algorithm will kick when a rocket is about to land on the Moon, and it will help calculate where to land.

This is an example of how the run time of two algorithms can grow at different rates. Ada is trying to decide between simple search and binary search. The algorithm needs to be both fast and correct. On one hand, binary search is faster. And Ada has only 12 seconds to figure out where to land — otherwise, the rocket will be off course. On the other hand, a simple search is easier to write, and there is less chance of bugs being introduced.

As Ada doesn’t want bugs in the code to land the rocket! To be extra careful, Ada decides to time both algorithms with a list of 100 elements.

Times simple search vs binary search

So as the list of numbers gets bigger, binary search suddenly becomes a lot faster than simple search. Ada thought the binary search was 15 times faster than a simple search, but that’s not correct. If the list has 1 billion items, it’s more like 33 million times faster. That’s why it’s not enough to know how long an algorithm takes to run — you need to know how the running time increases as the list size increases. That’s where Big O notation comes in.

Binary search needs log n operations to check a list of size n. What’s the runtime in Big O notation? It’s O(log n)

In general, Big O notation is written as follows:

O refers to Big O and (n) number of operations

Some common Big O run times

Here are five Big O run times that you’ll encounter a lot, sorted from fastest to slowest:

  • O(log n), also known as log time. Example: Binary search.
  • O(n), also known as linear time. Example: Simple search.
  • O(n * log n). Example: A fast sorting algorithm, like quicksort.
  • O(n²). Example: A slow sorting algorithm, like selection sort.
  • O(n!). Example: A really slow algorithm, like the traveling salesperson.

4 Important Rules to Understand Big O

1. Different steps get Added

This means if have two different steps in your algorithms you add up steps. So if you have a first step that takes O(a) and a second step that takes O(b) you would up those run times and you get O(a+b).

Example:

Function that do two step first one in O(a) and second one in O(b)

2. Drop Constants

Suppose that we want to print the min and the max element of an array, there are many way, in this case, we have two:

The first function finds the max element and then finds the min element, and the other one finds the min and max simultaneously. Now, these are fundamentally doing the same thing, they’re doing essentially the exact same operations, just doing them in slightly different orders.

Both of these get described as O(n) where n is the size of the array. Now it’s really tempting for people to sometimes see two different loops and describe it as O(2n) and it's important that you remember you drop constants in Big O.

You do not describe things as O(2n), O(3n) -> O(n) because you are looking for how things scale roughly. Is it a linear relationship, is it quadratic... You always drop constants.

3. Different inputs -> Different Variables

Big O is an equation that expresses how runtime changes, and how its scales.

Let’s an example where we have two arrays and we’re walking through them to figure out the number of elements in common between the two arrays:

Some people will mistakenly call this O(n²) but that’s not correct. When you just take about runtime if you describe things as O(n), O(log n), etc. n must have a meaning it’s not like things are inherently one on the other.

When you describe this as O(n²) doesn’t make sense because n it’s not the size of the array, if there are two different arrays. What you want to describe instead is O(a*b).

4. Drop non-dominate terms

What this simply means is when you have different blocks running at different times the Big O should be the worst scenario.

Function that do 2 step first one in O(n) and second one in O(n^2)

In this case, the quadratic runtime is more dominant than the linear runtime.

The Graph

Big O graph
Source: Wikipedia

Graphs of functions commonly used in the analysis of algorithms show the data (input size n) versus time (the number of steps N) for each function.

EXERCISES

Give the run time for each of these scenarios in terms of Big O.

  1. Suppose you have a sorted list of 136 names, and you’re through it using binary search. What’s the maximum number of steps it would take?
  2. Suppose you double the size of the list. What’s the maximum number of steps now?
  3. You have a name, and you want to find the person’s phone number in the phone book.
  4. You want to read the numbers of every person in the phone book.
  5. You want to read the numbers of just the As.

Recap

  • Drop the non-dominate terms, Big O establishes a worst-case time.
  • Drop constants, you are looking for how things scale roughly.
  • O(log n) is faster than O(n), but it gets a lot faster once the list of items you’re searching through grows
  • Algorithm speed isn’t measured in seconds.
  • Algorithms times are measured in terms of the growth of an algorithm.
  • Algorithms times are written with Big O notation.

Thanks for reading. We will see more about Big O notation in upcoming articles. If you find these articles useful, then please share them with your friends and colleagues.

Go to the Table of content or Data Structure ➡️

References:

  • Wikipedia
  • Grokking Algorithms (book).
  • Cracking the coding interviews (book).

--

--