Introduction
You have probably heard of
Fibonacci numbers.
They are the numbers in the sequence $1,1,2,3,5,8,13,\dots$ where each number is the sum of the previous two. So the next number after $13$ is $8+13=21$, then $13+21=34$, and so on.
A combinatorial way of looking at them is the following:
Suppose you wanted to go up a staircase with $n$ steps, going up 1 or 2 steps at a time. How many ways can you do this?
For instance, if there were 3 steps, you could go up as follows:
$1,1,1$
$1,2$
$2,1$
where $1,2$ means go up one step, then go up two steps.
That the total number of ways is a Fibonacci number is easy to see: if the number of ways for $n$ steps is $F_n$, then the first move is either one step or two steps, and thus $F_n = F_{n-1} + F_{n-2}$.
We get $F_0 = 1, F_1 = 1, F_2 = 2, F_3 = 3, F_4 = 5, \dots$.
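The argument above translates directly into (slow) code. A minimal illustrative sketch in Python (the function name is my own), enumerating climbs by their first move, exactly as in the recurrence:

```python
def stair_ways(n):
    # Count the ways to climb n steps going up 1 or 2 at a time,
    # by recursing on the first move. No memoization here: this takes
    # exponentially many calls and is only meant to illustrate the counting.
    if n == 0:
        return 1  # exactly one way: the empty climb
    total = stair_ways(n - 1)       # first move: one step
    if n >= 2:
        total += stair_ways(n - 2)  # first move: two steps
    return total

print([stair_ways(n) for n in range(7)])  # → [1, 1, 2, 3, 5, 8, 13]
```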
This combinatorial approach is a useful way [among many other approaches, like the matrix-based approach] of looking at Fibonacci numbers. For instance, try proving the following identity combinatorially:
$$ \binom{n}{0}F_0 + \binom{n}{1}F_1 + \dots + \binom{n}{n}F_n = F_{2n} $$
($\binom{n}{k}$ is the binomial coefficient: $n$ choose $k$)
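(If you want a numeric sanity check before attempting the proof, the identity can be verified for small $n$. An illustrative Python sketch, using the indexing $F_0 = F_1 = 1$ from above:)

```python
from math import comb

def fib(n):
    # Iterative helper with the indexing of this article: F_0 = F_1 = 1.
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Check the binomial identity sum_k C(n, k) F_k = F_{2n} for small n.
for n in range(10):
    assert sum(comb(n, k) * fib(k) for k in range(n + 1)) == fib(2 * n)
```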
We will use this combinatorial approach to demonstrate an $O(\log n)$ arithmetic operations algorithm to compute the $n^{th}$ Fibonacci number. [The matrix approach also gives an $O(\log n)$ algorithm]
Note that we are talking about $O(\log n)$ arithmetic operations, rather than $O(\log n)$ time. This is because the Fibonacci numbers themselves become substantial in size: since $F_n \sim \varphi^n$, $\varphi$ being the golden ratio, $F_n$ is $\Theta(n)$ bits long. Adding two numbers of the size of $F_n$ takes $\Theta(n)$ time, and multiplying them is typically slower than that, with the run-time depending on the multiplication algorithm used.
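To make the distinction concrete, here is the straightforward linear-time iterative computation (an illustrative Python sketch, with $F_0 = F_1 = 1$ as above), along with a check that $F_n$ really occupies a number of bits linear in $n$:

```python
def fib(n):
    # Iterative computation: n additions, so O(n) arithmetic operations.
    # Indexing as in this article: F_0 = F_1 = 1.
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# F_n ~ phi^n, so its bit length grows linearly:
# roughly n * log2(phi) ≈ 0.694 n bits.
print(fib(1000).bit_length())
```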
[Please stop reading if you want to try coming up with an algorithm yourself]
An $O(\log n)$ algorithm
So, we want to compute $F_n$, the number of ways to climb a stair of $n$ steps, going up one or two at a time. Assume for the moment, that you don't know that $F_n$ is a Fibonacci number.
To get an $O(\log n)$ algorithm, an effective technique is to reduce the problem size by half (as in binary search).
So, if we could compute $F_n$ in terms of $F_{n/2}$ (or similar), we would have an $O(\log n)$ algorithm.
So let us try to compute $F_{2n}$ in terms of $F_n$.
Of all the ways to go up $2n$ steps, every one must pass through step $n-1$ or step $n$ at some intermediate stage (since we only go up one or two steps at a time).
So we count the total by splitting into two cases: paths that pass through step $n-1$ but not step $n$ (these must jump from step $n-1$ directly to step $n+1$), and paths that pass through step $n$.
In the first case there are $F_{n-1}$ ways to reach step $n-1$ and $F_{n-1}$ ways to climb the remaining $n-1$ steps after the jump; in the second, $F_n$ ways to reach step $n$ and $F_n$ ways to climb the remaining $n$ steps. We get the following identity:
$$F_{2n} = F_{n-1}^2 + F_{n}^2 $$
And similarly:
$$ F_{2n+1} = F_{n}F_{n-1} + F_{n+1}F_{n}$$
Thus we can compute $F_{2n}$ if we know $F_{n-1}$ and $F_{n}$, and we can compute $F_{2n+1}$ if we know $F_{n-1}$, $F_{n}$, and $F_{n+1}$.
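As a sanity check, both identities can be verified numerically. The sketch below uses an iterative helper with this article's indexing ($F_0 = F_1 = 1$); names are illustrative:

```python
def fib(n):
    # Iterative helper; F_0 = F_1 = 1 as in this article.
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for n in range(1, 12):
    # F_{2n} = F_{n-1}^2 + F_n^2
    assert fib(2 * n) == fib(n - 1) ** 2 + fib(n) ** 2
    # F_{2n+1} = F_n F_{n-1} + F_{n+1} F_n
    assert fib(2 * n + 1) == fib(n) * fib(n - 1) + fib(n + 1) * fib(n)
```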
A naive recursive implementation of these identities would give an algorithm which requires $\Omega(n)$ arithmetic operations.
A common way to speed up such recursive algorithms is to use
memoization: cache the previously computed values and use those instead of recomputing them.
As is frequently the case, using memoization gives us an exponential speedup, leading to an $O(\log n)$ arithmetic operations algorithm!
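Concretely, a memoized recursive sketch of this algorithm might look as follows (illustrative Python, again with $F_0 = F_1 = 1$; here `functools.lru_cache` plays the role of the memoization table):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # F_n with F_0 = F_1 = 1, computed via the halving identities.
    if n < 2:
        return 1
    m = n // 2
    if n % 2 == 0:
        # F_{2m} = F_{m-1}^2 + F_m^2
        return fib(m - 1) ** 2 + fib(m) ** 2
    # F_{2m+1} = F_m F_{m-1} + F_{m+1} F_m = F_m (F_{m-1} + F_{m+1})
    return fib(m) * (fib(m - 1) + fib(m + 1))

print(fib(10))  # → 89
```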
To prove that, consider which values we need in order to compute the four Fibonacci numbers $F_{2n+1}, F_{2n+2}, F_{2n+3}, F_{2n+4}$:
$$F_{2n+1}, F_{2n+2}, F_{2n+3}, F_{2n+4} \to F_{n-1}, F_{n}, F_{n+1}, F_{n+2}\quad (1)$$
And to compute $F_{2n}, F_{2n+1}, F_{2n+2}, F_{2n+3}$:
$$F_{2n}, F_{2n+1}, F_{2n+2}, F_{2n+3} \to F_{n-1}, F_{n}, F_{n+1}, F_{n+2}\quad(2)$$
Thus, starting with $F_{2n}$ (or $F_{2n+1}$), we can see that ultimately we only need to compute a total of $O(\log n)$ such $F_i$.
For example, starting with $F_{8n+4}$:
$$F_{8n+4} \to F_{4n+1}, F_{4n+2} \to F_{2n-1}, F_{2n}, F_{2n+1} \to F_{n-2}, F_{n-1}, F_{n}, F_{n+1}$$
To compute $F_{8n+4}$, we need to compute two Fibonacci numbers of half the size ($F_{4n+1}, F_{4n+2}$). To compute those two, we need three more of half their size ($F_{2n-1}, F_{2n}, F_{2n+1}$), and then four more. Based on $(1)$ and $(2)$ above, the number of new values needed at each level stays at four and never increases. Each step halves the index, so the total number of Fibonacci numbers computed is $O(\log n)$.
Computing each such $F_i$ takes $O(1)$ arithmetic operations if we use a memoization table, so the whole algorithm uses $O(\log n)$ arithmetic operations.
We can get rid of the memoization table by computing a tuple of four elements instead, $T_n = (F_{n}, F_{n+1}, F_{n+2}, F_{n+3})$, and noticing that $T_n$ can be computed directly from $T_{\lfloor n/2 \rfloor - 1}$.
[If you look at the matrix approach, you can view it as computing a tuple of three elements]
By massaging the identities a bit more, and using $F_{n+2} = F_{n+1} + F_{n}$, we can in fact reduce the size of the tuple to two.
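One way to realize the two-element version (a sketch, with function names of my choosing; indexing as above, so $F_{-1} = 0$ from the recurrence) is to carry the pair $(F_{n-1}, F_n)$ through the halving step:

```python
def fib_pair(n):
    # Return (F_{n-1}, F_n), where F_0 = F_1 = 1 and F_{-1} = 0.
    if n == 0:
        return (0, 1)
    a, b = fib_pair(n // 2)          # a = F_{m-1}, b = F_m, with m = n // 2
    c = a * (2 * b - a)              # F_{2m-1} = F_{m-1} (2 F_m - F_{m-1})
    d = a * a + b * b                # F_{2m}   = F_{m-1}^2 + F_m^2
    if n % 2 == 0:
        return (c, d)
    return (d, b * (2 * a + b))      # F_{2m+1} = F_m (2 F_{m-1} + F_m)

def fib(n):
    return fib_pair(n)[1]

print(fib(10))  # → 89
```

Here $F_{2m-1}$ and $F_{2m+1}$ were rewritten using $F_{m+1} = F_m + F_{m-1}$ and $F_{m-2} = F_m - F_{m-1}$, so each level needs only two multiplications-by-pairs and a handful of additions.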
Further optimizations can be made to reduce the number of multiplications, as they are slower than additions for large integers. See the
GNU GMP Fibonacci implementation, which does exactly that, using identities similar to the ones we derived above.
Conclusion
We looked at a combinatorial interpretation of Fibonacci numbers, and used it to derive an $O(\log n)$ arithmetic operations algorithm.
This algorithm falls out naturally from two basic algorithmic techniques: divide and conquer (halving the input size) and caching (memoization of the recursive version), and it does not require any 'major leaps' (like the matrix approach).