📖 Algorithms and Complexity#



Writing programs that work fast#

Example: evaluation of polynomials

Task: evaluate a polynomial of the following form at a given \(x\)

\[y = a_1 + a_2 x + a_3 x^2 + \dots + a_k x^{k-1}\]
  • Explain the operation of this program.

  • Can you make this algorithm more efficient? (one possibility is sketched below)
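
The program referred to above is not shown on this page; a minimal sketch of the two natural candidates, assuming the coefficients are passed as a list a = [a_1, ..., a_k], might look as follows:

def polyval_naive(a, x):
  '''Evaluate the polynomial term by term, computing each power of x from scratch'''
  return sum(c * x**i for i, c in enumerate(a))  # i-th power computed for each term

def polyval_horner(a, x):
  '''Evaluate the polynomial by Horner's rule: a_1 + x*(a_2 + x*(a_3 + ...))'''
  y = 0
  for c in reversed(a):
    y = y * x + c  # one multiplication and one addition per coefficient
  return y

Horner's rule avoids recomputing the powers of \(x\), needing only one multiplication per coefficient, which is why it typically runs faster; Task 1.1 below compares the run times of the two.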

Practical Task 1.1: Evaluate the run time of the two polynomial algorithms.

Complete the coding assignment in the exercises repo in the Jupyter notebook 1_algorithms/task1.1polynomials_pre.ipynb

An algorithm is a method of solving a class of problems on a computer.

  • sequence of steps/commands for the computer to run

Relevant questions:

  1. How much time does it take to run?

  2. How much memory does it need?

  3. What other resources may be limiting? (storage, communication, etc)

A smart algorithm is a lot more important than a fast computer

Professor Martin Grötschel, Konrad-Zuse-Zentrum für Informationstechnik Berlin, expert in optimization:

“a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later – in 2003 – this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1,000 was due to increased processor speed, whereas a factor of roughly 43,000 was due to improvements in algorithms!”

Algorithms are behind any computation done in economics

  • Macro simulation models (growth, heterogeneous agents, overlapping generations, etc.)

  • Computationally heavy econometrics (Bayesian, MCMC, multi-dimensional fixed effects, etc.)

  • Structural estimation with the need to re-solve the model many thousands of times

  • Counterfactual analysis, sensitivity analysis and uncertainty quantification

Estimation of static and dynamic games is one of the areas of structural econometrics requiring quick computation \(\implies\) smart algorithms

Algorithms with different complexity#

The complexity of an algorithm is the cost, measured in running time or in storage requirements, of using the algorithm to solve one of the problems in the relevant class.

Let’s look at some particular algorithms

Parity of a number#

Check whether an integer is odd or even.

Algorithm:
1. Convert the number to binary
2. Check whether the last digit is 0 (number is even) or 1 (number is odd)


import time

def parity(n, verbose=False):
  '''Returns 1 if passed integer number is odd
  '''
  if not isinstance(n, int): raise TypeError('Only integers in parity()')
  if verbose: print('n = ', format(n, "b"))  # print binary form of the number
  return n & 1  # bitwise and operation returns the value of last bit
# check parity of various numbers
for n in [2,4,7,32,543,671,780]:
  print('n = {0:5d} ({0:08b}), parity={1:d}'.format(n,parity(n)))
n =     2 (00000010), parity=0
n =     4 (00000100), parity=0
n =     7 (00000111), parity=1
n =    32 (00100000), parity=0
n =   543 (1000011111), parity=1
n =   671 (1010011111), parity=1
n =   780 (1100001100), parity=0


import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = [12, 8]

N = 50
kk = lambda i: 10**(i+1)+i  # step formula
n,x,std = [0]*N,[0]*N,[0]*N # initialize data lists
for i in range(N):
  k = kk(i)  # input value for testing
  n[i] = k.bit_length() # size of problem = bits in number
  t = %timeit -n5000 -r100 -o -q parity(k)
  x[i] = t.average
  std[i] = t.stdev

plt.errorbar(n,x,std)
plt.xlabel('number of bits in the input argument', fontsize=14)
plt.ylabel('run time, sec', fontsize=14)
plt.title("Run times for parity check as function of number length in bits",fontsize=16)
plt.show()

Finding max/min of a list#

Find max or min in an unsorted list of values

Algorithm:
cycle through the list once saving the current extremum value


def maximum_from_list(values):
  '''Returns the maximum from a list of values
  '''
  m = float('-inf')  # init with the worst possible value
  for v in values:
    if v > m: m = v
  return m


import numpy as np
N = 50
kk = lambda i: 2*i  # step formula
n,x,std = [0]*N,[0]*N,[0]*N # initialize data lists
for i in range(N):
  n[i] = kk(i) # size of the array
  vv = np.random.uniform(low=0.0, high=100.0, size=n[i]) 
  t = %timeit -n1000 -r100 -o -q maximum_from_list(vv)
  x[i] = t.average
  std[i] = t.stdev

plt.errorbar(n,x,std)
plt.xlabel('number of elements in the list', fontsize=14)
plt.ylabel('run time, sec', fontsize=14)
plt.title("Run times for max finder as function of the array length",fontsize=16)
plt.show()

Binary search in finite set#

Find the position of a given element in a sorted finite set between given boundaries

Example

  1. Think of a number between 1 and 100

  2. How many guesses are needed to locate it if the only answers are “below” and “above”?

  3. What is the optimal sequence of questions?
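
Since each answer rules out half of the remaining range, asking about the midpoint every time locates the number in at most \(\lceil \log_2 100 \rceil = 7\) guesses.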

Explain the operation of the code below


def binary_search(grid=[0,1],val=0):
  '''Returns the index of val on the sorted grid
  Assumes val is present in grid, otherwise the loop below does not terminate
  '''
  i1,i2 = 0,len(grid)-1
  if val==grid[i1]: return i1
  if val==grid[i2]: return i2
  j=(i1+i2)//2
  while grid[j]!=val:
    if val>grid[j]: 
      i1=j
    else: 
      i2=j
    j=(i1+i2)//2  # divide in half
  return j
Inputs: sorted list of numbers, and a value to find
Algorithm:
1. Find the middle point of the current interval
2. If the sought value is below the middle point, reduce to the lower half
3. If the sought value is above the middle point, reduce to the upper half
4. Repeat from step 1 until the value is found at the middle point
import numpy as np
N = 10
# random sorted sequence of integers up to 100
x = np.random.choice(100,size=N,replace=False)
x = np.sort(x)
# random choice of one number/index
k0 = np.random.choice(N,size=1)
k1 = binary_search(grid=x,val=x[k0])
print(f'Index of x{k0}={x[k0]} in {x} is {k1}')
Index of x[7]=[84] in [ 6 15 21 34 35 61 80 84 91 92] is 7


N = 50  # number of points
kk = lambda i: 100+(i+1)*500  # step formula
# precompute the sorted sequence of integers of max length
vv = np.random.choice(10*kk(N),size=kk(N),replace=False)
vv = np.sort(vv)

n,x,std = [0]*N,[0]*N,[0]*N   # initialize lists
for i in range(N):
  n[i] = kk(i)  # number of list elements
  # randomize the choice in each run to smooth out simulation error
  t = %timeit -n10 -r100 -o -q binary_search(grid=vv[:n[i]],val=vv[np.random.choice(n[i],size=1)])
  x[i] = t.average
  std[i] = t.stdev

plt.errorbar(n,x,std)
plt.xlabel('number of elements in the list', fontsize=14)
plt.ylabel('run time, sec', fontsize=14)
plt.title("Run times for binary search as function of the array length",fontsize=16)
plt.show()

plt.errorbar(n,x,std)
plt.xscale('log')
plt.xlabel('log(number of elements in the list)', fontsize=14)
plt.ylabel('run time, sec', fontsize=14)
plt.title("Run times for binary search as function of the LOG of array length",fontsize=16)
plt.show()

Rate of growth and big-O notation#

A very useful way to talk about the rate of growth \(\leftrightarrow\) complexity of algorithms

Definition

\[f(n)=O\big(g(n)\big) \text{ as } n \to \infty \Leftrightarrow\]
\[\exists M>0 \text{ and } N \text{ such that } |f(n)| < M g(n) \text{ for all } n>N\]

In words, \(f(n) = O\big(g(n)\big)\) simply means that as \(n\) increases, \(f(n)\) does not grow at a faster rate than \(g(n)\)

In measuring solution time we may distinguish performance in

  • best (easiest to solve) case

  • average case

  • worst case (\(\leftarrow\) the focus of the theory!)

Constants and lower-order terms are ignored because we are only interested in the order of growth
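
For example, \(3n^2 + 10n + 100 = O(n^2)\): taking \(M=4\) and \(N=20\) in the definition, \(3n^2 + 10n + 100 < 4n^2\) for all \(n > 20\).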

Classes of algorithm complexity#

  • \(O(1)\) constant time

  • \(O(\log_{2}(n))\) logarithmic time

  • \(O(n)\) linear time

  • \(O(n \log_{2}(n))\) quasi-linear time

  • \(O(n^{k}), k>1\) quadratic, cubic, etc. polynomial time \(\uparrow\) Tractable

  • \(O(2^{n})\) exponential time \(\downarrow\) Curse of dimensionality

  • \(O(n!)\) factorial time

(figure: growth of the different complexity classes, _images/bigO.png)

How many operations as a function of input size?#

  • Parity: Just need to check the lowest bit, does not depend on input size \(\Rightarrow O(1)\)

  • Maximum element: Need to loop through the elements once \(\Rightarrow O(n)\)

  • Binary search: Divide the problem in half at each step \(\Rightarrow O(\log(n))\)

Divide-and-conquer algorithms#

(figure: binary division of the problem, _images/binary.png)

A divide-and-conquer structure is what typically marks an excellent algorithm

Example

Examples of DAC algorithms:

  • Binary search

  • Quicksort and merge sort

  • Fast Fourier transform (FFT) algorithm

  • Karatsuba fast multiplication algorithm
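
As an illustration of the divide-and-conquer pattern, here is a minimal merge sort (added for illustration, not part of the original notes): split the list in half, sort each half recursively, and merge the two sorted halves, which yields \(O(n \log_2 n)\) time.

def merge_sort(lst):
  '''Sort a list by recursively splitting it and merging the sorted halves'''
  if len(lst) <= 1:
    return lst  # zero or one element: already sorted
  mid = len(lst) // 2
  left, right = merge_sort(lst[:mid]), merge_sort(lst[mid:])  # divide
  merged, i, j = [], 0, 0
  while i < len(left) and j < len(right):  # merge two sorted lists
    if left[i] <= right[j]:
      merged.append(left[i]); i += 1
    else:
      merged.append(right[j]); j += 1
  return merged + left[i:] + right[j:]  # attach the leftover tail

print(merge_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]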

Curse of dimensionality#

So what does a really bad (intractable) algorithm look like?

Definition

The term curse of dimensionality refers to the exponential complexity of an algorithm noted above: the required computation grows exponentially with the dimension (size) of the problem.

Example

  • Many board games (checkers, chess, shogi, go) in their \(n\)-by-\(n\) generalizations

  • Traveling salesman problem (TSP)

  • Many problems in economics are subject to curse of dimensionality 😢

Allocation of discrete good#

Maximize welfare \(W(x_1,x_2,\dots,x_n)\) subject to \(\sum_{i=1}^{n}x_i = A\), where \(A\) is a discrete good that is only divisible in steps of \(\Lambda\).

Let \(M=A/\Lambda \in \mathbb{N}\). Let \(p_i \in \{0,1,\dots,M\}\) such that \(\sum_{i=1}^{n}p_i = M\).

Then the problem is equivalent to maximizing \(W(\Lambda p_1,\Lambda p_2,\dots,\Lambda p_n)\) subject to the constraint above.

\((p_1,p_2,\dots,p_n)\) is a composition of the number \(M\) into \(n\) non-negative parts.
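
By the standard stars-and-bars argument, the number of such compositions is \(\binom{M+n-1}{n-1}\), which explodes rapidly as \(n\) grows; the timing exercise below makes this growth visible.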


def compositions(N,m):
    '''Iterable on compositions of N with m parts
    Returns the generator (to be used in for loops)
    '''
    cmp=[0,]*m
    cmp[m-1]=N  # initial composition: everything in the last part
    yield list(cmp)  # yield a copy, so the caller can safely store the results
    while cmp[0]!=N:
        i=m-1
        while cmp[i]==0: i-=1  # find the lowest non-zero digit
        cmp[i-1] = cmp[i-1]+1  # increment the next digit up
        cmp[m-1] = cmp[i]-1    # the remainder goes to the lowest digit
        if i!=m-1: cmp[i] = 0  # zero out the moved digit to maintain the sum
        yield list(cmp)
# example of compositions generation
for c in compositions(5,3) : print(c)
[0, 0, 5]
[0, 1, 4]
[0, 2, 3]
[0, 3, 2]
[0, 4, 1]
[0, 5, 0]
[1, 0, 4]
[1, 1, 3]
[1, 2, 2]
[1, 3, 1]
[1, 4, 0]
[2, 0, 3]
[2, 1, 2]
[2, 2, 1]
[2, 3, 0]
[3, 0, 2]
[3, 1, 1]
[3, 2, 0]
[4, 0, 1]
[4, 1, 0]
[5, 0, 0]


N = 10  # number of points
kk = lambda i: 2+i  # step formula
M = 20  # quantity of indivisible good in units of lambda

n,x,std = [0]*N,[0]*N,[0]*N   # initialize lists
for i in range(N):
    n[i] = kk(i)  # number of parts in the composition
    t = %timeit -n2 -r10 -o -q for c in compositions(M,n[i]) : pass
    x[i] = t.average
    std[i] = t.stdev

plt.errorbar(n,x,std)
plt.xlabel('number of parts in the composition',fontsize=14)
plt.ylabel('run time, sec',fontsize=14)
plt.title('Run time as function of the number of parts in the composition',fontsize=16)
plt.show()

plt.errorbar(n,x,std)
plt.yscale('log')
plt.xlabel('number of parts in the composition',fontsize=14)
plt.ylabel('log(run time)',fontsize=14)
plt.title('Curse of Dimensionality in composition generation',fontsize=16)
plt.show()

Hint

What to do with models that are heavy to compute?

  1. Design of better solution algorithms

  2. Analyze special classes of problems + rely on problem structure

  3. Speed up the code (low level language, compilation to machine code)

  4. Parallelize the computations

  5. Bound the problem to maximize model usefulness while keeping it tractable

  6. Wait for innovations in computing technology (quantum computing, etc.)

Recursion#

Definition

A recursive algorithm is an algorithm that calls itself in order to solve a problem

Surprisingly powerful technique in scientific programming!

Example

Fibonacci sequence defined as

\[x_k = x_{k-1} + x_{k-2}, \; k \geq 2, \quad x_0 = 1, \quad x_1 = 1\]

Imagine a program that computes Fibonacci numbers using this definition, and calls itself in the process

def fibonacci(n):
    if n == 0:
        return 1
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

for i in range(10):
    print(fibonacci(i),end=' ')
1 1 2 3 5 8 13 21 34 55 

Is this an efficient algorithm? Why or why not?
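
Hint: the recursion above solves the same subproblems over and over, so its running time grows exponentially in \(n\). A memoized variant (added here purely for illustration, not part of the original notes) caches previously computed values and runs in linear time:

from functools import lru_cache

@lru_cache(maxsize=None)  # cache every previously computed value
def fibonacci_memo(n):
    if n < 2:
        return 1
    return fibonacci_memo(n - 1) + fibonacci_memo(n - 2)

print(fibonacci_memo(50))  # near-instant, while the naive version would take hours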

Towers of Hanoi problem#

Classic puzzle: given a board with three pegs, move a stack of disks of different sizes from the left-most peg to the right-most peg, moving one disk at a time and following the rule that no larger disk can be placed on top of a smaller one.

(figure: the Towers of Hanoi puzzle)

The problem can be solved nicely by breaking it into small parts using the following algorithm:

def move(source,target):
  move one disk from source --> target

def move_via(source,via,target):
  move(source,via)
  move(via,target)

def main_algorithm(n,source,aux,target):
  '''
  Inputs: number of disks n
        source peg
        auxiliary peg
        target peg
  '''
  if n==0:
    do nothing, return
  elif n==1:
    move(source,target)
  else:
    main_algorithm(n-1,source,target,aux)  # move the top n-1 disks out of the way, onto aux
    move(source,target)                    # move the largest disk to the target
    main_algorithm(n-1,aux,source,target)  # move the n-1 disks from aux onto the target
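
The number of moves \(T(n)\) needed for \(n\) disks satisfies \(T(n) = 2T(n-1) + 1\) with \(T(1)=1\), giving \(T(n) = 2^n - 1\): the running time is exponential in the number of disks.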

Practical Task 1.2

Code up the recursive solution using the algorithm above in the exercises repo in the Jupyter notebook 1_algorithms/task1.2_hanoi_pre.ipynb

Solution for 4 disks requires 15 steps:

(figure: step-by-step solution of the Towers of Hanoi with 4 disks)

Bisection method#

The first of two very important classic algorithms for equation solving

Solve equations of the form (focus on scalar case today)

\[f(x) = 0, \quad x \in [a,b] \subset \mathbb{R}, \; f(a)f(b)<0\]

The latter condition requires that the function \(f(x)\) takes different signs at the endpoints \(a\) and \(b\)

Algorithm is similar to binary search, but in continuous space

Input: function f(x)
       brackets [a,b] such that f(a)f(b)<0
       convergence tolerance epsilon
       maximum number of iterations max_iter

Algorithm:
  step 0: ensure all conditions are satisfied
  step 1: compute the sign of the function at (a+b)/2
  step 2: replace a with (a+b)/2 if f(a)f((a+b)/2)>0, otherwise replace b with (a+b)/2
  step 3: repeat steps 1-2 until |a-b|< epsilon, or max_iter number of iterations is reached
  step 4: return (a+b)/2
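
The plotting cell below calls a function bisection() which is not defined on this page (implementing it is Task 1.3). A minimal sketch consistent with the pseudocode above might look as follows; the exact signature, including the callback argument invoked with the current midpoint at every iteration, is an assumption based on how the plotting code uses it:

def bisection(f, a, b, tol=1e-6, maxiter=100, callback=None):
  '''Find a root of f on [a,b] by bisections, assuming f(a)f(b)<0 (sketch)'''
  if f(a)*f(b) > 0:
    raise ValueError('Function must take opposite signs at the bracket endpoints')
  for _ in range(maxiter):
    x = (a+b)/2  # midpoint of the current bracket
    if callback: callback(x)  # report the current midpoint
    if f(a)*f(x) > 0:
      a = x  # the sign change is in the upper half
    else:
      b = x  # the sign change is in the lower half
    if abs(b-a) < tol:
      return (a+b)/2
  raise RuntimeError('Failed to converge in %d iterations' % maxiter)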


f = lambda x: -4*x**3+5*x+1
a,b = -3,-.5  # upper and lower limits
# make nice plot
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = [12, 8]
xd = np.linspace(a,b,1000)  # x grid
plt.plot(xd,f(xd),c='red')  # plot the function
plt.plot([a,b],[0,0],c='black')  # plot zero line
ylim=[f(a),min(f(b),0)]
plt.plot([a,a],ylim,c='grey')  # plot lower bound
plt.plot([b,b],ylim,c='grey')  # plot upper bound
def plot_step(x,**kwargs):
    plot_step.counter += 1
    plt.plot([x,x],ylim,c='grey')
plot_step.counter = 0  # new public attribute
bisection(f,a,b,callback=plot_step)
print('Converged in %d steps'%plot_step.counter)
plt.show()
Converged in 22 steps

Practical Task 1.3: Implementing the bisection method

Complete the coding assignment in the exercises repo in the Jupyter notebook 1_algorithms/task1.3_bisections_pre.ipynb

Newton-Raphson method#

The second of the two classic methods for solving an equation \(f(x)=0\); this one is derivative (gradient) based

General form

\[f(x)=0\]
  • Equation solving

  • Finding a maximum/minimum of \(Q(x)\) based on the first order conditions, in which case \( f(x)=Q'(x) \)

Derivation for Newton method using Taylor series expansion#

\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!} (x-x_0)^k \]

Take the first two terms, assume \(x\) is the solution so that \(f(x)=0\), and let \( x_0=x_i \) and \( x=x_{i+1} \)

\[ 0 = f(x) = f(x_i) + f'(x_i) (x_{i+1}-x_i) \quad \Rightarrow \quad x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \]

The main idea of the Newton-Raphson method is to iterate on this equation starting from some \(x_0\)

\[ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}, \; i=0,1,2,\ldots \]

The method is also applicable to systems of equations, in which case \(x\in\mathbb{R}^n\), \(f: \mathbb{R}^n \to \mathbb{R}^n\), and the derivative in the Newton step is replaced by the Jacobian matrix

Input: function f(x)
       gradient function f'(x)
Algorithm:
1. Start with some good initial value
2. Update x using Newton step above
3. Iterate until convergence
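
The convergence plots below were produced by a demo cell that is not shown on this page; a minimal Newton solver matching the pseudocode, with an assumed signature mirroring the bisection sketch above, could be:

def newton(f, fprime, x0, tol=1e-6, maxiter=50, callback=None):
  '''Find a root of f by Newton-Raphson iterations, given the derivative fprime (sketch)'''
  x = x0
  for _ in range(maxiter):
    x1 = x - f(x)/fprime(x)  # Newton step
    if callback: callback(x1)
    if abs(x1 - x) < tol:
      return x1  # converged: successive iterates closer than tol
    x = x1
  raise RuntimeError('Failed to converge in %d iterations' % maxiter)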
(figures: seven successive iterations of the Newton-Raphson method on an example function, one panel per step)
Converged in 7 steps

Practical Task 1.4: Implementing Newton-Raphson method

Complete the coding assignment in the exercises repo in the Jupyter notebook 1_algorithms/task1.4_newton_pre.ipynb

Practical Task 1.5: Multivariate Newton method [optional]

Complete the coding assignment in the exercises repo in the Jupyter notebook 1_algorithms/task1.5_multivariate_pre.ipynb

Measuring complexity of Newton and bisection methods#

  • What is the size of input \( n \)?

  • Desired precision of the solution!

  • Thus, we pay attention to the error in the solution as the algorithm proceeds

  • Rate of convergence is part of the computational complexity of the algorithms

Computational complexity

  • For the Newton method, calculating a root of a function \(f(x)\) with \(n\)-digit precision, provided that a good initial approximation is known, is \(O\big((\log n)F(n)\big)\), where \(F(n)\) is the cost of calculating \(f(x)/f'(x)\) with \(n\)-digit precision

  • The intuition: with a good starting point, Newton iterations converge quadratically, roughly doubling the number of correct digits at every step, so about \(\log n\) iterations are needed
