Threesum Loop Algorithms Order of Growth as Discrete and Continuous Sum Calculus
Analysis of Algorithms
COS 265 - Data Structures & Algorithms
Analysis of Algorithms
introduction
cast of characters
Programmer needs to develop a working solution
Client wants to solve problem efficiently
Theoretician seeks to understand
Student (you) might play any or all of these roles someday
running time
"As soon as an Analytical Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will then arise—By what course of calculation can these results be arrived at by the machine in the shortest time?
"
–Charles Babbage (1864)
[ Difference Engine, Photo by geni, link ]
reasons to analyze algorithms
- Predict performance
- Compare algorithms
- Provide guarantees
- Understand theoretical basis
Primary practical reason: avoid performance bugs
Client gets poor performance because programmer did not understand performance characteristics
an algorithmic success story
N-body simulation
- Simulate gravitational interactions among \(N\) bodies
- Applications: cosmology, fluid dynamics, semiconductors, ...
- Brute force: \(N^2\) steps
- Barnes-Hut algorithm: \(N \log N\) steps, enabling new research
an algorithmic success story
Discrete Fourier transform
- Express signal as weighted sum of sines and cosines
- Applications: DVD, JPEG, MRI, astrophysics, ...
- Brute force: \(N^2\) steps
- FFT algorithm: \(N \log N\) steps, enabling new technology
[ Test Card F, George Hersee, link ]
the challenge
Q: Will my program be able to solve a large practical input?
"Why is my program so slow??
Why does it run out of memory??
"
Insight by Knuth (1970s): Use scientific method to understand performance
scientific method applied to alg. analysis
A framework for predicting performance and comparing algorithms
Scientific method:
- Observe some feature of the natural world
- Hypothesize a model that is consistent with the observations
- Predict events using the hypothesis
- Verify the predictions by making further observations
- Validate by repeating until the hypothesis and observations agree
Principles:
- Experiments must be reproducible
- Hypotheses must be falsifiable
algorithm analysis
observations
example: 3-sum
3-Sum: Given \(N\) distinct integers, how many triples sum to exactly zero?
Context: Deeply related to problems in computational geometry
$ more 8ints.txt 8 30 -40 -20 -10 40 0 10 5 $ java ThreeSum 8ints.txt 4 |
|
[ Angry Birds, by Rovio Entertainment ]
3-sum brute-force algorithm
ThreeSum.java:
source, javadoc, 1Kints.txt, 2Kints.txt, 4Kints.txt, 8Kints.txt
[ Algs4 booksite, link ]
public class ThreeSum { public static int count(int[] a) { int N = a.length; int count = 0; // check each triple // ignore integer overflow for simplicity for(int i = 0; i < N; i++) for(int j = i+1; j < N; j++) for(int k = j+1; k < N; k++) if(a[i] + a[j] + a[k] == 0) count++; return count; } public static void main(String[] args) { In in = new In(args[0]); int[] a = in.readAllInts(); StdOut.println(count(a)); } }
measuring the running time
Q: How to time a program?
measuring the running time
Q: How to time a program?
A1: :( Manually using a stopwatch
measuring the running time
Q: How to time a program?
A1: :( Manually using a stopwatch
A2: :| Using time
Unix command (time java ThreeSum ...
)
measuring the running time
Q: How to time a program?
A1: :( Manually using a stopwatch
A2: :| Using time
Unix command (time java ThreeSum ...
)
A3: :) Automatically using software!
public static void main(String[] args) { In in = new In(args[0]); int[] a = in.readAllInts(); Stopwatch stopwatch = new Stopwatch(); StdOut.println(ThreeSum.count(a)); double time = stopwatch.elapsedTime(); StdOut.println("elapsed time = " + time); }
Emperical analysis
Run the program for various input sizes and measure running time
\(N\) | \(T(N)\) |
---|---|
250 | 0.0 |
500 | 0.0 |
1000 | 0.1 |
2000 | 0.8 |
4000 | 6.4 |
8000 | 51.1 |
16000 | ??? |
\(T(N)\) is time in seconds on some particular machine
data analysis
Standard plot: Plot running time \(T(N)\) vs. input size \(N\)
data analysis
Log-log plot: Plot running time \(T(N)\) vs. input size \(N\) using log-log scale
straight line of approx slope 3 \(\lg(T(N)) = b \lg N + c\) |
Regression: Fit straight line through data points: \(a N^b\) (power law)
Hypothesis: Running time is about \(1.006 \times 10^{-10} \times N^{2.999}\) secs
(\(2.999\) is slope)
prediction and validation
Hypothesis: Running time is about \(1.006 \times 10^{-10} \times N^{2.999}\) secs
("order of growth" of running time is about \(N^3\))
Predictions:
- 51.0 seconds for \(N=8000\)
- 408.1 seconds for \(N=16000\)
Observations validate hypothesis!
\(N\) | \(T(N)\) |
---|---|
8000 | 51.1 |
8000 | 51.0 |
8000 | 51.1 |
16000 | 410.8 |
doubling hypothesis
Doubling hypothesis: Quick way to estimate \(b\) in a power-law relationship
Run program, doubling the size of the input
| \(\frac{T(N)}{T(N/2)} = \frac{aN^b}{a(N/2)^b} = 2^b\) \(\lg (6.4 / 0.8) = 3.0\) seems to converge |
Hypothesis: Running time is about \(aN^b\) with \(b = \lg \text{ratio}\)
Caveat: Cannot identify logarithmic factors with doubling hypothesis
doubling hypothesis
Q: How to estimate \(a\) (assuming we know \(b\))?
A: Run the program (for sufficiently large value of \(N\)) and solve for \(a\)
| \(51.1 = a \times 8000^3\) |
Hypothesis: Running time is about \(0.998 \times 10^{-10} \times N^3\) seconds
Note: this is almost identical hypothesis to one obtained via regression (\(1.006 \times 10^{-10} \times N^{2.999}\))
experimental algorithmics
system independent effects, determines exponent \(b\) in power law
- algorithm
- input data
system dependent effects, determines constant \(a\) in power law
- hardware: CPU, memory, cache, ...
- software: compiler, interpreter, garbage collector, ...
- system: operating system, network, other apps, ...
bad news: sometimes difficult to get precise measurements
good news: much easier and cheaper than other sciences
as an aside
Algorithmic experiments are virtually free by comparison with other sciences
Bottom line: no excuse for not running experiments to understand costs
Analysis of Algorithms
mathematical models
mathematical models for running time
total running time: sum of cost \(\times\) frequency for all operations
- need to analyze program to determine set of operations
- cost depends on machine, compiler
- frequency depends on algorithm, input data
In principle, accurate mathematical models are available.
example 1-sum
Q: How many instructions as a function of input size \(N\)?
int count = 0; for(int i = 0; i < N; i++) if(a[i] == 0) // <- N array accesses count++;
operation | cost (ns) \(\dagger\) | frequency |
---|---|---|
variable declaration | 2/5 | \(2\) |
assignment statement | 1/5 | \(2\) |
less than compare | 1/5 | \(N+1\) |
equal to compare | 1/10 | \(N\) |
array access | 1/10 | \(N\) |
increment | 1/10 | \(N\) to \(2N\) |
\(\dagger\) representative estimates (with some poetic license)
example: 2-Sum
Q: How many instructions as a function of input size \(N\)?
int count = 0; for(int i = 0; i < N; i++) for(int j = i+1; j < N; j++) if(a[i] + a[j] == 0) count++;
Pf. Gauss
\[\begin{array}{cccccccccccc} & T(N) & = & 0 & + & 1 & + & \ldots & + & (N-2) & + & (N-1) \\ + & T(N) & = & (N-1) & + & (N-2) & + & \ldots & + & 1 & + & 0 \\ \hline & 2T(N) & = & (N-1) & + & (N-1) & + & \ldots & + & (N-1) & + & (N-1) \\ \\ & & \Rightarrow & T(N) & = & N (N-1) / 2 \end{array}\]
Note: \(0 + 1 + 2 + \ldots + (N-1) = \frac{1}{2} N (N-1) = \binom{N}{2}\)
example: 2-Sum
Q: How many instructions as a function of input size \(N\)?
int count = 0; for(int i = 0; i < N; i++) for(int j = i+1; j < N; j++) if(a[i] + a[j] == 0) count++;
operation | cost (ns) | frequency |
---|---|---|
variable declaration | 2/5 | \(N+2\) |
assignment statement | 1/5 | \(N+2\) |
less than compare | 1/5 | \(\onehalf (N+1)(N+2)\) |
equal to compare | 1/10 | \(\onehalf N(N-1)\) |
array access | 1/10 | \(N(N-1)\) |
increment | 1/10 | \(\onehalf N(N+1)\) to \(N^2\) |
Timing is tedious to count exactly: \(\frac{1}{4} N^2 + \frac{13}{20} N + \frac{13}{10} \text{ns}\) to \(\frac{3}{10} N^2 + \frac{3}{5} N +\frac{13}{10} \text{ns}\)
simplifying the calculations
"It is convenient to have a measure of the amount of work involved in a computing process, even though it be a very crude one. We may count up the number of times that various elementary operations are applied in the whole process and then given them various weights. We might, for instance, count the number of additions, subtractions, multiplications, divisions, recording of numbers, and extractions of figures from tables. In the case of computing with matrices most of the work consists of multiplications and writing down numbers, and we shall therefore only attempt to count the number of multiplications and recordings.
"
–Alan Turing (1947)
simplification 1: cost model
Cost model: use some basic operation as a proxy for running time
int count = 0; for(int i = 0; i < N; i++) for(int j = i+1; j < N; j++) if(a[i] + a[j] == 0) count++;
operation | cost (ns) | frequency | |
---|---|---|---|
variable declaration | 2/5 | \(N+2\) | |
assignment statement | 1/5 | \(N+2\) | |
less than compare | 1/5 | \(\onehalf (N+1)(N+2)\) | |
equal to compare | 1/10 | \(\onehalf N(N-1)\) | |
array access | 1/10 | \(N(N-1)\) | \(\Leftarrow\) |
increment | 1/10 | \(\onehalf N(N+1)\) to \(N^2\) |
cost model = array accesses, we assume compiler/JVM do not optimize any array accesses away
simplification 2: tilde notation
- estimate running time (or memory) as a function of input size \(N\)
- ignore lower order terms
- when \(N\) is large, terms are negligible
- when \(N\) is small, we don't care
ex | original: \(f(N)\) | tilde: \(g(N)\) |
---|---|---|
1 | \(\onesixth N^3 + 20N + 16\) | ~ \(\onesixth N^3\) |
2 | \(\onesixth N^3 + 100N^2 + 56\) | ~ \(\onesixth N^3\) |
3 | \(\onesixth N^3 - \onehalf N^2 + \onethird N\) | ~ \(\onesixth N^3\) |
simplification 2: tilde notation
| \[N = 1000\] \[f(N) = 166.17 \text{million}\] \[g(N) = 166.67 \text{million}\] |
Technical definition: \(f(N) \text{~} g(N)\) means
\[\lim_{N \rightarrow \infty} \frac{f(N)}{g(N)} = 1\]
simplification 2: tilde notation
- estimate running time (or memory) as a function of input size \(N\)
- ignore lower order terms
- when \(N\) is large, terms are negligible
- when \(N\) is small, we don't care
operation | frequency | tilde notation |
---|---|---|
variable declaration | \(N+2\) | ~ \(N\) |
assignment statement | \(N+2\) | ~ \(N\) |
less than compare | \(\onehalf (N+1)(N+2)\) | ~ \(\onehalf N^2\) |
equal to compare | \(\onehalf N(N-1)\) | ~ \(\onehalf N^2\) |
array access | \(N(N-1)\) | ~ \(N^2\) |
increment | \(\onehalf N(N+1)\) to \(N^2\) | ~ \(\onehalf N^2\) to ~ \(N^2\) |
example: 2-Sum
Q: Approximately how many array accesses as a function of input size \(N\)?
int count = 0; for(int i = 0; i < N; i++) for(int j = i+1; j < N; j++) if(a[i] + a[j] == 0) // inner loop count++;
A: \(\text{~} N^2\) array accesses
Bottom line: use cost model and tilde notation to simplify counts
example: 3-sum
Q: Approximately how many array accesses as a function of input size \(N\)?
int count = 0; for(int i = 0; i < N; i++) for(int j = i+1; j < N; j++) for(int k = j+1; k < N; k++) if(a[i] + a[j] + a[k] == 0) // inner loop count++;
\[\binom{N}{3} = \frac{N(N-1)(N-2)}{3!} \text{~} \frac{1}{6}N^3\]
A: \(\text{ ~ } \frac{1}{2} N^3\) array accesses
Bottom line: use cost model and tilde notation to simplify counts
Estimating a discrete sum
Q: How to estimate a discrete sum?
A1: Take a discrete mathematics course (MAT 215)
Estimating a discrete sum
Q: How to estimate a discrete sum?
A2: Replace the sum with an integral and use calculus
Ex1: \(1 + 2 + \ldots + N\)
\[\sum_{i=1}^N i \text{ ~ } \int_{x=1}^N x\ dx \text{ ~ } \frac{1}{2}N^2\]
Estimating a discrete sum
Q: How to estimate a discrete sum?
A2: Replace the sum with an integral and use calculus
Ex2: \(1 + 1/2 + 1/3 + \ldots + 1/N\)
\[\sum_{i=1}^N \frac{1}{i} \text{ ~ } \int_{x=1}^N \frac{1}{x}\ dx \text{ ~ } \ln N\]
Estimating a discrete sum
Q: How to estimate a discrete sum?
A2: Replace the sum with an integral and use calculus
Ex3: 3-Sum triple loop
\[\sum_{i=1}^N \sum_{j=i}^N \sum_{k=j}^N 1 \text{ ~ } \int_{x=1}^N \int_{y=x}^N \int_{z=y}^N dz\ dy\ dx \text{ ~ } \frac{1}{6} N^3\]
Estimating a discrete sum
Q: How to estimate a discrete sum?
A2: Replace the sum with an integral and use calculus
Ex4: \(1 + 1/2 + 1/4 + 1/8 + \ldots\)
\[\int_{x=0}^\infty \left(\frac{1}{2}\right)^x\ dx = \frac{1}{\ln 2} \approx 1.4427\]
\[\sum_{i=0}^\infty \left(\frac{1}{2}\right)^i = 2\]
note: integral trick does not always work
Estimating a discrete sum
Q: How to estimate a discrete sum?
A3: Use Maple or Wolfram Alpha
mathematical models for running time
In principle, accurate mathematical models are available.
In practice,
- Formulas can be complicated
- Advanced mathematics might be required
- Exact models best left for experts
mathematical models for running time
\[\begin{array}{rcl} T_N & = & c_1 A + c_2 B + c_3 C + c_4 D + c_5 E \\ A & = & \text{array access} \\ B & = & \text{integer add} \\ C & = & \text{integer compare} \\ D & = & \text{increment} \\ E & = & \text{variable assignment} \\ A–E & = & \text{frequencies (depend on algorithm, input)} \\ c_i & = & \text{costs (depend on machine, compiler)} \end{array}\]
Bottom line: we use approximate models in this course
\[T(N) \text{ ~ } c N^3\]
analysis of algorithms
order-of-growth classifications
common order-of-growth classifications
Definition: If \(f(N) \text{ ~ } c g(N)\) for some constant \(c > 0\), then the order of growth of \(f(N)\) is \(g(N)\).
- ignores leading coefficients
- ignores lower-order terms
Ex: The order of growth of the running time of this code is \(N^3\)
int count = 0; for(int i = 0; i < N; i++) for(int j = i+1; j < N; j++) for(int k = j+1; k < N; k++) if(a[i] + a[j] + a[k] == 0) count++;
Typical usage: mathematical analysis of running times, where leading coefficients depend on machine, compiler, JVM, ...
common order-of-growth classifications
Good news: the set of functions
\[1, \log N, N, N \log N, N^2, N^3, \text{and } 2^N\]
suffices to describe the order of growth of most common algorithms.
common order-of-growth classifications
order | name | description | example | \(T(2N)/T(N)\) |
---|---|---|---|---|
\(1\) | constant | statement | add two numbers | \(1\) |
\(\log N\) | logarithmic | divide in half | binary search | \(\sim 1\) |
\(N\) | linear | single loop | find the max | \(2\) |
\(N \log N\) | linearithmic | divide and conquer | mergesort | \(\sim 2\) |
\(N^2\) | quadratic | double loop | check all pairs | \(4\) |
\(N^3\) | cubic | triple loop | check all triples | \(8\) |
\(2^N\) | exponential | exhaustive search | check all subsets | \(T(N)\) |
binary search
Goal: given a sorted array and a key, find index of the key in the array.
Binary search: compare key against middle entry
- too small, go left
- too big, go right
- equal, found it!
- else, cannot find it!
// 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97};
binary search
// find: 33 // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97}; // |_ ^^ _| // lo mid hi // a[mid] > key, so go left
binary search
// find: 33 // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97}; // |_ ^^ _| // lo mid hi // a[mid] < key, so go right
binary search
// find: 33 // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97}; // |_ ^^ _| // lo md hi // a[mid] > key, so go left
binary search
// find: 33 // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97}; // ^^ // lo = mid = hi // a[mid] == key, so found!
binary search
// find: 34 // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97}; // |_ ^^ _| // lo mid hi // a[mid] > key, so go left
binary search
// find: 34 // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97}; // |_ ^^ _| // lo mid hi // a[mid] < key, so go right
binary search
// find: 34 // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97}; // |_ ^^ _| // lo md hi // a[mid] > key, so go left
binary search
// find: 34 // 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 int[] a = {11,13,14,25,33,43,51,53,64,72,84,93,95,96,97}; // ^^ // lo = mid = hi // a[mid] != key, and lo == hi, so not found
binary search: implementation
Trivial to implement?
- First binary search published in 1946
- First bug-free one in 1962
- Bug in Java's
Arrays.binarySearch()
discovered in 2006
binary search: java implementation
Invariant: if key
appears in array a[]
, then a[lo] <= key <= a[hi]
.
public static int binarySearch(int[] a, int key) { int lo = 0, hi = a.length - 1; while(lo <= hi) { int mid = lo + (hi - lo) / 2; // why not mid=(lo+hi)/2? if (key < a[mid]) hi = mid - 1; // | else if(key > a[mid]) lo = mid + 1; // | one "3-way compare" else return mid; // | } return -1; }
binary search: mathematical analysis
Proposition: binary search uses at most \(1 + \lg N\) key compares to search a sorted array of size \(N\).
Def: \(T(N) =\) number key compares to binary search a sorted subarray of size \(\leq N\)
Binary search recurrence: \(T(N) \leq T(N/2) + 1\) for \(N>1\), with \(T(1) = 1\)
binary search: mathematical analysis
Proposition: binary search uses at most \(1 + \lg N\) key compares to search a sorted array of size \(N\).
Pf sketch (assume \(N\) is a power of \(2\))
\[\begin{array}{rcll} T(N) & \leq & T(N/2) + 1 & \text{given} \\ & \leq & T(N/4) + 1 + 1 & \text{apply recurrence to 1st term} \\ & \leq & T(N/8) + 1 + 1 + 1 & \text{apply recurrence to 1st term} \\ & \vdots & \vdots & \vdots \\ & \leq & T(N/N) + \underbrace{1 + \ldots + 1}_{\lg N} & \text{stop applying,} T(1)=1 \\ & = & 1 + \lg N & \end{array}\]
the 3-sum problem
3-Sum: Given \(N\) distinct integer, find three such that \(a + b + c = 0\)
Ver0: \(N^3\) time, \(N\) space
Ver1: \(N^2 \log N\) time, \(N\) space
Ver2: \(N^2\) time, \(N\) space
Note: For full credit, running time should be worst case
Comparing programs
Hypothesis: the sorting-based \(N^2 \log N\) algorithm for 3-Sum is significantly faster in practice than the brute-force \(N^3\) algorithm.
ThreeSum.java |
ThreeSumDeluxe.java |
Guiding principle: Typically, better order of growth \(\Rightarrow\) faster in practice
Analysis of Algorithms
Memory
basics
Bit: 0
or 1
Byte: \(8\) bits
Megabyte (MB): \(1\) million (NIST) or \(2^{20}\) bytes (most CS)
Gigabyte (GB): \(1\) billion (NIST) or \(2^{30}\) bytes (most CS)
64-bit machine: We assume a 64-bit machine with 8-byte pointers
(some JVMs "compress" ordinary object pointers to 4 bytes to avoid this cost)
typical memory usage
typical memory usage for primitive types and one- and two-dimensional arrays
|
|
typical memory for Java objects
Object overhead: \(16\) bytes
Reference: \(8\) bytes
Padding: Each object uses a multiple of \(8\) bytes
Ex1: A Date
object uses \(32\) bytes of memory
public class Date { private int day; private int month; private int year; // ... } | |
typical memory usage summary
Total memory usage for a data type value:
- Primitive type: \(4\) bytes for
int
, \(8\) bytes fordouble
, ... - Object reference: \(8\) bytes
- Array: \(24\) bytes + memory for each array entry
- Object: \(16\) bytes + memory for each instance variable +\(8\) extra bytes per inner class object (for reference to enclosing class)
- Padding: round up to multiple of \(8\) bytes
Note: depending on application, we may want to count memory for any referenced objects (recursively)
Analysis of algorithms quiz
How much memory does a WeightedQuickUnionUF
use as a function of \(N\)?
A: \(\texttilde 4N\) bytes B: \(\texttilde 8N\) bytes C: \(\texttilde 4N^2\) bytes D: \(\texttilde 8N^2\) bytes E: I do not know | public class WeightedQuickUnionUF { private int[] parent; private int[] size; private int count; public WeightedQuickUnionUF(int N) { parent = new int[N]; size = new int[N]; count = 0; // ... } // ... } |
turning the crank: summary
Empirical analysis
- Execute program to perform experiments
- Assume power law
- Formulate a hypothesis for running time
- Model enables us to make predictions
Mathematical analysis
Scientific method
turning the crank: summary
Empirical analysis
Mathematical analysis
- Analyze algorithm to count frequency of operations
- Use tilde notation to simplify analysis
- Model enables us to explain behavior
Scientific method
turning the crank: summary
Empirical analysis
Mathematical analysis
Scientific method
- Mathematical model is independent of a particular system; applies to machine not yet built
- Empirical analysis is necessary to validate mathematical models and to make predictions
Source: https://cse.taylor.edu/~jdenning/classes/cos265/slides/02_AlgorithmAnalysis.html
0 Response to "Threesum Loop Algorithms Order of Growth as Discrete and Continuous Sum Calculus"
Post a Comment