![]() |
|
|||||
|
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]()
|
Learning ObjectivesThe purpose of this lesson is to give you the skills to be able to evaluate and analyze algorithms. At the end of this lesson, you should be able to:
Implementation and Empirical AnalysisAnalysis plays a very important role in the process of designing and implementing algorithms. We saw in the previous chapter how we can save factors of thousands or millions in running time with the right algorithm design choices. Algorithm analysis is an important tool in the pursuit of the best algorithm. By the end of this chapter, you will be able to appreciate and understand how algorithm analysis can help us design better algorithms and choose the best among the choices available. Thus, analysis provides us an understanding of the performance characteristics of algorithms. One of the simplest steps we can take to compare algorithms is to do empirical analysis: Given two algorithms to solve the problem, run them both to see which one takes longer! Seemingly simple, empirical analysis provides us with the extra information that is easy to overlook when doing the mathematical analysis. It can also be used to validate our mathematical analysis. However, when empirical studies take significant amounts of time, mathematical analysis is called for. The first challenge faced is to develop a correct implementation. For complex algorithms, they may be a difficult obstacle. For such algorithms, mathematical analysis is called for. The second challenge faced in empirical analysis has to do with the testing of the algorithm - the challenge is to determine the nature of the input data and other factors that directly influence the experiments to be performed. Empirical analysis thus typically compares between different algorithm implementations. The input data we use in the analysis might be: actual data, random data or perverse data. Actual data is supposed to measure the running cost of the algorithm with the data that we might expect in real life. Random data provides us with a measure of the average running cost of the algorithm, assuming that all data occurs with the same probability. Perverse data is expected not to occur in real life. It is used to validate the correct operation of the algorithm even with the most unexpected data i.e. to measure the worst case performance of the algorithm. Care should be taken to provide the same type of input to the algorithms to be compared. Algorithms which run on different kinds of input: more efficiently different kinds of inputs may not be comparable at all. Care should also be taken to run the algorithm implementations on the same kind environments and given the same resources, otherwise the results will not be valid. Choosing among algorithms to solve a given problem is a tricky business - algorithm analysis makes the task easier. The most common mistake made in selecting an algorithm is to ignore the performance characteristics. Faster algorithms are more complicated than brute force methods, so using a slower algorithm takes the easy way out. However as we say in the last chapter, sometimes huge savings can be made with just a few lines of code which can turn a N2 algorithm into a N log N algorithm. The other most common mistake people make is to pay too much attention to performance characteristics! Frequently, more complex algorithms are faster for larger input sizes - simpler algorithms may be faster for small input sizes. Sometimes the investment made in implementing a faster algorithm may not be justified by the returns. So it is a design decision whether to develop a newer, faster implementation or go with an older, slower implementation. Algorithm analysis is essential for finding the best algorithm. Growth of FunctionsThis section introduces some simple characterizations of algorithm efficiency. The term 'growth of functions' is used to indicate how the algorithm performs for input data of different sizes - size 1, 2, 3 ... N where N is called the primary parameter that effects the running time. Typically, algorithms have running times proportional to one of the following. The running time of a particular program is likely to be a constant multiplied by one of these terms: 1 - All single-instructions execute in constant time. log N - The program running time gets slightly larger as the size of the input increases. N - The running time of the program is linear i.e. as N doubles, the running time also doubles. N log N - This running time typically arises when algorithms solve a problem by breaking it up into smaller sub-problems, solving them independently and then combining the solutions. This is still better than a running time of N2 N2 - As N doubles, the running time increases four fold. N3 - As N doubles, the running time increases eight fold. 2N - Some algorithms have exponential running time, unfortunately! The are not practical for everyday use. The running time of a particular program is likely to be a constant multiplied by one of these terms plus some smaller terms. Roughly, the value of the coefficient of the leading term (coefficient of the largest N-term) has to do with the number of instructions in the inner loop of the algorithm - to reduce the running time, we must minimize the number of instructions in the inner loop. Some standard notation - └ x ┘ (floor) : largest integer less than or equal to x, e.g. └ 3.141 ┘ = 3 ┌ x ┐(ceiling) : smallest integer greater than or equal to x, e.g. ┌ 2.81 ┐= 3 For even N, └ N/2 ┘ = ┌ N/2 ┐ For odd N, └ N/2 ┘ + 1 = ┌ N/2 ┐ Harmonic Numbers: HN = 1 + 1/2 + 1/3 + ... + 1/N ~ ln N + 0.57721 + 1/(12N)
Fibonacci numbers are the sequence of numbers 0 1 1 2 3 5 8 13 21 34 55 defined by FN = FN-1 + FN-2
Stirling's formula to approximate the factorial function N!: lg N! ~ N lg N - N lg e + lg √ (2*Π*N) Big-O NotationDefinition - A function g(N) is said to be O(f(N)) if there exist constants c0 and N0 such that g(N) < c0f(N) for all N > N0 The function g(N) is said to be "of the order of" f(N) - and this is denoted by the big-O notation as O(f(N)) What does this definition mean intuitively? For large values of N (large input size), the performance of the function (the algorithm g(N)) is the polynomial f(N) multiplied by a constant. The big-O notation refers to the asymptotic performance of algorithms and hence the order of a polynomial is the order of the largest exponent term of the polynomial. e.g. The expression (N + O(1)) (N + O(log(N) + O(1)) is: N2 + O(N) + O(N log N) + O(log N) + O(N) + O(1), which can be reduced to N2 + O(N log N) , by keeping the largest-exponent terms. In other words, the asymptotic behavior of the original expression is the same as the behavior of the largest-exponent terms because the contribution of the smaller-exponent terms are becomes increasingly small at large N. Why is big-O notation useful? Big-O notation allows algorithms to be compared without knowing the exact running time of a single instruction on a particular computer. aFor example, suppose that an algorithm has an inner loop that is executed 2NHN times on the average, an outer section that is executed N times, and an initialization section that is executed once. If each iteration of the inner loop takes a nanoseconds, each iteration of the outer loop takes b nanoseconds and the initialization code takes c nanoseconds, then the time taken by the program on average is 2 a * N * HN + b * N + c In terms of big-O notation, the running time is: 2 a * N * HN + O(N) removing the smallest exponent term and : O( N * HN + N ) removing the constant terms from the expression Since HN = ln N + O(1), the running time can be expressed as 2 a N ln N + O(N) or O( N ln N + N) Let us see what happens to the running time as the input size doubles from N to 2N: (2a (2N) ln (2N) + O(2N)) / (a N ln N + O(N)) is: = (2 ln(2N) + O(1))/ (ln N + O(1)) = 2 + O(1/logN) Thus the running time more than doubles: big-O allows us to make accurate predictions without knowing the value of a or the implementation details. Big-O notation also provides an upper bound for the running time of the algorithm g(N). For large values of N, the value of g(N) always remains bounded with a constant factor times f(N).
Figure 1. When we say that g(N) = O(f(N)), we mean that the value g(N) falls below some curve of the shape of f(N) and to the right of a vertical line n0. When f(N)/g(N) tends to 0 as N becomes very large, we sometimes say f(N) to mean O(g(N)) e.g. N(N-1)/2 approximates N2/2 when N is large. Similarly, we can say that the running time of an algorithm is proportional to f(N) when we can prove that it is equal to cf(N) + g(N) and g(N) is asymptotically smaller than f(N). Basic Recurrences A lot of algorithm analysis is based on recurrences - this is because a lot of algorithms recursively decompose a large problem into a lot of smaller ones to solve the larger problem. Consider the following recurrence: CN = CN-1 + N for N>1 with C1 = 1 To find the solution to this recurrence, we apply the equation to itself i.e. we expand it recursively: CN = CN-1 + N = CN-2 + (N - 1) + N = CN-3 + (N - 2) + (N - 1) + N ...... = C1 + 2 + ........ + (N - 2) + (N - 1) + N = 1 + 2 + ....... + (N - 2) + (N - 1) + N = N*(N-1)/2, where we have used the result that the sum of first N natural numbers is N*(N-1)/2 . Such formulas which describe the running time of algorithms by expressing the running time on an input of size N in terms of sums of the running times on smaller inputs are called recurrence relations. Now consider this recurrence: CN = CN/2 + 1 for N>1 with C1 = 1 Solving this recurrence, we see that each time we expand it recursively, we will have terms like CN , CN/2 , CN/4 , CN/8 and so on. If we express N as 2n for some n, then we can express CN as C2n , CN/2 as C2n-1, CN/4 as C2n-2 , CN/8 as C2n-3 and so on. It turns out that the mathematics will be easy in solving the recurrence when we make this substitution. Note also that the recurrence is only well defined for subscripts which are even integers, so that N/2 is a natural number. Solving, we get C2n = C2n -1 + 1 = C2n -2 + 1 + 1 = C2n -3 + 3 ..... = C20 + n = C1 + n = 1 + n = 1 + lg N For the case where N/2 represents └N/2┘, CN = 1 +└lg N┘which is the number of bits in the binary representation of N. The following recurrence arises for a recursive program that halves the input, but adds an overhead equal to the size of the input: CN = CN/2 + N for N > 1 with C1 = 0 = CN/4 + N/2 + N = CN/8 + N/4 + N/2 + N ... = N + N/2 + N/4 + N/8 + .... The relation is precisely defined only when N is a power of 2. If this were an an infinite expansion above, this would be 2N exactly. For 'real' relations, we would use integer division and stop at 1, in which case 2N is an approximate answer. The following recurrence arises for a program that makes a linear pass through the input by splitting it into two halves, and doing an amount of work equal to the size of the input to combine the two halves to solve the original problem : CN = 2 CN/2 + N for N > 1 with C1 = 0 Substituting N = 2n C2n / 2n = C2n-1 / 2n-1 + 1 = C2n-2 / 2n-2 + 1 + 1 .... = C20/20 + n = n (since C1 = 0) Hence, CN is about N lgN. The following recurrence arises for a program that makes a linear pass through the input by splitting it into two halves, and does a constant amount to solve the original problem: CN = 2 CN/2 + N for N > 1 with C1 = 0 Using the same substitution as previously, we find that CN is about 2N. The recurrences given in this section have given you the idea to be bale to solve variants of these recurrences with minor variations in conditions. There is a variety of advanced techniques to solve more complex recurrences. Examples of Algorithm Analysis Now that we have some idea of how to solve recurrences we will look at actual algorithms and try to analyze their behavior using recurrences. We will consider two algorithms called sequential search and binary search. Sequential Search Description: Given an array of size M, to sequentially check if each element of the array matches with one of N 'wanted' elements. The following procedure implements sequential search. All objects are stored in an array a[]. For each transaction, we look through the array from beginning to end, checking to see if that array element is the one we seek - given by v. The range of the array is 1.. r The C version:
int search(int a[], int v, int l, int r)
{
int i;
for (i = l; i <= r; i++)
if (v == a[i]) return i;
return (-1);
}
The C++ version:
int search(int a[], int v, int l, int r)
{
for (int i = l; i <= r; i++)
if (v == a[i]) return i;
return (-1);
}
In fact, both versions are identical.
Since each search looks through not more than the length of the array, we can guarantee that no more that N elements will be examined, where N is the size of the array. Note that N will be the actual parameter for the formal parameter r in the function. To make a prediction about how many searches will be actually performed for each function call, we can understand that it will depend on the data in the array and the desired search element - hence we need to make an assumption about the data. If we choose to make the assumption that numbers are randomly chosen and that each number is likely to be the object of a search. Making guarantees and predictions involves analysis of different types. Property 2.1 Sequential search examines N numbers for each unsuccessful search and about N/2 numbers for each successful search on the average. The property for an unsuccessful search is straightforward. For a successful search however, since numbers are randomly chosen and distributed - the object of a search may be anywhere in the array from position 1 to position N. On an average, we will search for (1 + 2 + 3 + ..... + N) / N = (N + 1)/2 positions The average cost of a search is thus N/2 approximately. We can speed up sequential search by putting the array a[ ] in order. Sorting the array can take N log N time for a number of algorithms to be considered later. In an ordered table, we can terminate the search immediately on reaching a number that is larger than the one we seek. This changes the cost of sequential search. Property 2.2Sequential search in an ordered array examines N numbers for each search in the worst case and about N/2 numbers for each search on the average (whether successful or unsuccessful) Assuming that the search is equally likely to terminate at one of the N+1 intervals defined by the N numbers in the array, the cost of an unsuccessful search will be on the average (1. + 2 + 3 + ..... + N + N)/N = (N + 3)/2, in which the desired element is less than each of N elements in the array, and plus one more case where the desired element is greater than the last element in the array. Binary Search Binary search is based on the idea that if the numbers in the array are stored in order, we can eliminate half of them from consideration by comparing the one we seek with the element at the middle position in the array. If it is equal to the one we seek, we have a successful search. If it is less than the one we seek, we have to look at the half of the array which contains larger values. If it is greater, we have to look at the half of the array which contains smaller values. In both cases, we apply the binary search again to an array of half the original size. The code fragment for a binary search is given below. Where v is the element to be compared, the range of the array is l ... r, and m is the center element in the array. The C version:
int search(int a[], int v, int l, int r)
{
while (r >= l)
{
int m = (l+r)/2;
if (v == a[m]) return m;
if (v < a[m]) r = m-1; else l = m+1;
}
return (-1);
}
The C++ version is identical:
int search(int a[], int v, int l, int r)
{
while (r >= l)
{
int m = (l+r)/2;
if (v == a[m]) return m;
if (v < a[m]) r = m-1; else l = m+1;
}
return (-1);
}
Property 2.3 Binary search never examines more than └lg N┘ + 1 numbers. If we let TN to be the number of comparisons required for binary search in the worst case in an array of size N, then TN/2 represents the time to search in an array of half the size. Since the cost of binary search in an array of size N is at most the cost to search in an array of size N/2 plus a constant overhead, TN =< TN/2 + 1 for N > 1 with T1 = 1 This is an upper bound on TN because a successful search will allow us to stop the comparison further and we will not need to consider the smaller array. This recurrence was solved in the section on recurrences, and it was found that TN is approximately └lgN┘ + 1 We can see that binary search is better than sequential search, especially for large values of N. An interesting exercise would be to do an empirical comparison of the two searches. Summary: Guarantees, Predictions and LimitationsIn this chapter, we studied a part of what is called complexity theory which deals with the analysis of algorithms. We used the big-O notation to provide us an upper bound on the performance of algorithms. We were able to divide algorithm analysis into worst-case and average-case behavior. The goal of these two types of algorithm analysis is to provide:
Algorithm analysis is an important component in the overall design and development of algorithms | |||||
|
| |
| Added to the Web: September 30, 2007. Web page design by Dan Solarek. |
http://cset.sp.utoledo.edu/cset3150/ |