This is a loose compilation of some of the materials presented during class on 15 February in Waltham and 17 February in Worcester.
This class was about techniques for solving recurrence relations. If you haven't already done so, please read Chapter 2 of Sedgewick and Flajolet.
Recurrence relations describe how one element of a sequence can be computed from one or more of the previous elements in the sequence. There is no one way to solve recurrence relations, but they do come in families - in that respect they are similar to their analogue in continuous mathematics, the differential equation. It is worthwhile classifying a recurrence relation since that often gives clues on how to solve it. See the discussion on classification in Table 2.1 on p.38 of Sedgewick and Flajolet and in the Recurrence Relation Notes on the Notes page. While you are there, look at the examples of how to form recurrence relations from word problems and algorithms.
The rest of this class comprises examples of how to solve recurrence relations.
Many of the recurrence relations you will run into have already been solved. NJA Sloane has compiled an extensive on-line database of sequences.
Every summation has an implicit recurrence relation which can be found by using the first step of the perturbation method for solving summations.
Thus any linear, first-order recurrence relation with equal coefficients for the two terms can be solved - if the underlying summation can be solved. Table 2.2 on p.42 of Sedgewick and Flajolet summarizes a few solvable summations.
Tables of solved series are contained in I.S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series, and Products, 5th Ed., edited by Alan Jeffrey (1993, Academic Press), and in other published resources. A CD-ROM version of the book is also available.
Thus the solution to the above recurrence relation is
Notice that an initial condition has been added to the solution. The recurrence relation is first-order so only one initial condition is required.
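As a small illustration of the idea, here is a sketch that assumes the recurrence has the form a_N = a_{N-1} + f(N) with a_0 given. The names `f`, `a0`, and both function names are chosen here for illustration only. Stepping through the recurrence and evaluating the unrolled summation give the same value, and when the summation has a known closed form, that closed form is the solution.

```python
def solve_by_iteration(f, a0, n):
    """Apply a_k = a_{k-1} + f(k) step by step, starting from a_0 = a0."""
    a = a0
    for k in range(1, n + 1):
        a = a + f(k)
    return a


def solve_by_summation(f, a0, n):
    """Unrolled form: a_n = a_0 + sum of f(k) for k = 1..n."""
    return a0 + sum(f(k) for k in range(1, n + 1))


# Example forcing term f(k) = k, whose sum has the closed form n(n+1)/2.
n = 10
print(solve_by_iteration(lambda k: k, 0, n))   # 55
print(solve_by_summation(lambda k: k, 0, n))   # 55
print(n * (n + 1) // 2)                        # 55
```

The single initial condition a_0 appears as the additive constant in the unrolled form.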
Hashing algorithms are used to store data where the set of possible keys (the values used for accessing the data) exceeds the size of the data set. For example, WPI student IDs are nine digits - 123-45-6789. CS504 has fewer than 100 registered students. The simplest way to store the numbers is to make an array of size 10^9 - the number of possible student IDs - and simply use the ID to look up the data in the database.
This is somewhat inefficient. Suppose we store the number by the last two digits. Then only 100 bins are available so the data ought to fit:
The problem, of course, is that two or more of the students may have the same last two digits in their IDs. We want to calculate the expected value of C_N, the number of comparisons necessary to store a new record in a data structure with M bins, assuming that N of the bins are already filled. Clearly C_N is at least one - we look in the bin with the number of the last two digits in the ID to see if it is already occupied. If the bin is already occupied, we use a hashing function which provides a repeatable method for selecting alternative bins. There are many possible hashing functions - such as looking in subsequent bins until an empty one is found - but that is not what we are studying in this problem. We merely want to calculate the average number of comparisons necessary to find an empty bin.
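To make the counting of comparisons concrete, here is a minimal sketch of one of the possible strategies mentioned above: looking in subsequent bins until an empty one is found (commonly called linear probing). The function name `insert_with_probing` and the sample IDs are made up for illustration. This is not the scheme analyzed in this problem.

```python
def insert_with_probing(table, student_id):
    """Store student_id in table and return the number of bins examined.

    The starting bin is given by the last digits of the ID (ID mod table
    size); occupied bins are skipped by moving to the next bin, wrapping
    around at the end of the table.
    """
    m = len(table)
    start = student_id % m
    for comparisons in range(1, m + 1):
        index = (start + comparisons - 1) % m
        if table[index] is None:
            table[index] = student_id
            return comparisons
    raise RuntimeError("table is full")


table = [None] * 100                            # 100 bins, all empty
print(insert_with_probing(table, 123456789))    # 1: bin 89 is empty
print(insert_with_probing(table, 987654389))    # 2: bin 89 is taken, bin 90 is used
```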
The probability that k bins have been searched but an empty one still has not been found is:
Thus the probability that an empty bin is found in the k-th comparison is:
The expected value is:
To make it easier to solve the summation, increase the upper limit to infinity:
We can calculate the summation by starting with the geometric series.
Except for the restriction on its absolute value, alpha can be anything. Therefore we can differentiate both sides of the last equation with respect to alpha:
Thus the expected value of the number of comparisons is:
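The chain of steps above can be sketched as follows. This sketch assumes that each comparison independently lands on an occupied bin with probability alpha = N/M, where N bins out of M are filled. The symbol k, which counts comparisons, is introduced here for illustration.

```latex
\begin{align*}
\Pr\{\text{first } k \text{ bins searched are all occupied}\} &= \alpha^{k},
  \qquad \alpha = \frac{N}{M} \\
\Pr\{\text{empty bin found on comparison } k\} &= \alpha^{k-1}(1-\alpha) \\
E[C_N] &\approx \sum_{k=1}^{\infty} k\,\alpha^{k-1}(1-\alpha) \\
\sum_{k=0}^{\infty} \alpha^{k} &= \frac{1}{1-\alpha}, \qquad |\alpha| < 1 \\
\sum_{k=1}^{\infty} k\,\alpha^{k-1} &= \frac{1}{(1-\alpha)^{2}}
  \qquad \text{(differentiating with respect to } \alpha\text{)} \\
E[C_N] &\approx \frac{1-\alpha}{(1-\alpha)^{2}} = \frac{1}{1-\alpha} = \frac{1}{1-N/M}
\end{align*}
```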
Notice that when the data structure is empty (N=0), one comparison is required; when the data structure is half full (N=M/2), an average of two comparisons is required; when 9/10 of the bins are filled (N=0.9M), an average of ten comparisons is required; when the data structure is full (N=M), an infinite number of comparisons is required because none of them could possibly succeed.
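These values are consistent with an expected number of comparisons of 1/(1 - N/M). A quick numerical check is sketched below, assuming that formula and assuming each comparison independently hits an occupied bin with probability N/M. The function names are illustrative.

```python
def expected_comparisons(n_filled, m_bins):
    """Closed form 1 / (1 - N/M); infinite when every bin is filled."""
    if n_filled >= m_bins:
        return float("inf")
    return 1.0 / (1.0 - n_filled / m_bins)


def expected_by_partial_sum(n_filled, m_bins, terms=2000):
    """Truncated version of the sum over k of k * alpha**(k-1) * (1 - alpha)."""
    alpha = n_filled / m_bins
    return sum(k * alpha ** (k - 1) * (1 - alpha) for k in range(1, terms + 1))


for n_filled in (0, 50, 90):
    print(n_filled,
          expected_comparisons(n_filled, 100),
          expected_by_partial_sum(n_filled, 100))
```

The truncated sum and the closed form agree to within rounding for these fill levels, because the neglected tail of the series is negligible after 2000 terms.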
The Notes page contains information on solving these recurrence relations. The technique there works for recurrence relations where the "forcing function" is polynomial in N, exponential in N, or a product of the two.
This technique is the analogue of the integrating factor technique for solving differential equations. The technique is applicable to any linear, first-order recurrence relation:
The two functions B_N and C_N can be almost any functions of N. To remove the B_N so that this becomes a simple recurrence relation, we make a substitution of variables:
With this substitution, the recurrence relation becomes:
Now you see the "almost" mentioned above: if any of the B_N is zero, the solution cannot be found. We can find the solution by performing a sequence of substitutions:
When we take this sequence all the way back to the initial condition, the solution is:
This is exactly the result we obtain from using the summing technique near the top of this page.
If we undo the substitution of variables, the solution is
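A minimal sketch of the technique follows, assuming the recurrence is written as A_N = B_N A_{N-1} + C_N for N >= 1 with A_0 given. The indexing and the function names are assumptions made for illustration. Under that assumption, dividing by the running product P_N = B_1 B_2 ... B_N turns the recurrence into a plain summation, so that A_N = P_N (A_0 + sum of C_k / P_k for k = 1..N). Exact fractions are used so the two computations can be compared without rounding.

```python
from fractions import Fraction


def iterate_recurrence(b, c, a0, n):
    """Evaluate A_n directly from A_k = b(k) * A_{k-1} + c(k)."""
    a = Fraction(a0)
    for k in range(1, n + 1):
        a = b(k) * a + c(k)
    return a


def summing_factor_solution(b, c, a0, n):
    """Evaluate A_n = P_n * (A_0 + sum_{k=1..n} c(k) / P_k), P_k = b(1)...b(k)."""
    product = Fraction(1)
    total = Fraction(a0)
    for k in range(1, n + 1):
        product *= b(k)
        if product == 0:
            # This is the "almost": a zero coefficient breaks the substitution.
            raise ZeroDivisionError("b(k) is zero; the substitution is undefined")
        total += Fraction(c(k)) / product
    return product * total


# Example with b(k) = k and c(k) = 1, starting from A_0 = 1.
print(iterate_recurrence(lambda k: k, lambda k: 1, 1, 3))        # 16
print(summing_factor_solution(lambda k: k, lambda k: 1, 1, 3))   # 16
```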
In Class 2 we showed that the number of steps necessary to move a stack of N rings in the Tower of Hanoi problem is:
We can fit this into the solution above by using these definitions, which include the fact that no steps are required if there are no rings:
The solution is:
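A short check of this example is sketched below, assuming the recurrence from Class 2 is the standard one, T_N = 2 T_{N-1} + 1 with T_0 = 0, whose solution is T_N = 2^N - 1. The symbol T and the peg names A, B, C are assumptions made for illustration. The code generates the actual moves and compares their count with the closed form.

```python
def hanoi_moves(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves for a stack of n rings."""
    if n == 0:
        return []  # no rings, no steps
    return (hanoi_moves(n - 1, source, spare, target)    # move n-1 rings aside
            + [(source, target)]                         # move the largest ring
            + hanoi_moves(n - 1, spare, target, source))  # move n-1 rings back on top


for n in range(6):
    print(n, len(hanoi_moves(n)), 2 ** n - 1)
```

The two recursive calls and the single middle move correspond to the 2 T_{N-1} and the + 1 in the recurrence.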
Example 2.11 on p43 of Sedgewick and Flajolet asks you to find the solution to this recurrence relation:
The summing factor method can be used to solve this recurrence relation (but not easily). Or, we can use the hint suggested in the text. Multiply both sides by (N-1):
Now we can define a new sequence B_N which is related to A_N by this expression:
We substitute this into the original recurrence relation to obtain a new one for the sequence B_N:
We also need to calculate the equivalent initial condition:
The solution can be calculated using the method in the Notes page or the summation method above:
When we undo the substitution, we obtain the final answer:
It is straightforward to see that this is, in fact, the correct solution to the original recurrence relation.
Assume we have an array with N sorted values in it. Assume that N is a power of 2, although that is not strictly necessary. How many comparisons are required to tell whether a number is in the array?
The first step is to look half way down the list. If the number is not found, it is either above the half-way point (less than the number there) or below that point. Now look at the half-way point of the part of the list which could contain the number. If this algorithm is applied recursively, a recurrence relation results for the number of comparisons:
The initial condition results from noting that a list of size one requires exactly one comparison - to see whether the number in the list is the number we are seeking.
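The recurrence described here can be evaluated directly. The sketch below assumes it has the form C_N = C_{N/2} + 1 with C_1 = 1, which is the form implied by halving the list at the cost of one comparison. The function name is illustrative.

```python
def comparisons(n):
    """Evaluate C_N = C_{N/2} + 1 with C_1 = 1, for n a power of 2."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    if n == 1:
        return 1  # one comparison for a list of size one
    return comparisons(n // 2) + 1


for k in range(5):
    print(2 ** k, comparisons(2 ** k))   # 1 1, 2 2, 4 3, 8 4, 16 5
```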
Again, let's try a substitution of variables:
The recurrence relation becomes:
The problem with this sequence is that it only exists for powers of 2. There are gaps in the sequence. That makes it hard to solve using any of the above techniques. If we can match this sequence, term-by-term, with another which has no gaps, then we can use the solution to the second sequence to find the solution to the first:
Create a sequence B_K which is matched term-by-term with the first.
The solution is:
When we undo the substitutions, we obtain the final answer:
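The whole chain of substitutions for this example can be sketched as follows. It assumes the recurrence is C_N = C_{N/2} + 1 with C_1 = 1 and that the substitution of variables is N = 2^K. The symbols follow the text where it names them and are otherwise assumptions.

```latex
\begin{gather*}
C_N = C_{N/2} + 1, \qquad C_1 = 1 \\
N = 2^{K}: \qquad C_{2^{K}} = C_{2^{K-1}} + 1, \qquad C_{2^{0}} = 1 \\
B_K \equiv C_{2^{K}}: \qquad B_K = B_{K-1} + 1, \qquad B_0 = 1 \\
B_K = K + 1 \\
C_N = \log_2 N + 1
\end{gather*}
```

For example, N = 8 gives K = 3 and C_8 = 4: one comparison each on lists of size 8, 4, 2, and 1.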