WPI Worcester Polytechnic Institute

Computer Science Department
------------------------------------------

CS2223 Algorithms 
Homework 5 - B Term 2005

By Piotr Mardziel 

------------------------------------------

  1. Read Chapter 5 of the textbook in detail.

  2. Problems: You need to turn in a hard copy of your written solutions to the following exercises at the beginning of class on the day the homework is due. Remember to show your work and explain your answers. Correct answers without complete explanations won't receive full credit. Incorrect answers with explanations may receive partial credit.

    1. (25 points) Chapter 5: exercise 6.

      Remember to:

      • (8 points) Provide a detailed divide-and-conquer algorithm to solve this problem (that is, to find a local minimum in a given such complete binary tree).
        Consider the following algorithm:

        Find-Local-Min(T) // T is the root of the Tree //
        • If T has children
          • Let L and R be T's children
              // 3 probes //  
          • Probe the values of xL, xR, and xT
          • If xL < xT
            • Return Find-Local-Min(L) // find local minimum on the left subtree //
          • Else If xR < xT
            • Return Find-Local-Min(R) // find local minimum on the right subtree //
          • Else
            • Return T
        • Else // T is a leaf (no children) //
          • Return T
      • (8 points) Prove that your algorithm is correct (that is, it always returns a local minimum in the input complete binary tree).
        First let's show the following:

          Claim: At any point in the execution of the algorithm, the parent (if any) of T has a greater value than T itself.
          Proof: Consider first the case in which T is the root of the entire tree. In this case T has no parent and the claim holds vacuously. Otherwise, consider the execution of the method on the parent of T. The only way for the algorithm to recurse down to T is for one of the two comparison conditions to hold at the parent. If T is the left child of its parent, then the first condition must have held, which states precisely that the value of T is less than that of its parent. The case where T is the right child is symmetric.

        Given this fact consider now the conditions in which T is returned by the algorithm. The first is when both its children have a greater value than itself. T is certainly a local minimum here as the parent was shown to have a greater value as well. The other way to return T is when it is a leaf (no children). In this case the only node T could be connected to is its parent and again we know that the parent's value is greater than that of T and hence T is a local minimum.
      • (9 points) Prove rigorously that your algorithm uses only O(log n) probes to the nodes in the input complete binary tree.
        Remember that the number of nodes in a complete binary tree of depth d is n = 2^d - 1, and thus its depth is d = lg(n+1), obtained by solving for d (note lg is the same as log base 2).

        Now notice that every time a recursive call is made in the algorithm, it is on a node one level lower than the current node. Thus the algorithm traverses a single path down the tree, which takes at most d steps. At each step at most three probes are performed, hence the total number of probes is at most 3d = 3 lg(n+1) = O(lg n).
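The algorithm and its probe count can be illustrated with a short Python sketch; the Node class and its field names are assumptions, since the problem only specifies an abstract complete binary tree accessed through probes:

```python
class Node:
    """Node of a complete binary tree (class and field names are assumed)."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def find_local_min(t, probes=None):
    """Return a node whose value is smaller than that of all its neighbours.

    `probes` optionally collects every probed value, so the probe
    bound can be checked.
    """
    if t.left is None:
        return t  # leaf: its only neighbour, the parent, is already known to be larger
    # 3 probes: the node and its two children
    if probes is not None:
        probes.extend([t.value, t.left.value, t.right.value])
    if t.left.value < t.value:
        return find_local_min(t.left, probes)   # a local minimum exists in the left subtree
    if t.right.value < t.value:
        return find_local_min(t.right, probes)  # a local minimum exists in the right subtree
    return t  # both children are larger, and so is the parent: t is a local minimum
```

On a complete tree of depth d the recursion makes at most d calls of 3 probes each, matching the 3 lg(n+1) bound derived above.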

    2. (20 points) Using the recursion-tree method (or as the textbook calls it, "unrolling" the recurrence), prove that the recurrence
      T(n) = 2T(n/2 + 17) + n
      is O(n log n).

      Note: In case you find the notation above ambiguous, T(n/2 + 17) = T((n/2) + 17).

      First let's impose the following assumption on the base case of the recursion:

      • $$ T(n) \leq c $$ if $$ n \leq 35 $$
      • $$ T(n) = 2 T(\frac{n}{2}+17) + n $$ otherwise

      Now we can use the "unrolling" method to find the values for each node in the recurrence tree.
        \begin{eqnarray*}
l_0 & = & n \\
    & = & \frac{1}{2^0} n \\
l_1 & = & \frac{n}{2} + 17 \\
    & = & \frac{1}{2^1} n + 17\left(\frac{1}{2^0}\right) \\
l_2 & = & \frac{\frac{n}{2} + 17}{2} + 17 \\
    & = & \frac{1}{2^2} n + 17\left(\frac{1}{2^0} + \frac{1}{2^1}\right) \\
l_3 & = & \frac{\frac{\frac{n}{2} + 17}{2} + 17}{2} + 17 \\
    & = & \frac{1}{2^3} n + 17\left(\frac{1}{2^0} + \frac{1}{2^1} + \frac{1}{2^2}\right) \\
    & ... & \\
l_i & = & \frac{1}{2^i} n + 17 \sum_{j=0}^{i-1} \frac{1}{2^j} \\
    & \leq & \frac{1}{2^i} n + 2 * 17
\end{eqnarray*}
      Since we are looking for an upper bound, the simplification $ \sum_{j=0}^{i-1}\frac{1}{2^j} \leq 2 $ is valid (note that this geometric series approaches 2 as $$ i $$ tends to $$ + \infty $$, and every partial sum stays below 2).

      The number of "nodes" of this recurrence tree doubles for each increase in depth. Thus the number of nodes at level $ i $ is $ 2^{i} $. We can now write a general formula for the total of each level:

        \begin{eqnarray*}
L_i & = & l_i 2^i \\
    & \leq & \left(\frac{1}{2^i} n + 34\right) 2^i \\
    & = & n + 34 * 2^i
\end{eqnarray*}

      Having determined the contribution of each level in the recurrence tree, we next need to find the depth of the tree before the base case of the recurrence takes over. To do this we need to find how big the input to $$ T $$ is at each level. Fortunately, since $$ T(n) = n + \ldots $$, the input size at each level exactly corresponds to the per-node contribution $ l_i $ computed above. Hence we need to find $$ i $$ such that $$ l_i \leq 35 $$.

        \begin{eqnarray*}
l_i                  & \leq & 35 \\
\Rightarrow \frac{1}{2^i} n + 34 & \leq & 35 \\
\Rightarrow \frac{1}{2^i} n      & \leq & 1 \\
\Rightarrow n                    & \leq & 2^i \\
\Rightarrow \lg{n}              & \leq & i
\end{eqnarray*}

      Thus we have for the final level $ i \geq \lg{n} $, the contribution of each node is at most $ c $ as defined by the base case of the recurrence. We can now define the total contribution of the final level as $ L_{\lg{n}} = 2^{\lg{n}} c = cn $. Thus we have the total value of the entire tree:

        \begin{eqnarray*}
 &   & L_{\lg{n}} + \sum_{i=0}^{\lg{n}-1}L_i \\
 & = & cn + \sum_{i=0}^{\lg{n}-1}\left(n+34*2^i\right) \\
 & = & cn + n(\lg{n}-1) + 34 \sum_{i=0}^{\lg{n}-1}2^i \\
 & = & cn + n\lg{n} - n + 34 \frac{1-2^{\lg{n}-1+1}}{1-2} \\
 & = & n\lg{n} + nc - n + 34 (n-1) \\
 & = & n\lg{n} + n(c + 33) - 34 \\
 & = & O(n\lg{n})
\end{eqnarray*}
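As a sanity check on the derivation, the recurrence can be evaluated directly and compared against n lg n; this Python sketch assumes a base-case constant c = 1 and integer rounding of n/2 + 17, neither of which affects the asymptotics:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """T(n) = 2 T(n/2 + 17) + n, with T(n) <= c for n <= 35 (c = 1 assumed)."""
    if n <= 35:
        return 1
    return 2 * T(n // 2 + 17) + n  # integer division stands in for n/2

# if T(n) = O(n lg n), the ratio T(n) / (n lg n) stays bounded as n grows
ratios = [T(n) / (n * math.log2(n)) for n in (2**10, 2**14, 2**18)]
```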


      OPTIONAL PART (20 extra credit points): In addition to using the recursion-tree method above, use the (partial) substitution method to prove that T(n) is O(n log n). Remember that the substitution method involves 2 steps:
      1. (5 points) Guess a good upper bound. Explain why your guess makes sense.
      2. (15 points) Use mathematical induction to refine your guess (i.e., select the appropriate constants and/or add terms to your guess) and to show that the solution works.

    3. (30 points) Consider the recurrence: T(n) = T(n/3) + T(2n/3) + cn.

      • (15 points) Using the recursion-tree method (or as the textbook calls it, "unrolling" the recurrence), find a tight upper bound for T(n). In other words, use the recursion-tree method to find a function f(n) such that T(n) is O(f(n)). The tighter the upper bound you provide, the better.
        Notice that the recursion splits up the initial $ n $ into two pieces which add up to $ n $ itself. Because of this, the total contribution of each level in the "unrolled" recurrence will be $ cn $. This is demonstrated by examining the first few levels of the recurrence tree:

          \begin{eqnarray*}
L_0 & = & cn \\
L_1 & = & c\left(\frac{1}{3}n + \frac{2}{3}n\right) \\
    & = & cn \\
L_2 & = & c\left(\frac{1}{3}\left(\frac{1}{3}n + \frac{2}{3}n\right) + \frac{2}{3}\left(\frac{1}{3}n + \frac{2}{3}n\right)\right) \\
    & = & cn\frac{1}{3} + cn\frac{2}{3} \\
    & = & cn \\
    & ... & \\
L_i & = & cn
\end{eqnarray*}

        The depth of the recurrence tree, however, is slightly more subtle. To be precise, let's first impose the following:

          $$ T(n) \leq c $$, if $$ n \leq 1 $$
          $$ T(n) = T\left(\frac{1}{3} n\right) + T\left(\frac{2}{3} n \right) + cn $$, otherwise

        Thus we have to find the number of levels of the recurrence tree before the base case is reached. The input to $ T $, however, changes at different rates depending on which of the recursive uses of $ T $ one considers. The one on the left in the definition of $ T $ will decrease much faster than the one on the right. Fortunately we are dealing here with upper bounds and thus let's consider the maximum depth of the recurrence tree and use this value with the assumption that over all the levels in between, the contribution of the nodes is $ cn $. This is certainly going to be an overestimate as really some parts of the recurrence tree stop short of this maximum level.

        We write $s_i$ as the size of the input to $ T $ at level $i$ for the most slowly decreasing branches of the tree and we can see the following:
          \begin{eqnarray*}
s_0 &  =  & n \\
s_1 &  =  & \frac{2}{3} n \\
s_2 &  =  & \left(\frac{2}{3}\right)^2 n \\
    & ... & \\
s_i &  =  & \left(\frac{2}{3}\right)^i n
\end{eqnarray*}

        Now we need to find how many levels occur before the base case of the recursion is reached:

          \begin{eqnarray*}
                                       s_i & \leq & 1 \\
\Rightarrow \left(\frac{2}{3}\right)^i n   & \leq & 1 \\
\Rightarrow n                              & \leq & \left(\frac{3}{2}\right)^i \\
\Rightarrow \log_\frac{3}{2}n              & \leq & i
\end{eqnarray*}
        Thus we need to bound the contribution of this last, partial level: the leaves of the recurrence tree. Each leaf corresponds to a subproblem of size at most 1 but greater than 1/3 (its parent had size greater than 1, and even the smaller child receives a third of the parent's size). Since the recursion splits each subproblem's size exactly, the leaf sizes sum to exactly $ n $, and so there can be at most $ 3n $ leaves, each contributing at most $ c $, for a total of at most $ 3cn $. Remember that we are working to establish an upper bound and hence overestimating is perfectly fine.

        Now we have $ \log_{\frac{3}{2}}{n} - 1 $ levels each contributing $ cn $ and the leaves contributing at most an additional $ 3cn $ to the final sum. We can thus write the expression for the sum total of the entire tree:

          \begin{eqnarray*}
 &   & 3cn + \left(\log _{\frac{3}{2}}(n) - 1\right) cn \\
 & = & \left(\log_{\frac{3}{2}}(n) + 2\right) cn \\
 & = & \frac{c}{\lg{\frac{3}{2}}}n\lg{n} + 2cn \\
 & = & O(n\lg{n})
\end{eqnarray*}
      • (15 points) Now, use the (partial) substitution method to find a tight upper bound for T(n). Remember that the substitution method involves 2 steps:
        1. (3 points) Guess a good upper bound. You can use the function f(n) that you obtained using the recursion-tree method above as your initial guess.
          Given we have shown that $ T(n) = O(n\lg{n}) $, we could set our guess at $ k n \lg{n} $, which would be fine, but given the prevalence of fractions with denominator three, let's use log base 3 instead: $ k n \log_3{n} $.
        2. (12 points) Use mathematical induction to refine your guess (i.e., select the appropriate constants and/or add terms to your guess) and to show that the solution works.
          Let's assume that for all $ m < n $ we have $ T(m) \leq km\log_3{m} $. Thus $ T(n) $ is:
            \begin{eqnarray*}
T(n) & =    & T(\tfrac{1}{3} n) + T(\tfrac{2}{3} n) + cn \\
     & \leq & \frac{1}{3}kn\log_3\left(\frac{1}{3}n\right) + \frac{2}{3}kn\log_3\left(\frac{2}{3}n\right) + cn \\
     & =    & \frac{1}{3}kn\left(\log_3(n)-1\right) + \frac{2}{3}kn\left(\log_3(n) + \log_3(2) - 1\right) + cn \\
     & =    & kn\log_3(n)\left(\frac{1}{3}+\frac{2}{3}\right) - \frac{1}{3}kn + \frac{2}{3}kn\left(\log_3(2) - 1\right) + cn \\
     & =    & kn\log_3(n) + \left(kn\left(\frac{2}{3}\left(\log_3(2) - 1\right) - \frac{1}{3}\right) + cn \right) \\
     & =    & kn\log_3(n) + \left(kn\left(\frac{2}{3}\log_3(2) - 1\right) + cn \right)
\end{eqnarray*}

          To make this match our guess, we need to pick a $ k $ large enough that the right part of the above expression is at most zero, or more precisely:

            \begin{eqnarray*}
             kn\left(\frac{2}{3}\log_3(2) - 1\right) + cn & \leq & 0 \\
\Rightarrow k \left(\frac{2}{3}\log_3(2) - 1\right)       & \leq & - c \\
\Rightarrow k                                             & \geq & \frac{-c}{\frac{2}{3}\log_3(2) - 1} \\
                                                          & =    & \frac{c}{1 - \frac{2}{3}\log_3(2)} \\
                                                          & \approx & 1.726\,c
\end{eqnarray*}

          (Note that $ \frac{2}{3}\log_3(2) - 1 $ is negative, so dividing by it reverses the inequality.)

          Thus our refined solution to the recurrence is $ T(n) \leq \frac{c}{1 - \frac{2}{3}\log_3(2)} n \log_3{n} \approx 1.726\, c\, n \log_3{n} $. One last check concerns the base case of the recurrence: for $ n \leq 1 $ the bound $ kn\log_3{n} $ is non-positive and thus cannot dominate the base-case constant $ c $. The standard remedy is to start the induction at a slightly larger base case, choosing $ k $ large enough that the bound also holds for all $ 1 < n \leq 3 $; this does not affect the conclusion that $ T(n) = O(n\log{n}) $.
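The constant derived above can be checked numerically; this Python sketch assumes c = 1 and real-valued subproblem sizes, with T(n) = c for n ≤ 1:

```python
import math

C = 1.0  # assumed value of the constant c

def T(n):
    """T(n) = T(n/3) + T(2n/3) + cn, with T(n) = c for n <= 1."""
    if n <= 1:
        return C
    return T(n / 3) + T(2 * n / 3) + C * n

# the constant from the substitution step: c / (1 - (2/3) log_3 2)
k = C / (1 - (2 / 3) * math.log(2, 3))

# for moderately large n, the ratio T(n) / (n log_3 n) should be a
# bounded constant, consistent with T(n) = O(n log n)
n = 3.0 ** 9
ratio = T(n) / (n * math.log(n, 3))
```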

    4. (25 points) Consider the following search problem:
          Input:
           - An array A of n positions, containing a sorted (in ascending order) sequence of 
             numbers. That is  A[1] ≤ A[2] ≤ ... ≤ A[n].
           - A value v.
          Output:
           - An index i such that v = A[i], OR
             the message "v is not in A" if v doesn't appear in A.
          
      Consider the following divide-and-conquer solution to this problem. This solution is called binary search: Compare the midpoint of the array A with v and eliminate half of the array from further consideration, until either v is found in A, or no elements in A remain under consideration.

      • (7 points) Provide a detailed algorithm to solve this problem.
        Consider the following algorithm:

          Binary-Search(A[1...n],v) // n = size of A //
          • Return Binary-Search-Aux(A,v,1,n)

          Binary-Search-Aux(A,v,i,j)
          • If i > j
            • Return "v is not in A"
          • Else
            • Let k = floor((i+j)/2)
            • If A[k]=v
              • Return k
            • Else If A[k] > v
              • Return Binary-Search-Aux(A,v,i,k-1)
            • Else // A[k] < v //
              • Return Binary-Search-Aux(A,v,k+1,j)
      • (6 points) Prove that your algorithm is correct (that is, it always returns the right answer).
        The algorithm as described progressively narrows down the portion of A that is considered. To show the correctness of the method let's show the following three claims:

          Claim: At any point in the execution, if v appears in the array A, the correct index (i.e., the position of v in the array) is somewhere between i and j.
          Proof: Initially i=1 and j=n, bounding the entire array A, and in this case the right index is certainly somewhere in the claimed range. Next consider the midpoint index k between i and j. Since the array A is sorted in ascending order, the condition A[k] > v implies that the correct element cannot have index k or larger, as the array values there are all at least A[k], which is already larger than v. Hence in this case the correct index must lie in the lower half of the range, which is exactly the range passed to the recursive call. A similar argument handles the opposite case, A[k] < v. Finally, in the last case A[k] = v, the correct index is simply returned (though this has little to do with the claimed statement).

          Claim: If there is an index i such that A[i] = v then the method will produce this index.
          Proof: Having established that the correct index always lies in the range between i and j, note that the difference j - i strictly decreases with each recursive call. Thus if the correct index is not hit as a midpoint k earlier, eventually i = j, at which point the midpoint k = i = j is the correct index and it is returned as claimed.

          Claim: If there is NO index i such that A[i] = v then the method will output "v is not in A".
          Proof: Note that the only way for an index to be returned is if the value v is actually found in the array. Next notice that after each successive call of the algorithm, the difference between i and j decreases. These two facts lead to the conclusion that eventually the condition i > j will hold. At that point the algorithm returns "v is not in A" as claimed.
      • (12 points) Analyze the complexity of your algorithm. More precisely, provide a recurrence that captures the runtime T(n) of your algorithm, and provide a tight upper bound for T(n). Here, you can either use results proven in our textbook about upper bounds for your recurrence (make sure to mention precisely what result you are using and why it applies to your recurrence). If no result in our textbook matches your recurrence, use one of the methods (recursion-tree or partial substitution) to solve your recurrence.
        First we need to establish the meaning of "size of input", as it is slightly unconventional in the above algorithm. The entire array is passed down recursively each time, so in some respects the size of the input doesn't change. Assuming, though, that arrays are passed by reference, the cost of passing a large array is inconsequential.

        To establish some meaning of "size" we use the array portion that is actively being considered. This corresponds to array elements of index i through j in the algorithm. Initially those are 1 and n respectively and hence represent the entire array (size n). Next notice that the array is "split" by way of finding the middle index k and the recursive calls are made with index ranges of half the size of the original. Thus we can say that the "size" of the input is decreased by half each time a recursive call is made. There are no loops or any other structure other than the recursion and hence the computation performed is the recursion plus some constant. Thus we can write the run-time recursively as:

        • T(1) ≤ c
        • T(n) ≤ T(n/2) + c

          (note: The first of these merely shows that when i=j (size 1), some constant number of operations take place.)

        Now note that T(2) ≤ T(1) + c ≤ 2c. We can therefore say that the value of T(2) is also less than some constant (albeit a different one). Thus we can write:

        • T(1) ≤ c
        • T(2) ≤ c
        • T(n) ≤ T(n/2) + c

        This corresponds to Solved Exercise 1 in the course text (page 242) and is O(lg(n)).
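The pseudocode above translates directly into Python; this sketch uses 0-based indexing instead of the 1-based indexing above, and writes the tail recursion as a loop, which changes nothing in the analysis:

```python
def binary_search(a, v):
    """Return an index i with a[i] == v, or None if v is not in a.

    `a` must be sorted in ascending order; each iteration halves the
    active range [i, j], giving O(lg n) comparisons.
    """
    i, j = 0, len(a) - 1
    while i <= j:
        k = (i + j) // 2         # midpoint of the active range
        if a[k] == v:
            return k
        elif a[k] > v:
            j = k - 1            # v can only be in the left half
        else:
            i = k + 1            # v can only be in the right half
    return None                  # active range is empty: v is not in a
```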