CS 2223 B Term 2005 - HW6 Solutions

Computer Science Department

CS2223 Algorithms
Homework 6 Solutions - B Term 2005

By Prof. Carolina Ruiz and David Toth

Problem 1 - Word Segmentation

Part A
Given a input sequence y₁y₂...y_n of length n, and a function quality that given any sequence of letters returns the quality of the sequence (the larger the value returned, the better the quality).
The problem is to compute the optimal segmentation of the input string into words such that the sum of these words' qualities is the highest possible. In order to compute the optimal segmentation, we follow a dynamic programming approach. We define the optimal segmentation recursively as follows:
Let OPT(j) be the following function defined recursively, for 0 ≤ j ≤ n:
```
 
For j=0: OPT(0) = 0

For j > 0:  OPT(j) =     max      { OPT(k-1) + quality(y_k...y_j) }
                      1 ≤ k ≤ j 
```
This recursive definition states that the optimal way to segment the input string is by finding the best possible split of the input string into two parts: a right-hand side part that will consist of just one word, and a left-hand side part that will be a sequence which will be segmented recursively.
Part B: An algorithm is:
```
{
  OPT[0] := 0;

  for (j := 1 to n) {
     temp := -infinity
     for (k = 1 to j) {
        if temp < OPT[k-1] + quality(y_k...y_j)) then {
           temp := OPT[k-1] + quality(y_k...y_j)) 
        }
     OPT[j] := temp 
  }
}
```
Then you can walk backwards through the OPT[] array, and whenever you see the value go up between x and x-1 as you go backwards, split the word between x-1 and x. That recovers the segmentation of the string into words. Dave has implemented this in Java so you can see a more concrete version, in addition to the algorithm. If you run this, you should pass in bytheway as the command line argument. The program is general enough to handle any other strings in every way except the quality() method is only prepared to handle the English words found in bytheway. You could modify that if you wanted and then you could pass in other strings on the command line.
Part C (optional)
We use two nested for loops (the outer runs from 1 to length of the string and the inner runs from 1 to the outer index) and all the operations inside the loops take constant time. Note that for each value of j in the outer loop, the inner look is executed j times. Each execution of the inner loop takes constant time. Hence the execution of both loops combined takes:
1 + 2 + 3 + ... + n = (n * (n + 1))/2 = (n² + n)/2 steps.
Hence, the runtime of the algorithm is O(n²).

Problem 2 - Stocks

Part A
As discussed in class, we can define a recursive equation to solve this problem. The equation is:
```
OPT(j) = max { 0, ( OPT(j-1) + (p(j) - p(j-1))) }
```
Intuitively, OPT(j) represents the maximum profit that can be obtained by selling the stocks on day j.
To provide some intuition behind this recursive equation, note that we want the biggest low to high difference between any two numbers X and Y where Y is some place after X. Note than any number Z in between X and Y that is lower than X actually is X because otherwise Z to Y would have a bigger increase. If Z was greater than Y, then Z would have to be Y or we would have a larger increase between X and Z. Therefore, our optimal buy/sell timing is in some interval in the graph of stock prices. We stop an interval before it gets lower than the start of the itself. Otherwise, we would run into the X = Z case.
The idea is to divide the graph of stock prices into intervals that begin with a number B and end at the number E immediately before a number less than the beginning of the interval. Start the next interval with the number after E. That lets you see the biggest price increase that can happen in each interval. We keep a temp with the current biggest increase, only replacing it when there's a bigger increase.
Our dynamic programming solution to this problem implements the ideas above by filling in the "optimal" array with the profit you'd make by buying on the lowest day in an interval and selling on the current day. When selling on the current day would cause you to have a net loss, that slot in the array gets a 0, which starts a new interval. After this is done, we scan the optimal array and find the greatest value. The index of this is the sell date and if we step back through the prices array and find the lowest value before the sell date, the corresponding index is the buy date.
Our algorithm works because it always outputs a pair of days to buy and sell on, or the fact that you can't make a profit if the stock price never increases in the given time period. Assume there's a higher increase than the one our algorithm found. Then there are two numbers X and Y with Y some place after X, which forms some interval. But we already explored each interval, keeping track of the biggest one, so if X,Y was the interval in the graph with the biggest increase, we would have stated that. Thus, since we didn't, the interval X,Y cannot be the one with the biggest increase.
Part B:
We have implemented our algorithm as shown in this file.
Part C

The algorithm runs 3 separate for loops. Each loop has d iterations (where d = days we have prices for) and each instruction inside the loops is constant time and there are no function calls inside the loops, so since the instructions inside the loop take finite time, the algorithm terminates in a finite amount of time and outputs an answer.
There are also a few initialization and cleanup instructions that take constant time on either end, giving us:
C_init + (C1_loop_instructions * n) + (C2_loop_instructions * n) + (C3_loop_instructions * n) + C_cleanup = C_init + (C4_loop_instructions * n) + C_cleanup.

Hence, the runtime of the algorithm is O(n).

CS2223 Algorithms Homework 6 Solutions - B Term 2005

By Prof. Carolina Ruiz and David Toth

Problem 1 - Word Segmentation

Problem 2 - Stocks

CS2223 Algorithms
Homework 6 Solutions - B Term 2005