### CS2223 Algorithms Homework 4 - D Term 2008

#### PROF. CAROLINA RUIZ

Due Date: April 8, 2008 at 1:00 PM
See course policy regarding late homework submissions

• Homework Instructions:
• Read Chapter 4 of the textbook in detail.
• See Useful logarithm and exponential facts.
• This is an individual homework. No collaboration is allowed.
• Submit your code (i.e., your solution to Problem 3) using turnin. by 12:30 pm on the day the homework is due. We suggest to zip your program file with any auxiliary files it needs to run. The name of the homework in turnin is HW4.
• Submit a hardcopy of your written homework solutions to Problems 1 and 2 by the beginning of class (i.e., 1:00 pm) the day the homework is due.
• See the course webpage for the late homework submission policy.

• Homework Problems:

1. (100 Points) Problem 1. Assume that you are hired by a coffee company to design an algorithm that will help them arrange coffee shippings. The company distributes several different types of coffee, each one with a certain price per pound. For each shipping, your algorithm will receive as input the following information:
```W: Maximum number of coffee pounds that the delivery van can carry.
n: Number of different types of coffee that can be shipped.
(Let's call these n coffee types T1, ..., Tn.)
weight: a 1-dimensional array of size n of integers
cost_per_pound: a 1-dimensional array of size n of reals
such that for each i, 1 ≤ i ≤ n:
weight[i]= maximum number of pounds of coffee type Ti that can be shipped
cost_per_pound[i]= cost of one pound of type Ti coffee
```
Assume that the business is going so well that the company has more coffee to ship than what the delivery van can carry. Hence, your algorithm needs to determine what coffee types, and how much coffee of each of these types, to pack in the van to maximize the total cost of the coffee shipped. That is, your algorithm needs to produce as output:
```ship: 1-dimensional array of size n of reals
such that for each i, 1 ≤ i ≤ n:
ship[i]= amount in pounds of coffee type Ti included in the shipping
(this amount doesn't have to be an integer. For example,
your algorithm might find that for a certain type of coffee
Ti the best amount to pack is 3.75 pounds. In this case the
cost of ship[i] would be 3.75*cost_per_pound[i].
Or if your program determines that this coffee type shouldn't
be shipped at all, then ship[i]=0.)

satisfying:
ship[i] &le weight[i],
Σni=1 ship[i]≤ W, and
Σni=1 ship[i]*cost_per_pound[i] is as large as possible.

```
You assignment is to write a greedy algorithm (in pseudo-code) to solve this problem. With this goal in mind, follow the steps below.

1. (15 Points) Let's determine what selection criterion we can use greedily to produce an optimal solution. Consider the following example:
```W=100, n=5.
T1     T2     T3     T4     T5
weight:	        10     20     30     40     50
cost:           20     30     66     40     60
cost_per_pound: 2.0    1.5    2.2    1.0    1.2
```

• (5 Points) If weight is used as the selection criterion (that is, you pack first as much coffee as you can of the type with the most weight), what would be the amount of each coffee type selected by a greedy algorithm that uses this selection criterion? What would be the total sales price of the resulting shipping?

• (5 Points) If cost is used as the selection criterion, what would be the amount of each coffee type selected by a greedy algorithm that uses this selection criterion? What would be the total sales price of the resulting shipping?

• (5 Points) If cost_per_pound is used as the selection criterion, what would be the amount of each coffee type selected by a greedy algorithm that uses this selection criterion? What would be the total sales price of the resulting shipping?

• Which of these 3 selection criteria above produced the best result?

2. (35 Points) Write your greedy algorithm (in pseudo-code) to solve this problem using the selection criterion that worked the best in the example above. Provide as much explanations as possible to make it easy for others to understand your algorithm. Assume that your algorithm uses a priority queue to select the next best value (according to the selection criterion) to be used.

3. (25 Points) Prove that your algorithm is correct. That is, prove that it always produces the optimal answer for the given input.

4. (25 Points) Analyze the time complexity of your algorithm. Provide a detailed runtime function T(n) for your algorithm and a function g(n) such that T(n) = O(g(n)). Assume that your algorithm uses a priority queue to select the next best value (according to the selection criterion) to be used.

2. (45 Points) Problem 2. Consider an alphabet consisting of symbols a, e, i, o, u. Assume that these symbols occur with the following frequencies in a given corpus: 1/2, 1/4, 1/8, 1/16, and 1/16 respectively.

• (30 Points) Follow the Huffman algorithm to obtain an encoding of these 5 symbol alphabet. Show your work.

• (5 Points) What's the encoding of the word "auuoi"?

• (10 Points) If the encoding is applied to a file consisting of 1 million occurrences of these 5 symbols with the given frequencies, what is the length of the encoded file in bits?

3. (155 Points + 5 extra points) Problem 3. In this problem you are asked to implement Dijkstra's algorithm to find shortest paths in a directed, weighted graph. Your implementation should satisfy all the specifications below:

• Your program should be written in Java or in C. Choose one.

• (20 Points) Your program should be a faithful implementation of the version of the Dijkstra's algorithm presented in class, Section 4.4 of the textbook, and the textbook slides.

• (10 Points) Input specification. Your program must read the input graph from a file. This input file describes the structure of the graph and the weights (costs, distances) of the edges between nodes. Nodes will be denoted by (consecutive) integer numbers, starting with 0.
• The first line of the file contains an integer equal to the number of nodes in the graph.
• The second line of the file contains an integer equal to the node that should be used as the source/start of the algorithm.
• Each subsequent line contains all the information about one edge. Each of these lines has 3 fields separated by a whitespace. The first field is a node label (integer), the second field is also a node label (integer), and the third field is the length or cost (positive, float value) of the edge between the node in the first field and the node in the second field, in this direction.
In total, the file will contain as many lines as there are edges in the graph, plus 2. You may also assume that if a node label doesn't appear in any of the edge lines, then that node has no incident edges (i.e., this node has degree 0). An example of such input file is given below for the graph in Figure 4.7 of the textbook (page 139) where the 6 nodes have been renamed as follows: y=0, u=1, s=2, x=3, v=4, z=5.
```------------------------------------
6
2
2 1 1.0
2 3 4.0
2 4 2.0
1 0 3.0
1 3 1.0
4 3 2.0
4 5 3.0
3 0 1.0
3 5 2.0

------------------------------------
```
Please note that there is a newline character after "3 5 2.0".

• (90 Points) Data Structures: Adjacency Lists, Heaps and Priority Queues. Your program should use the following data structures:

• (10 Points) An adjacency list to represent the input graph. This adjacency list should contain, for each node in the graph, the list of edges that exit that node, with each of these edges annotated with its cost. For example, the entry in the adjacency list for node 4 above would look like:
```        ---------        ---------
4 ---> | 3 | 2.0 | ---> | 5 | 3.0 |
---------        ---------
```

• (65 Points) A priority queue PQ to handle the nodes under consideration by the Dijkstra's algorithm. The key of each node in this priority queue is the cost of the shortest path known so far from the start node to said node.

Your program should implement Heaps and based on them, Priority Queues, satisfying all the specifications below.

• (10 Points) The heaps should rely only on the array data structure in the programming language that you chose. The priority queues should rely only on the heap data structure you implemented. Using heaps and/or priority queues provided in the programming laguage that you chose is not allowed. You need to write your own implementations.

• (20 Points) Your implementation of heaps should include the following functions faithfully implemented from their descriptions in Section 2.5 of the textbook.
• (10 Points) Heapify-up(H,i)
• (10 Points) Heapify-down(H,i)

• (35 Points) Your implementation of priority queues should include the following functions faithfully implemented from their descriptions in Section 2.5 of the textbook.
• (5 Points) StartHeap(N)
• (5 Points) Insert(H,v)
• (5 Points) FindMin(H)
• (5 Points) Delete(H,i)
• (5 Points) ExtractMin(H)
• (10 Points) ChangeKey(H,v,alpha)

• (5 Points) A 1-dimensional array d[u] containing the cost of the shortest path known so far from the start node to node u.

• (5 Points) A 1-dimensional array P[u] containing the previous node to u in the shortest path known so far from the start node to node u.

• (10 Points) Output specification. Your program should print out one line per node in the graph. In the line for node u, three parts should be printed separated by whitespaces: the number corresponding to node u; the cost of the minimum path from the start node to u (or "infinity" if there is no path between them); and the actual path from the start node to u (or "no path" if there is no path between them). For example, if u=5 and the start node is 2, as in the sample input above, then the output line for u should be:
```5 4 2->1->3->5
```

• (30 Points)Testing. Your code will be tested with several different input files similar to the following: