Homework 4 Solutions - D Term 2009

**Homework Problems:**

**(40 Points) Problem 1**. Suppose that your MP3 player has a disk capacity of $M$ megabytes. Also assume that you have the following $n$ songs available for download: $S_1, S_2, S_3, \ldots, S_n$. The $i$-th song $S_i$ takes up $m_i$ megabytes. The problem is that your MP3 player capacity, $M$, is less than the sum of all the $m_i$'s. That is, $M < \sum_{i=1}^{n} m_i$.

**(20 Points)** Assume that you want to maximize the number of songs that you download onto your MP3 player. Prove or give a counterexample: A greedy algorithm that selects songs to download in increasing size order (from smallest to largest) will produce an optimal solution.

**Solution:** Yes, this greedy strategy will produce an optimal solution. Here is the proof:

The algorithm starts by sorting the songs in increasing size order. Assume that the resulting sorting is $S_{g_1}, S_{g_2}, S_{g_3}, \ldots, S_{g_n}$. Now, the algorithm will select the first $k$ songs on this resulting list, $S_{g_1}, S_{g_2}, S_{g_3}, \ldots, S_{g_k}$, where $k$ is such that $\sum_{i=1}^{k} m_{g_i} \le M$, but $\sum_{i=1}^{k+1} m_{g_i} > M$.

Assume, by way of contradiction, that the solution produced by this greedy algorithm ($S_{g_1}, S_{g_2}, S_{g_3}, \ldots, S_{g_k}$) is not optimal. Hence there must exist a different solution $S_{d_1}, S_{d_2}, S_{d_3}, \ldots, S_{d_q}$ where $q > k$ (that is, this solution contains more songs than the one produced by the greedy algorithm). Assume that the different solution is sorted in increasing size order: that is, $m_{d_1} \le m_{d_2} \le m_{d_3} \le \ldots \le m_{d_q}$.

Let's compare these two solutions position by position and let $j+1$ be the first position where the two sequences differ:

$S_{g_1}, S_{g_2}, S_{g_3}, \ldots, S_{g_j}, S_{g_{j+1}}, S_{g_{j+2}}, \ldots, S_{g_k}$ **greedy solution**

$S_{d_1}, S_{d_2}, S_{d_3}, \ldots, S_{d_j}, S_{d_{j+1}}, S_{d_{j+2}}, \ldots, S_{d_k}, \ldots, S_{d_q}$ **different solution**

The greedy algorithm selected $S_{g_{j+1}}$ because it was the smallest song not yet selected (i.e., smallest song not in $S_{g_1}, \ldots, S_{g_j}$). The different solution selected a different song $S_{d_{j+1}}$ and hence it must hold that $m_{g_{j+1}} \le m_{d_{j+1}}$, and also that $\sum_{i=1}^{j+1} m_{g_i} \le \sum_{i=1}^{j+1} m_{d_i}$. So the greedy solution "stays ahead" of the different solution. We can continue this reasoning by induction and show that for each $x$, $1 \le x \le q$, $m_{g_x} \le m_{d_x}$, and $\sum_{i=1}^{x} m_{g_i} \le \sum_{i=1}^{x} m_{d_i}$. (Indeed, $m_{g_x}$ is the $x$-th smallest size among all $n$ songs, while $m_{d_x}$ is the $x$-th smallest size within a subset of them.)

But now, the greedy algorithm stopped selecting songs after the $k$-th one. As stated above, this means that $\sum_{i=1}^{k} m_{g_i} \le M$, but $\sum_{i=1}^{k+1} m_{g_i} > M$. Nevertheless, the different solution picked more songs, so it contains at least $k+1$ of them. However, $\sum_{i=1}^{k+1} m_{d_i} \ge \sum_{i=1}^{k+1} m_{g_i} > M$. Hence, the different solution is not even a valid solution, as it picked songs that exceed the disk space limit $M$. This contradicts the assumption that the different solution was indeed a solution better than the greedy one, and hence the greedy solution is optimal in terms of the number of songs selected.
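The smallest-first strategy from the proof can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original solution: the function names, the song sizes, and the capacity are all hypothetical, and the brute-force check is practical only for very small inputs.

```python
from itertools import combinations


def greedy_smallest_first(sizes, capacity):
    """Pick songs in increasing size order until the next one no longer fits."""
    chosen = []
    used = 0.0
    for size in sorted(sizes):
        if used + size > capacity:
            break  # every remaining song is at least this large, so none fits
        chosen.append(size)
        used += size
    return chosen


def max_count_brute_force(sizes, capacity):
    """Largest number of songs that fit, found by trying every subset."""
    for count in range(len(sizes), -1, -1):
        if any(sum(subset) <= capacity for subset in combinations(sizes, count)):
            return count
    return 0


sizes = [7, 2, 5, 3, 4]  # hypothetical song sizes in megabytes
capacity = 10            # hypothetical capacity M
picked = greedy_smallest_first(sizes, capacity)
best_count = max_count_brute_force(sizes, capacity)
print(picked, best_count)
```

On this small input the greedy choice and the exhaustive search agree on the number of songs, which is what the proof predicts for every input.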

**(20 Points)** Assume that you want to maximize the disk utilization of your MP3 player (that is, you want to use as many of your M megabytes as possible). Prove or give a counterexample: A greedy algorithm that selects songs to download in decreasing size order (from largest to smallest) will produce an optimal solution.

**Solution:** This greedy strategy is not optimal. Consider the following counterexample. Let the sizes of the songs be 5, 4, 3, and 1.5 megabytes, and let the disk capacity M of your MP3 player be 10 megabytes. The greedy strategy would choose the songs with sizes 5 and 4; neither of the two remaining songs fits in the 1 megabyte that is left. The disk utilization will be 9 megabytes (that is, 90%). However, choosing the songs with sizes 5, 3, and 1.5 would be a better solution, as it would use 9.5 megabytes (that is, 95%).
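The counterexample can be checked mechanically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the greedy variant shown skips a song that does not fit and keeps scanning (a variant that stops at the first song that does not fit gives the same result on this input). The exhaustive search is feasible only because there are four songs.

```python
from itertools import combinations


def greedy_largest_first(sizes, capacity):
    """Scan songs from largest to smallest, taking each one that still fits."""
    chosen, used = [], 0.0
    for size in sorted(sizes, reverse=True):
        if used + size <= capacity:  # skip songs that do not fit, keep scanning
            chosen.append(size)
            used += size
    return chosen


def best_utilization(sizes, capacity):
    """Subset with the largest total size not exceeding capacity (all subsets tried)."""
    best = ()
    for count in range(len(sizes) + 1):
        for subset in combinations(sizes, count):
            if sum(best) < sum(subset) <= capacity:
                best = subset
    return best


sizes = [5, 4, 3, 1.5]  # song sizes from the counterexample, in megabytes
capacity = 10           # disk capacity from the counterexample
greedy = greedy_largest_first(sizes, capacity)
optimal = best_utilization(sizes, capacity)
print(sum(greedy), sum(optimal))
```

The greedy choice uses 9 megabytes while the exhaustive search finds a subset using 9.5, matching the figures in the solution.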

**(160 points) Huffman's Optimal Prefix Codes Algorithm**

Before you can solve this exercise, you need to read Sections 2.5 and 4.8 of the textbook in detail.

**Solution:** Included below are the files containing Shweta Srivastava's Java implementation of heaps, priority queues, treeNodes, and Huffman's algorithm: