CS 2223 B Term 2013

Computer Science Department

CS2223 Algorithms
Homework 2 - B Term 2013

PROF. CAROLINA RUIZ

Due Date: Nov. 12, 2013 at 1:00 PM code, and 2:00 PM written report
See course policy regarding late homework submissions

Homework Instructions:

Read Chapters 2 and 3 of the textbook in detail.
Study the following resources from my previous offerings of this course:
(Important Note: Some of the links below are restricted to the WPI domain. So you need to be on the WPI network to be able to access them.)
- D term 2009
  - HW2 and HW2 Solutions
  - Exam 1 Solutions
- D term 2008
  - HW2 and HW2 Solutions
  - Exam 1 Solutions
- B term 2005
  - HW1 and HW1 Solutions
  - HW2 and HW2 Solutions
  - Quiz 1 Solutions
  - Exam 1 Solutions
See Useful facts about logarithms and exponentials.
See Useful theorems for comparing the rates of growth of functions.
This is an individual homework. No collaboration is allowed.
Remember to show your work and explain your answers. Correct answers without complete explanations won't receive full credit. Incorrect answers with explanations may receive partial credit.
Submission:
- Code Submission: Submit your Python code by 1:00 pm using the Blackboard site for CS2223 available from myWPI. Login to myWPI, go to CS2223 under "My Courses", then go to "Assignments", and submit your homework under "HW2".
- Submit a hardcopy of your written homework solutions by the beginning of class the day the homework is due.
- See the course webpage for the late homework submission policy.

Homework Problems:

(70 points) Problem 1. Consider the following problem: Given n sets S₀, ..., S_n-1 where each of the sets S_i is a SUBSET of {0,1,...,n-1}, determine if there is a pair of sets among the ones given, say S_i and S_j, that are disjoint (that is, don't have any elements in common).

The following algorithm solves this problem:

INPUT:
n: number of sets
S₀, ..., S_n-1, where S_i is a SUBSET of {0,1,...,n-1}.

OUTPUT:
True, if there is a pair of the input sets S_i and S_j that are disjoint (in this case, i  and j are printed).
False, otherwise. 


ALGORITHM: 


  
 cost of each instruction  
# of times instruction will be repeated 



 for each set S_i:
     for each other set S_j:
         disjoint = true
         for each element k of S_i:
             if k belongs to S_j:
                 disjoint = false
         if disjoint == true
             print 'Sets ',i,j,' are disjoint'
             return(true)

 return(false)

Below we will consider two different, alternative data structures to represent these sets, and see how T(n) is affected by the choice of data structure.

T(n) will be affected also by how you implement the "for each other set S_j" instruction. Note that it is enough to consider j > i only. [Hint: Studying the runtime analysis of InsertionSort in section 2.2 of the textbook will help you think of ideas to express the number of iterations of the instructions in the algorithm above.]

Implementation Version 1: Suppose that we use a Python list to represent each set. As an example, take n = 5 and S₀ = {2, 3, 0}, S₁ = {4, 2}, S₂ = {1, 4, 3, 0}, S₃ = {2}, S₄ = {}.
We will represent the list of these sets in Python as:
```
[[2, 3, 0], [4, 2], [1, 4, 3, 0], [2], []]
```
1. (10 points) Use worst case analysis to analyze the runtime of the algorithm if it is implemented using this data structure. That is, calculate the function T(n) for this algorithm by estimating the cost of performing each of the instructions above, and the number of times that each instruction would be executed in the worst case. Explain your work.
  In order to determine "if k belongs to S_j" in the algorithm above, traverse the list S_j element by element until either you find k, or you have finished traversing the list. That is, do not use any specialized Python functions to search a list (since you wouldn't know the time complexity of such a Python function to include in your runtime analysis).
2. (5 points) Provide a function g(n) such that T(n) = O(g(n)). Make the asymtotic upper bound g(n) as tight as possible.
3. (5 points) Write a rigorous mathematical proof showing that T(n) = O(g(n)).
Implementation Version 2: Suppose that we use a Python list to represent each set, but this time a set will be represented as a list of size n where position k of the list will be 1 if k belongs to the set, and 0 otherwise. As an example, take n = 5 and S₀ = {2, 3, 0}, S₁ = {4, 2}, S₂ = {1, 4, 3, 0}, S₃ = {2}, S₄ = {}.
We will represent the list of these sets in Python as:
```
[[1, 0, 1, 1, 0], [0, 0, 1, 0, 1], [1, 1, 0, 1, 1], [0, 0, 1, 0, 0], [0, 0, 0, 0, 0]]
```
1. (10 points) Use worst case analysis to analyze the runtime of the algorithm if it is implemented using this data structure. That is, calculate the function T(n) for this algorithm by estimating the cost of performing each of the instructions above, and the number of times that each instruction would be executed in the worst case. Explain your work.
  Once again, in order to determine "if k belongs to S_j" in the algorithm above, do not use any specialized Python functions to search a list. Use the obvious way to determine if k belongs to S_j that takes advantage of the set representation used in this Version 2.
2. (5 points) Provide a function g(n) such that T(n) = O(g(n)). Make the asymtotic upper bound g(n) as tight as possible.
3. (5 points) Write a rigorous mathematical proof showing that T(n) = O(g(n)).
Experimental Comparison of the two Versions:
1. (5 points) Implement each of the two versions of the the algorithm in Python as two separate functions inside the same Python program:
```
    def DisjointSetsVersion1(L):
        ...

    def DisjointSetsVersion2(L):
        ...
```
2. For each of the following values of n, n = 100, 200, 300, 400, 500:
  1. (5 points total) Randomly generate an input of size n. That is, a collection of n sets as described above.
  2. (5 points total) Convert this input into each of the two data representations (version 1 and version 2).
  3. (5 points total) Run each of the two functions on this input, timing its execution. Include the execution times in your report.
3. (10 points) Plot a graph where the x axis corresponds to n and the y axis corresponds to execution time. Include in this plot all the execution times recorded for the two functions and for each of the values of n. Use some convention (e.g., color) to distinguish between the execution times of DisjointSetsVersion1 and DisjointSetsVersion2 in the plot. Describe any observations you can make about this plot. Do the growth of these curves match the asymptotic behavior of these functions described by your Big-O analysis above? Explain.
Submit your code on myWPI.

(70 points) Problem 2. Rank the following functions by order of growth. That is, find an arrangement f₁, f₂, ..., f₆ of the functions satisfying f₁ = Ω(f₂), f₂ = Ω(f₃), ..., f₅ = Ω(f₆). Partition your list into equivalent classes such that functions f(n) and g(n) are in the same class if and only if f(n) = Θ(g(n)).
aⁿ
n^a
a^{n^a}
log_a(n)
a^a*n
n^log_a(n)
where a is a constant greater than 1.
Provide a detailed, rigorous proof of each part of your solution.
(20 points) Problem 3. Find a function f(n) and a function g(n) such that f(n) ≠ O(g(n)), f(n) ≠ Ω(g(n)), and f(n) ≠ Θ(g(n)). Explain your answer in detail.
(15 points) Problem 4. Write a formal mathematical proof of the following property of Big-O.
Transitivity of Big-O: If f(n) = O(g(n)) and g(n) = O(h(n)) then f(n) = O(h(n)).
(25 points) Problem 5. Consider the following sorting methods: bubble sort, insertion sort, merge sort, and quick sort, as described and implemented in the online textbook Problem Solving with Algorithms and Data Structures by Brad Miller and David Ranum.
1. Read the description of each of these 4 sorting methods.
2. Include in a single Python program the code provided on the aforementioned webpage implementing each of these 4 sorting methods.
  Note:If Python produces "RuntimeError: maximum recursion depth exceeded", then increase the number of recursion levels allowed by adding the following code to your program:
```
import sys
sys.setrecursionlimit(12000)
```
  I include here the skeleton of a Python program that I wrote which satisfies the requirements below.
3. Let n be the size of the list to be sorted. For each n, n = 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, and 10000, generate a list of n elements in decreasing order:
```
 
[n-1, n-2, ..., 2, 1, 0]
```
4. (10 points) Run each of the 4 sorting functions on this input, timing its execution. Include the execution times in your report.
5. (15 points) Plot a graph where the x axis corresponds to n and the y axis corresponds to execution time. Include in this plot all the execution times recorded for the 4 sorting functions and for each of the values of n. Use some convention (e.g., color) to distinguish between the execution times of the different sorting functions in the plot. Describe any observations you can make about this plot. Do the growth of these curves match the asymptotic behavior of these functions described by the Big-O analysis of these sorting methods on the webpage? Explain.
Submit your Python program on myWPI.

CS2223 Algorithms Homework 2 - B Term 2013

PROF. CAROLINA RUIZ

Homework Instructions:

Homework Problems:

CS2223 Algorithms
Homework 2 - B Term 2013