CS 2223 May 11 2020
1 Final Preparations
1.1 Data Structures
1.2 Types
1.3 Performance classifications
1.4 Mathematical Analysis
1.4.1 Best-Case and Worst-Case
1.4.2 MergeSort example, which has subtle differences in best/worst cases
1.5 Algorithm Families
1.6 Daily Question
1.7 Version : 2020/05/11

Lecture Path: 27
Expected reading:
Daily Exercise:
Classical selection: Beethoven: Symphony No. 9 (1824)

Musical Selection: Questions, Moody Blues (1970)

Visual Selection: The Scream, Edvard Munch (1893)

Daily Question: DAY27 (Problem Set In Canvas)

1 Final Preparations

Please take a few minutes before class begins to log into Canvas and complete the course evaluation. I use this information to improve my next course offering. You have until 10PM tomorrow night, but please try to complete the survey today so you don’t forget.

HW4 is due today at 6PM.

1.1 Data Structures

"The time has come,"
the Walrus said, "To talk of many things:
Of shoes–and ships–
and sealing-wax–
Of cabbages–and kings–
And why the sea is boiling hot–
And whether pigs have wings."
Lewis Carroll

You are assumed to know the following basic structures:

You know when you should use these structures, and the implication of accessing aggregate data when stored in these structures.

You know about access performance in unordered arrays and linked lists.

Sample Question: You can locate the maximum value in an array of N elements in N-1 comparisons. Now assume you want to find the smallest and the largest value in the same array. What is the lower bound on the number of comparisons you need to make (i.e., the best case)? What is the upper bound on the number of comparisons you need to make (i.e., the worst case)? Can you provide sample instance problems with four elements to cover both these cases?

We saw that linked lists are useful for storing loosely structured collections of values. They are used to implement Bag types, where there is no need to search for a particular value, only to retrieve all values in the Bag.

You have seen how a linked list can effectively implement a queue by maintaining two separate pointers, first and last.
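The two-pointer idea can be sketched as follows. This is a minimal illustration in Python rather than the course's own implementation; the class and method names (Node, LinkedQueue, enqueue, dequeue) are assumptions chosen for this example.

```python
class Node:
    """One link in the list: a value plus a reference to the next node."""

    def __init__(self, value):
        self.value = value
        self.next = None


class LinkedQueue:
    """FIFO queue backed by a singly linked list (hypothetical sketch)."""

    def __init__(self):
        self.first = None  # oldest node; dequeue removes from here
        self.last = None   # newest node; enqueue appends here
        self.size = 0

    def is_empty(self):
        return self.first is None

    def enqueue(self, value):
        """Append at the tail in constant time using the last pointer."""
        node = Node(value)
        if self.is_empty():
            self.first = node
        else:
            self.last.next = node
        self.last = node
        self.size += 1

    def dequeue(self):
        """Remove from the head in constant time using the first pointer."""
        if self.is_empty():
            raise IndexError("dequeue from empty queue")
        value = self.first.value
        self.first = self.first.next
        if self.first is None:
            # The queue is now empty, so last must not keep the old node.
            self.last = None
        self.size -= 1
        return value
```

Because enqueue touches only last and dequeue touches only first, neither operation has to traverse the list.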

Sample Question: Explain how to use a linked list to implement the stack data type.

1.2 Types

You should be well-versed in the basic data types used in this course. This includes:

These types all have common operations as well as specific ones. For example, priority queue and binary search tree both support a deleteMin operation.
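As a rough illustration of a shared operation, the sketch below gives a heap-based priority queue and an unbalanced binary search tree the same delete_min operation. It is only a sketch under stated assumptions: the class names MinPQ and BST and the method names insert and delete_min are hypothetical, the heap is Python's standard heapq module rather than the course's implementation, and the tree assumes distinct keys.

```python
import heapq


class MinPQ:
    """Min priority queue backed by a binary heap (Python's heapq)."""

    def __init__(self):
        self._heap = []

    def insert(self, key):
        heapq.heappush(self._heap, key)

    def delete_min(self):
        # heappop removes and returns the smallest key.
        return heapq.heappop(self._heap)


class BST:
    """Unbalanced binary search tree holding distinct keys."""

    class _Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def __init__(self):
        self._root = None

    def insert(self, key):
        self._root = self._insert(self._root, key)

    def _insert(self, node, key):
        if node is None:
            return BST._Node(key)
        if key < node.key:
            node.left = self._insert(node.left, key)
        elif key > node.key:
            node.right = self._insert(node.right, key)
        return node

    def delete_min(self):
        if self._root is None:
            raise KeyError("delete_min from empty tree")
        # The minimum key is the leftmost node in the tree.
        parent, node = None, self._root
        while node.left is not None:
            parent, node = node, node.left
        # Splice the leftmost node out by promoting its right subtree.
        if parent is None:
            self._root = node.right
        else:
            parent.left = node.right
        return node.key
```

Both types return keys in ascending order under repeated delete_min calls, even though the underlying structures differ.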

Sample Question: How does a Min Priority Queue support a decreaseKey operation?
And why is it hard to envision adding an increaseKey operation?

Sample Question: You are given a connected undirected graph with an even number of vertices, V, and an even number of edges, E. This graph can be split into two graphs G1 and G2, each of which contains half of the vertices and half of the edges from the original graph. True or false? If false, provide a counterexample. If true, explain your reasoning.

1.3 Performance classifications

We finally introduced the Big O notation as a means to classify the order of growth of a function, which typically represents the run-time performance of an algorithm or the exact number of times an operation executes. This provided the finishing touches on the performance analysis that we conducted throughout the term.

You should be able to reflect on the performance families we have seen:

Sample Question: You are given a recurrence equation T(N) that is used to estimate the running time performance of an algorithm. You are told that T(N) = 2*T(N/3) + N/3. What is the overall classification of T(N) using the above families?

1.4 Mathematical Analysis

We have seen situations where we were concerned about counting the exact number of times an operation executed. Sometimes without knowing the exact input, it is only possible to determine the fewest number of times an operation executed (called the lower bound) or the maximum number of times an operation executed (called the upper bound).

1.4.1 Best-Case and Worst-Case

For example, with BinaryArraySearch, given N integers in ascending sorted order in an array, you can determine whether the array contains a target integer using, in the worst case, no more than floor(log N) + 1 array inspections.

Note: you could try to claim that you need N array inspections by using a simple for loop, but this would not be "the best of the worst case" algorithms.

Thus to use "Big-Oh" notation O(g(n)) to classify the worst-case performance of Binary Array Search on a problem of size N, we would state that the worst case behavior is O(log N).

For this same problem, the best case is you would find the integer after just a single array inspection. In this case using Ω(g(n)) notation to classify the best-case performance of Binary Array Search on a problem of size N, we would state that best case behavior is Ω(1).
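To make the best and worst cases concrete, here is a minimal sketch that counts array inspections during a binary search. It is written in Python for illustration and is not the course's BinaryArraySearch code; the function name binary_array_search and its (found, inspections) return value are assumptions of this example, and log is taken base 2.

```python
def binary_array_search(a, target):
    """Search sorted ascending list a; return (found, inspections)."""
    lo, hi = 0, len(a) - 1
    inspections = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        value = a[mid]        # one array inspection per loop iteration
        inspections += 1
        if value == target:
            return True, inspections
        if value < target:
            lo = mid + 1      # target can only be in the right half
        else:
            hi = mid - 1      # target can only be in the left half
    return False, inspections
```

With a = list(range(16)), searching for 7 returns (True, 1), the best case, because 7 sits at the first midpoint. Searching for 16, which is larger than every element, returns (False, 5), which matches floor(log 16) + 1 = 5.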

1.4.2 MergeSort example, which has subtle differences in best/worst cases

MergeSort is analyzed on pages 272-273 of the book, and in my lecture on Apr 03 2020.

When faced with the final "merge" step in MergeSort one can see that in the best case, one of the sub-arrays contains values that are all smaller than the other sub-array, which means that the merge can complete with N/2 comparisons in the best case. In the worst case, one would need N-1 comparisons (if the values alternated with each other). This analysis was trying to count C(N) or the number of compare invocations needed to sort an array of length N.

In the best case (again assuming that N is a power of 2), we could write:

C(N) >= 2*C(N/2) + N/2

Using telescoping you get:

C(N) >= 2*C(N/2) + N/2
C(N) >= 2*[2*C(N/4) + N/4] + N/2
C(N) >= 2*[2*[2*C(N/8) + N/8] + N/4] + N/2

or

C(N) >= 2^3*C(N/2^3) + 3*N/2

in the general case:

C(N) >= 2^k*C(N/2^k) + k*N/2

and this can continue until k = log N. With a base case of C(1) = 0, this results in:

C(N) >= N*C(1) + log(N) * N/2
C(N) >= 0 + (1/2) * log(N) * N

thus C(N) is Ω(N * log(N)) since this is the best case and we can’t do better than it.

Similarly, for the worst case, you have:

C(N) <= 2*C(N/2) + N - 1
C(N) <= 2*[2*C(N/4) + N/2 - 1] + N - 1
C(N) <= 2*[2*[2*C(N/8) + N/4 - 1] + N/2 - 1] + N - 1

or

C(N) <= 2^3*C(N/2^3) + 3*N - (4 + 2 + 1)

in the general case:

C(N) <= 2^k*C(N/2^k) + k*N - (2^k - 1)

and this can continue until k = log N. With a base case of C(1) = 0, this results in:

C(N) <= N*C(1) + log (N)*N - (N-1)
C(N) <= 0 + N*log(N) - (N-1)

Since (N-1) is much smaller than (N*log (N)) we can omit it from the classification scheme, and thus C(N) is O(N * log(N)) since this is the worst case, for which we will never do worse.

In summary, since MergeSort is classified as both Ω(N*log (N)) and O(N* log (N)) we can state categorically that it is Θ(N*log (N)).
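The two bounds can be checked empirically with a small sketch that counts comparisons while sorting. This is an illustrative top-down merge sort in Python, not the book's implementation; the function name merge_sort_compares and its (sorted list, comparison count) return value are assumptions of this example.

```python
def merge_sort_compares(a):
    """Return (sorted copy of a, number of element comparisons made)."""
    if len(a) <= 1:
        return list(a), 0     # base case: C(1) = 0
    mid = len(a) // 2
    left, c_left = merge_sort_compares(a[:mid])
    right, c_right = merge_sort_compares(a[mid:])

    merged = []
    compares = 0
    i = j = 0
    # Compare only while both halves still have unmerged values.
    while i < len(left) and j < len(right):
        compares += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; the rest is copied without comparisons.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, c_left + c_right + compares
```

For N = 16, an already sorted input uses 32 comparisons, which equals (1/2) * N * log(N). For N = 4, the input [1, 3, 2, 4] forces the final merge to alternate between the halves and uses 5 comparisons, which equals N*log(N) - (N-1).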

1.5 Algorithm Families

We discussed a number of thematically related algorithms:

Given a directed graph, G, compute an undirected graph H in which an edge (u,v) exists in H if either the directed edge u -> v or the directed edge v -> u exists in G. What is the running time/performance of your algorithm?

1.6 Daily Question

The assigned daily question is DAY27 (find it in Canvas, not Assistments).

If you have any trouble accessing this question, please let me know immediately on Piazza.

1.7 Version : 2020/05/11

(c) 2020, George Heineman