CS 2223 Apr 15 2021
Musical Selection:
Tracy Chapman: Fast Car (1988)
Visual Selection:
Aficionado,
Georges Braque (1912)
Live Selection:
Creep, Radiohead (1994)
Daily Question: DAY14 (Problem Set DAY14)
1 Preparing for Exam 1
1.1 Start with question
You are given an array of N integers in arbitrary order. N is an odd number. The median value of this array is one of its elements, m, such that m is smaller than or equal to (N-1)/2 values in the array, and m is greater than or equal to (N-1)/2 values in the array. In the array [2, 5, 4, 6, 8, 1, 3, 7, 9], the value "5" is a proper median.
Develop some strategies for identifying the median value in an array. Our primary focus will be on minimizing the number of times we compare two values in the array, so we will also want to develop an equation C(n) that determines how many
There are no secrets to success. It is the result of preparation, hard work, and learning from failure.
General Colin Powell
Note: This problem requires more time to think about than granted on an exam. But it makes for a nice discussion point.
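One possible strategy, sketched here only as a starting point for discussion: run Selection Sort just far enough to place the smallest (N+1)/2 values, then read off the middle one. The class name MedianSketch and the comparisons counter are made up for this illustration and are not from the course code. Under these assumptions the comparison count works out to C(N) = 3*(N^2 - 1)/8, which is certainly not the best you can do.

```java
// Hypothetical sketch: median by partial selection sort, counting comparisons.
class MedianSketch {
    static int comparisons = 0;   // times two array values are compared

    // Assumes values.length is odd, as in the problem statement.
    static int median(int[] values) {
        int[] a = values.clone();         // leave the caller's array untouched
        int n = a.length;
        int middle = (n - 1) / 2;         // index of the median once sorted
        comparisons = 0;
        for (int i = 0; i <= middle; i++) {
            int min = i;
            for (int j = i + 1; j < n; j++) {
                comparisons++;
                if (a[j] < a[min]) { min = j; }
            }
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
        }
        return a[middle];
    }

    public static void main(String[] args) {
        int[] a = {2, 5, 4, 6, 8, 1, 3, 7, 9};
        int m = median(a);
        System.out.println(m + " found with " + comparisons + " comparisons");
    }
}
```

For the nine-element array above this reports the median 5 after 8 + 7 + 6 + 5 + 4 = 30 comparisons.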
1.2 Fundamental Concepts
Brief review of the first half of this course, and fundamentals we have been practicing.
We have worked extensively with arrays, both one-dimensional arrays (often called vectors) as well as two-dimensional arrays (often called matrices), such as you found on HW1.
One dimensional arrays have nice search behavior when the values are in sorted order, since you can use BinaryArraySearch to locate a value in the array in floor(log N) + 1 array inspections/comparisons.
This is a fundamental fact we have used a number of times.
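As a reminder of where floor(log N) + 1 comes from, here is a minimal sketch of binary array search that counts array inspections. The class name BinarySearchSketch and the inspections counter are assumptions for this illustration; the version used in lecture may differ in details.

```java
// Hypothetical sketch: binary array search that counts array inspections.
class BinarySearchSketch {
    static int inspections = 0;   // number of array positions examined

    // Assumes 'sorted' is in ascending order.
    static boolean contains(int[] sorted, int target) {
        inspections = 0;
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            inspections++;                // one inspection of sorted[mid]
            int value = sorted[mid];
            if (target < value)      { hi = mid - 1; }
            else if (target > value) { lo = mid + 1; }
            else                     { return true; }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] a = new int[16];
        for (int i = 0; i < a.length; i++) { a[i] = 2 * i; }
        boolean found = contains(a, 100);   // larger than every value
        System.out.println(found + " after " + inspections + " inspections");
    }
}
```

With N = 16 and a target larger than every value, the loop inspects 5 positions, which matches floor(log 16) + 1 = 5.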
What if you had a square matrix?
int n = a.length;   // Square n x n matrix
for (int r = 0; r < n; r++) {
    for (int c = 0; c < n; c++) {
        if (a[r][c] == target) { return true; }
    }
}
In the worst case (the target is not present), this would require n x n comparisons.
Sometimes a two-dimensional array is lower-triangular: there are N rows and N columns, but only the entries on or below the main diagonal are used. Accessing just these entries requires a small modification to the inner for loop:
int n = a.length;   // find number of rows
for (int r = 0; r < n; r++) {
    for (int c = 0; c <= r; c++) {
        if (a[r][c] == target) { return true; }
    }
}
For a given 4x4 matrix:
2 0 0 0
3 1 0 0
6 7 3 0
8 9 1 2
you had to make a total of 1 + 2 + 3 + 4 = 10 comparisons in the worst case.
Given what you know of triangular numbers, this is 4*5/2, which in general would be n*(n+1)/2.
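You can check the n*(n+1)/2 count empirically. The sketch below wraps the lower-triangular loop with a counter; the names TriangularSearchSketch and comparisons are invented for this example.

```java
// Hypothetical sketch: count comparisons when searching a lower-triangular matrix.
class TriangularSearchSketch {
    static int comparisons = 0;

    static boolean contains(int[][] a, int target) {
        comparisons = 0;
        int n = a.length;                     // number of rows
        for (int r = 0; r < n; r++) {
            for (int c = 0; c <= r; c++) {    // only entries on or below the diagonal
                comparisons++;
                if (a[r][c] == target) { return true; }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] a = {
            {2, 0, 0, 0},
            {3, 1, 0, 0},
            {6, 7, 3, 0},
            {8, 9, 1, 2}
        };
        boolean found = contains(a, 42);      // 42 is not in the matrix
        System.out.println(found + " after " + comparisons + " comparisons");
    }
}
```

Searching the 4x4 matrix above for a value that is not present uses all 10 comparisons; searching for 7 stops after 1 + 2 + 2 = 5.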
You are now aware of the five fundamental data types:
Bag (p. 121) – Use with collections of non-comparable objects. Use when you don’t care overmuch about individual retrieval of elements but rather only want to retrieve all elements one at a time.
Stack (p. 121) – Use when you want Last-in, First-out (LIFO) behavior. Can be structured to support expandable collections or can be restricted to fixed capacity.
Queue (p. 121) – Use when you want First-in, First-out (FIFO) behavior. Can be structured to support expandable collections or can be restricted to fixed capacity.
Max Priority Queue (p. 309) – Use when you want to retrieve the specific element that has the "largest value" or "highest priority". Can be structured to support expandable collections or can be restricted to fixed capacity.
Symbol Table (p. 363) – Use when you want to associate a value with a key.
Note that the Max Priority Queue can be augmented to support arbitrary re-classification of "value" or "priority" (IndexMinPQ (p. 320), which we will cover in a few weeks). There is no need to know about IndexMinPQ for the exam tomorrow.
You should know how stacks and queues can both be implemented using arrays or linked lists.
We covered how using arrays would work for a fixed queue of size N, effectively implementing a circular buffer.
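To jog your memory, here is a minimal sketch of a fixed-capacity queue stored in an array and treated as a circular buffer. The class name FixedQueueSketch and its field names are assumptions for this example; the in-class version may track its indices differently.

```java
// Hypothetical sketch: fixed-capacity FIFO queue as a circular buffer.
class FixedQueueSketch {
    private final int[] items;
    private int first = 0;    // index of the oldest element
    private int size = 0;     // number of elements currently stored

    FixedQueueSketch(int capacity) { items = new int[capacity]; }

    boolean isEmpty() { return size == 0; }
    boolean isFull()  { return size == items.length; }

    void enqueue(int v) {
        if (isFull()) { throw new IllegalStateException("queue is full"); }
        items[(first + size) % items.length] = v;   // wrap around the end
        size++;
    }

    int dequeue() {
        if (isEmpty()) { throw new java.util.NoSuchElementException("queue is empty"); }
        int v = items[first];
        first = (first + 1) % items.length;         // wrap around the end
        size--;
        return v;
    }

    public static void main(String[] args) {
        FixedQueueSketch q = new FixedQueueSketch(3);
        q.enqueue(1); q.enqueue(2); q.enqueue(3);
        System.out.println(q.dequeue());   // oldest value comes out first
        q.enqueue(4);                      // reuses the slot freed at index 0
        while (!q.isEmpty()) { System.out.println(q.dequeue()); }
    }
}
```

Both enqueue and dequeue take constant time because no elements are ever shifted; only the two counters change.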
1.3 Recurrence Relations
You have seen these in lecture a number of times. They appear, for example, as follows, in many of the divide and conquer algorithms:
Entire Lecture Apr 06 2021
Daily Exercise, Apr 05 2021
QuickSort (p. 293)
MergeSort (p. 272)
What matters is how you identify the operation being counted, and how you set up the equations.
Let’s try a past exam question:
static int process(int[] a, int lo, int hi) {
    if (lo == hi) { return (int) Math.sqrt(a[lo]); }
    int mid = lo + (hi-lo)/2;
    int x = process(a, lo, mid) + process(a, mid+1, hi);
    for (int i = lo; i <= hi; i += 2) {
        if (Math.sqrt(a[i]) == x) { x++; }
    }
    return x;
}
If you are counting the number of times Math.sqrt is executed when process is invoked on an array of size N = 2^n, then the equation is:
S(1) = 1 (Base Case)
S(N) = S(N/2) + S(N/2) + N/2
since the for loop will execute N/2 times because i is incremented by 2 with each pass.
S(N) = 2*S(N/2) + N/2 and then you work out as we did in past lectures:
S(N) = 2*[2*S(N/4) + N/4] + N/2
S(N) = 2*[2*[2*S(N/8) + N/8] + N/4] + N/2
Regroup to find:
S(N) = 2^3*S(N/2^3) + 3*N/2
How many times can you divide N by 2? Exactly n = log N times (log base 2), which produces:
S(N) = 2^n*S(N/2^n) + n*N/2
But you know S(N/2^n) is S(1), which is 1.
S(N) = 2^n + n*N/2
Substituting N for 2^n and log N for n, you get:
S(N) = N + log(N)*N/2
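A quick way to gain confidence in a closed form like this is to instrument the code and compare. In the sketch below, countedSqrt, count, and formula are helper names invented for this check; process is otherwise the same as the exam question.

```java
// Hypothetical sketch: verify S(N) = N + log(N)*N/2 by counting Math.sqrt calls.
class SqrtCountSketch {
    static int sqrtCalls = 0;

    // Wrapper so every square root is counted.
    static double countedSqrt(double v) { sqrtCalls++; return Math.sqrt(v); }

    static int process(int[] a, int lo, int hi) {
        if (lo == hi) { return (int) countedSqrt(a[lo]); }
        int mid = lo + (hi - lo) / 2;
        int x = process(a, lo, mid) + process(a, mid + 1, hi);
        for (int i = lo; i <= hi; i += 2) {
            if (countedSqrt(a[i]) == x) { x++; }
        }
        return x;
    }

    // Number of sqrt calls for an array of size N = 2^n (contents do not matter).
    static int count(int n) {
        int[] a = new int[1 << n];
        sqrtCalls = 0;
        process(a, 0, a.length - 1);
        return sqrtCalls;
    }

    // Closed form: N + n*N/2 where N = 2^n.
    static int formula(int n) {
        int N = 1 << n;
        return N + n * N / 2;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            System.out.println((1 << n) + "\t" + count(n) + "\t" + formula(n));
        }
    }
}
```

For N = 8 both columns give 20, and for N = 16 both give 48.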
But the same logic can apply to algorithms which only reduce the problem size by one with each sub-problem:
SelectionSort, InsertionSort Apr 01 2021
In this case, the subproblems are not half the size, but just one size smaller:
public static void sort(Comparable[] a) {
    int N = a.length;
    for (int i = 0; i < N; i++) {
        int min = i;
        for (int j = i+1; j < N; j++) {
            if (less(a[j], a[min])) { min = j; }
        }
        exch(a, i, min);
    }
}
How many times (in worst case) does less get invoked?
L(1) = 0 since the inner for loop would do nothing with an array of 1 value.
Note that on the first pass of the outer loop (i = 0), the inner for loop calls less exactly N-1 times, leaving a subproblem of size N-1.
We can write this as:
L(N) = (N-1) + L(N-1)
     = (N-1) + [(N-2) + L(N-2)]
     = (N-1) + (N-2) + [(N-3) + L(N-3)]
     = (N-1) + (N-2) + (N-3) + L(N-3)
     = 3*N - (1 + 2 + 3) + L(N-3)
How many times can you subtract 1 from N until you reach 1? N-1 times.
L(N) = (N-1)*N - (1 + 2 + 3 + ... + (N-1)) + L(N-(N-1))
     = (N-1)*N - (1 + 2 + 3 + ... + (N-1)) + L(1)
     = N*(N-1) - T_{N-1}
where T_{N-1} is the (N-1)th triangular number (p. 185) and L(1) = 0. Indeed, T_K is equal to K*(K+1)/2, thus T_{N-1} is equal to (N-1)*(N-1+1)/2 = N*(N-1)/2, which means
L(N) = N*(N-1) - N*(N-1)/2 = N*(N-1)/2
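The count is the same for every input of size N, which you can confirm by instrumenting less. This is only a sketch: the class name SelectionCountSketch, the lessCalls counter, and the countLess helper are assumptions added for this check.

```java
// Hypothetical sketch: count invocations of less in Selection Sort.
class SelectionCountSketch {
    static int lessCalls = 0;

    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean less(Comparable v, Comparable w) {
        lessCalls++;
        return v.compareTo(w) < 0;
    }

    @SuppressWarnings("rawtypes")
    static void exch(Comparable[] a, int i, int j) {
        Comparable tmp = a[i]; a[i] = a[j]; a[j] = tmp;
    }

    @SuppressWarnings("rawtypes")
    static void sort(Comparable[] a) {
        int N = a.length;
        for (int i = 0; i < N; i++) {
            int min = i;
            for (int j = i + 1; j < N; j++) {
                if (less(a[j], a[min])) { min = j; }
            }
            exch(a, i, min);
        }
    }

    // Sorts a in place and returns how many times less was called.
    static int countLess(Integer[] a) {
        lessCalls = 0;
        sort(a);
        return lessCalls;
    }

    public static void main(String[] args) {
        System.out.println(countLess(new Integer[]{5, 3, 1, 4, 2}));   // N = 5
        System.out.println(countLess(new Integer[]{1, 2, 3, 4, 5}));   // already sorted
    }
}
```

Both calls report 10 = 5*4/2, regardless of the starting order.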
1.4 Worst Case Analysis
We have focused our attention on counting the number of times a key operation is executed within an algorithm solving a problem of size N, where N is typically a power of 2.
Sometimes we can exactly determine the number of times an operation is called, much like L(N) above for Selection Sort.
Sometimes we can only bound it "above" and "below" using Best Case and Worst Case assumptions.
public static void sort(Comparable[] a) {
    int N = a.length;
    for (int i = 0; i < N; i++) {
        for (int j = i; j > 0 && less(a[j], a[j-1]); j--) {
            exch(a, j, j-1);
        }
    }
}
When it comes to Insertion Sort, how many times does less get called?
L(1) = 0, since the inner for loop never executes. For larger values of N, the inner loop will execute, but how many times?
L(N) = ... + L(N-1)
To determine how many times less is called, you have to consider the best case (the numbers are already in sorted order) or worst case (numbers are in reverse sorted order). In the worst case, the inner for loop executes i times, as i increases from 0 to N-1. We only need to consider cases where N > 1, since base case is handled earlier.
L(N) <= (N-1) + L(N-1)
     <= (N-1) + [(N-2) + L(N-2)]
     <= (N-1) + (N-2) + [(N-3) + L(N-3)]
     <= (N-1) + (N-2) + (N-3) + L(N-3)
These are all "<=" because that is worst case analysis. How many times can you subtract 1 from N? N-1 times, so we have:
L(N) <= 1 + 2 + ... + (N-1) + L(N-(N-1)) <= T_{N-1} + L(1) <= N*(N-1)/2
In the BEST case, each pass of the outer loop (for i > 0) makes only one invocation of less, and thus this would be:
L(N) >= 1 + L(N-1)
     >= 1 + [1 + L(N-2)]
     >= 1 + [1 + [1 + L(N-3)]]
     >= 1 + 1 + 1 + L(N-3)
     >= 1 + 1 + 1 + ... + 1 + L(N-(N-1))
Here, you add 1 a total of N-1 times, and so in the best case, only N-1 less comparisons are required.
Thus you know the actual number of invocations of less is bounded as follows:
N-1 <= L(N) <= N*(N-1)/2
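Both bounds are actually reached, which you can see by counting less on sorted and reverse-sorted input. As before, this is a sketch: InsertionCountSketch, lessCalls, and countLess are names made up for this check.

```java
// Hypothetical sketch: best and worst case counts of less for Insertion Sort.
class InsertionCountSketch {
    static int lessCalls = 0;

    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean less(Comparable v, Comparable w) {
        lessCalls++;
        return v.compareTo(w) < 0;
    }

    @SuppressWarnings("rawtypes")
    static void exch(Comparable[] a, int i, int j) {
        Comparable tmp = a[i]; a[i] = a[j]; a[j] = tmp;
    }

    @SuppressWarnings("rawtypes")
    static void sort(Comparable[] a) {
        int N = a.length;
        for (int i = 0; i < N; i++) {
            // j > 0 is tested first, so less is never called when j == 0
            for (int j = i; j > 0 && less(a[j], a[j - 1]); j--) {
                exch(a, j, j - 1);
            }
        }
    }

    // Sorts a in place and returns how many times less was called.
    static int countLess(Integer[] a) {
        lessCalls = 0;
        sort(a);
        return lessCalls;
    }

    public static void main(String[] args) {
        System.out.println(countLess(new Integer[]{1, 2, 3, 4, 5, 6}));   // best case
        System.out.println(countLess(new Integer[]{6, 5, 4, 3, 2, 1}));   // worst case
    }
}
```

With N = 6, the sorted input needs N-1 = 5 calls and the reversed input needs N*(N-1)/2 = 15 calls.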
1.5 Alternate Question Revisited
A researcher proposes to design a data type which consists of a linear linked list of nodes, each of which stores an array of K individual comparable items in sorted order. This data type will offer the following operations:
insert (v) – insert v into the collection
remove (v) – remove v from the collection
contains(v) – check whether collection contains v
So we know the time for contains is not ~ log(N*K); but what is the worst case for this algorithm?
We will discuss. Code will be made available, together with performance results.
1.6 Exam 1 Tomorrow
Exams are closed-book, closed-notes. We will hold a single Zoom session with four breakout rooms. You will be automatically placed into one Zoom room. Find the details on Canvas.
You may bring in one sheet of notes (one paper, 8.5" x 11.5", both sides) to each exam. At the start of the exam, you will turn on your camera so the proctors can see the environment in which you are taking the exam.
1.7 Daily Question
I haven’t done the daily question for today yet! I will do so right after class.
The assigned daily question is DAY14 (Problem Set DAY14)
If you have any trouble accessing this question, please let me know immediately on Discord.
1.8 Version : 2021/04/18
(c) 2021, George T. Heineman