CS 2223 Nov 13 2015
Expected reading: 361-374
Daily Exercise:
If you do not expect the unexpected you will not find it, for it is not to be reached by search or trail.
Heraclitus
1 Fundamental Data Types
We are now ready to complete the last of the major data type families for this course. Much like the Platonic solids of geometry, these data types are the fundamental building blocks for algorithms:
Bag (p. 121) – Use with collections of non-comparable objects. Use when you don’t care overmuch about individual retrieval of elements but rather only want to retrieve all elements one at a time.
Stack (p. 121) – Use when you want Last-in, First-out (LIFO) behavior. Can be structured to support expandable collections or can be restricted to fixed capacity.
Queue (p. 121) – Use when you want First-in, First-out (FIFO) behavior. Can be structured to support expandable collections or can be restricted to fixed capacity.
Max Priority Queue (p. 309) – Use when you want to retrieve the specific element that has the "largest value" or "highest priority". Can be structured to support expandable collections or can be restricted to fixed capacity. Can be augmented to support arbitrary re-classification of "value" or "priority" (IndexMinPQ (p. 320), which we will cover in a few weeks).
Symbol Table (p. 363) – Use when you want to associate a value with a key.
We are going to introduce the Symbol Table type today and over the next lecture we will complete its implementation.
Note: For the record on HW2, I should have named the class UniqueCollection instead of UniqueBag because of the above attributes, namely, that a Bag should be considered a structure without any ordering. You had to add ordering to the linked list to meet the performance requirements of the homework assignment.
1.1 Symbol Table
Each symbol table has a type representing the key and a type representing the value. For example, to count word frequencies in text (p. 372) you might want to associate an Integer with a String.
Operation | Description |
void put(Key key, Value value) | associate (key, value) in table |
Value get(Key key) | retrieve value for key (null if key is absent) |
void delete(Key key) | remove (key, value) pair from table |
boolean contains(Key key) | check if table has key |
int size() | return number of pairs |
boolean isEmpty() | determine if table is empty |
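To make the API concrete, here is a minimal sketch of the operations in the table as a Java interface, together with the word-frequency client mentioned earlier. The names SymbolTable, MapBackedST and WordFrequency are assumptions made for this illustration, and the java.util.HashMap backing is only a stand-in so that the client runs; it is not the implementation developed in these notes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface name; the methods mirror the API table.
interface SymbolTable<Key, Value> {
    void put(Key key, Value value);
    Value get(Key key);
    void delete(Key key);
    boolean contains(Key key);
    int size();
    boolean isEmpty();
}

// Stand-in implementation backed by java.util.HashMap, used only so the
// client below can run. It is not the linked-list version from the notes.
class MapBackedST<Key, Value> implements SymbolTable<Key, Value> {
    private final Map<Key, Value> map = new HashMap<>();

    public void put(Key key, Value value) { map.put(key, value); }
    public Value get(Key key)             { return map.get(key); }
    public void delete(Key key)           { map.remove(key); }
    public boolean contains(Key key)      { return map.containsKey(key); }
    public int size()                     { return map.size(); }
    public boolean isEmpty()              { return map.isEmpty(); }
}

class WordFrequency {
    // Associates each distinct word (String) with its count (Integer).
    static SymbolTable<String, Integer> count(String text) {
        SymbolTable<String, Integer> st = new MapBackedST<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) continue;            // blank input
            Integer seen = st.get(word);             // null if not yet present
            st.put(word, seen == null ? 1 : seen + 1);
        }
        return st;
    }

    public static void main(String[] args) {
        SymbolTable<String, Integer> st =
            count("it was the best of times it was the worst of times");
        System.out.println("times -> " + st.get("times"));
        System.out.println("distinct words: " + st.size());
    }
}
```

The client only ever calls get and put, which is why the same code would work unchanged with any other class that implements the interface.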
To date, we have been concerned with collections for storing and retrieving individual items. Now we are contemplating an extension for storing associated pairs.
I claim that we already have the underlying data structures in place to properly implement the Symbol Table type.
1.2 Key Equality
These symbol tables are primarily concerned with testing equality of keys. When primitive data types are compared, the "==" operator is used by default. Here, however, each Key is a full Java object, and "==" on objects only checks whether two references point to the same object. We therefore assume that the Key class has an associated boolean equals (Object o) method that provides a semantic equality that goes beyond reference equality.
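The difference between "==" and equals can be seen in a small sketch. The StudentId key class below is hypothetical and exists only for this illustration; it overrides equals (and hashCode, which Java expects to be kept consistent with equals) so that two distinct objects with the same contents count as the same key.

```java
import java.util.Objects;

class KeyEqualityDemo {
    // Hypothetical key type, used only for this illustration.
    static final class StudentId {
        private final String dept;
        private final int number;

        StudentId(String dept, int number) {
            this.dept = dept;
            this.number = number;
        }

        // Semantic equality: same department and same number.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof StudentId)) return false;
            StudentId other = (StudentId) o;
            return number == other.number && dept.equals(other.dept);
        }

        // Equal keys must report equal hash codes.
        @Override
        public int hashCode() {
            return Objects.hash(dept, number);
        }
    }

    public static void main(String[] args) {
        StudentId a = new StudentId("CS", 2223);
        StudentId b = new StudentId("CS", 2223);
        System.out.println(a == b);       // false: two distinct objects
        System.out.println(a.equals(b));  // true: same contents
    }
}
```

A symbol table that compared keys with "==" would treat a and b as different keys, which is rarely what a client wants.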
1.3 Ordered Symbol Tables
We will begin to cover ordered symbol tables after the first exam.
1.4 Potential Implementation
We have covered enough structures to support the Symbol Table API. Today we will describe this in the context of linked lists that store additional information. This SequentialSearchST<Key, Value> implementation is from p. 375 of the book.
We have already seen how to use linked lists to store information, both ordered and unordered. The change here is to modify the structure of each node. The following defines the class and its inner Node class used to store the information.
public class SequentialSearchST<Key, Value> {
    int N;        // number of key-value pairs
    Node first;   // the linked list of key-value pairs

    // Nodes now store (key and value)
    class Node {
        Key   key;
        Value value;
        Node  next;

        public Node (Key key, Value val, Node next) {
            this.key = key;
            this.value = val;
            this.next = next;
        }
    }
}
As you might imagine, we will build up linked lists of these (key, value) pairs using put(key,value) operations, which only become a bit more complicated because you may be replacing a value that is already associated with key in the SequentialSearchST symbol table.
First observe that there is a useful constructor for creating Node objects from a (key, value) pair and a link to the next Node to use. If the node should have no next link, simply pass null as the third parameter to this constructor.
Here is the put method implementation:
public void put(Key key, Value val) {
    Node n = first;
    while (n != null) {
        if (key.equals (n.key)) {
            n.value = val;
            return;
        }
        n = n.next;
    }

    // add as new node at beginning
    first = new Node (key, val, first);
    N++;
}
The above while loop operates as we have seen for HW2. It visits each Node in the linked list to see if it is the one whose key matches the incoming key parameter. Should there be a match, then this is a request to reassociate the new value val with this existing key, so the value associated with that node in the linked list is updated and the function returns.
Should there be no match with an existing key in the linked list, then we must add a new node. This is done, here, by making it the new first node of the linked list. Don’t forget to increment N, which keeps track of the number of items in the linked list.
1.5 Retrieve information
The get(key) method is even simpler than the put method. You simply traverse the linked list one node at a time, trying to find the node whose key matches the key parameter. If found, return the associated value stored by that Node; otherwise return null.
public Value get(Key key) {
    Node n = first;
    while (n != null) {
        if (key.equals (n.key)) {
            return n.value;
        }
        n = n.next;
    }

    return null;   // not present
}
1.6 Delete information
What if you want to remove a (key, value) pair from the symbol table? Then you would invoke the delete(key) method. To efficiently remove a node from a linked list, you need to know its previous node. But how can you do this if all of the Node objects only point to the next one? The answer is to carry a prev reference along during the traversal.
public void delete(Key key) {
    if (first == null) { return; }

    Node prev = null;
    Node n = first;
    while (n != null) {
        if (key.equals (n.key)) {
            if (prev == null) {         // no previous? Must have been first
                first = n.next;
            } else {
                prev.next = n.next;     // have previous one link around
            }
            N--;                        // one fewer pair in the table
            return;
        }

        prev = n;                       // don't forget to update!
        n = n.next;
    }
}
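The three methods above cover put, get, and delete; the remaining operations in the API table (contains, size, isEmpty) follow directly from get and the pair counter. The following is a minimal sketch that puts everything together in one runnable class. The class name SimpleSequentialST is an assumption chosen to avoid confusion with the book's class, and contains is written under the assumption that null is never stored as a value.

```java
class SimpleSequentialST<Key, Value> {
    private int n;          // number of key-value pairs
    private Node first;     // head of the linked list

    private class Node {
        final Key key;
        Value value;
        Node next;

        Node(Key key, Value value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    public void put(Key key, Value val) {
        for (Node x = first; x != null; x = x.next) {
            if (key.equals(x.key)) { x.value = val; return; }  // replace
        }
        first = new Node(key, val, first);                     // add at front
        n++;
    }

    public Value get(Key key) {
        for (Node x = first; x != null; x = x.next) {
            if (key.equals(x.key)) return x.value;
        }
        return null;                                           // not present
    }

    public void delete(Key key) {
        Node prev = null;
        for (Node x = first; x != null; prev = x, x = x.next) {
            if (key.equals(x.key)) {
                if (prev == null) first = x.next;              // was first
                else prev.next = x.next;                       // link around
                n--;                     // keep the count in step with the list
                return;
            }
        }
    }

    // Assumes null is never stored as a value.
    public boolean contains(Key key) { return get(key) != null; }
    public int size()                { return n; }
    public boolean isEmpty()         { return n == 0; }

    public static void main(String[] args) {
        SimpleSequentialST<String, Integer> st = new SimpleSequentialST<>();
        st.put("a", 1);
        st.put("b", 2);
        st.put("a", 3);                      // replaces the value for "a"
        System.out.println(st.size());       // 2
        System.out.println(st.get("a"));     // 3
        st.delete("a");
        System.out.println(st.contains("a")); // false
    }
}
```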
1.7 Tilde Approximation
Assuming that there were N elements in the Symbol Table, what is your analysis of the running time of the core operations, put, get, and delete? State your answer in terms of N.
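One way to check your answer empirically is to count how many times equals is invoked. The sketch below is an assumption-laden experiment, not part of the book's code: CountingKey is a hypothetical key class that increments a counter on every equals call, and ListST is a stripped-down table that uses the same traversal logic as put and get above.

```java
class ComparisonCount {
    static long equalsCalls = 0;

    // Hypothetical key class that counts every call to equals().
    static final class CountingKey {
        final int id;
        CountingKey(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof CountingKey && ((CountingKey) o).id == id;
        }

        @Override
        public int hashCode() { return Integer.hashCode(id); }
    }

    // Stripped-down table with the same put/get traversal as the notes.
    static final class ListST {
        private static final class Node {
            final CountingKey key;
            Integer value;
            final Node next;
            Node(CountingKey key, Integer value, Node next) {
                this.key = key; this.value = value; this.next = next;
            }
        }

        private Node first;

        void put(CountingKey key, Integer val) {
            for (Node x = first; x != null; x = x.next) {
                if (key.equals(x.key)) { x.value = val; return; }
            }
            first = new Node(key, val, first);
        }

        Integer get(CountingKey key) {
            for (Node x = first; x != null; x = x.next) {
                if (key.equals(x.key)) return x.value;
            }
            return null;
        }
    }

    static ListST build(int n) {
        ListST st = new ListST();
        for (int i = 0; i < n; i++) st.put(new CountingKey(i), i);
        return st;
    }

    // equals() calls needed to insert n distinct keys into an empty table.
    static long buildCost(int n) {
        equalsCalls = 0;
        build(n);
        return equalsCalls;
    }

    // equals() calls needed for one unsuccessful search among n keys.
    static long missCost(int n) {
        ListST st = build(n);
        equalsCalls = 0;
        st.get(new CountingKey(-1));
        return equalsCalls;
    }

    public static void main(String[] args) {
        for (int n = 100; n <= 800; n *= 2) {
            System.out.println("N=" + n + "  miss=" + missCost(n)
                               + "  build=" + buildCost(n));
        }
    }
}
```

Doubling N should double the cost of an unsuccessful search and roughly quadruple the cost of building the table from scratch, which is a useful check on whatever expression in N you wrote down.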
Let’s work on some more Tilde Approximation. The idea is to come up with an equation that can be used to compute the order of growth as N grows. Thus if you have a polynomial equation, you can eliminate the lower order terms regardless of their constants, because as N grows they will matter less and less.
Thus N^3 + 1,000,000*N is going to be ~ N^3.
Why? Because once N is larger than 1,000, N^3 exceeds 1,000,000*N, and from then on N^3 grows much, much faster than N.
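A quick numeric check makes this concrete: if a function is ~ its leading term, the ratio of the two should approach 1 as N grows. The sketch below (class and method names are made up for this illustration) prints that ratio for the cubic example above.

```java
class TildeRatio {
    // Ratio of f(N) = N^3 + 1,000,000*N to its leading term N^3.
    static double ratio(double n) {
        double f = n * n * n + 1_000_000.0 * n;
        double g = n * n * n;
        return f / g;
    }

    public static void main(String[] args) {
        // N = 10, 100, 1,000, ..., 1,000,000
        for (double n = 10; n <= 1e6; n *= 10) {
            System.out.printf("N = %.0f   f(N)/N^3 = %.6f%n", n, ratio(n));
        }
    }
}
```

For small N the linear term dominates and the ratio is huge; as N increases the ratio falls toward 1, which is exactly what the ~ notation asserts.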
Let’s look at some other examples (exercise 1.4.5, p. 208):
N+1
1 + 1/N
(1 + 1/N)*(1+2/N)
2N^3 - 15*N^2 + N
log (2N) / log(N)
log (N^2+1)/log (N)
N^100/2^N
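One way to work through the seven expressions listed above is sketched here, using the definition that f(N) ~ g(N) when f(N)/g(N) approaches 1 as N grows. Treat these as a worked sketch to compare against your own answers rather than an official solution key.

```latex
\begin{align*}
N + 1 &\sim N \\
1 + \tfrac{1}{N} &\sim 1 \\
\left(1 + \tfrac{1}{N}\right)\left(1 + \tfrac{2}{N}\right)
  &= 1 + \tfrac{3}{N} + \tfrac{2}{N^2} \sim 1 \\
2N^3 - 15N^2 + N &\sim 2N^3 \\
\frac{\log(2N)}{\log N} &= \frac{\log 2 + \log N}{\log N}
  = 1 + \frac{\log 2}{\log N} \sim 1 \\
\frac{\log(N^2 + 1)}{\log N} &\sim \frac{\log(N^2)}{\log N}
  = \frac{2\log N}{\log N} = 2 \\
\frac{N^{100}}{2^N} &\sim \frac{N^{100}}{2^N}
  \quad \text{(no lower-order term to drop; the value tends to } 0\text{)}
\end{align*}
```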
1.7.1 Code Tilde Approximation
You also need to be able to determine Tilde approximations directly from code. Here is some code that you should never, ever use for sorting. BubbleSort! *Gasp*
/** Bubble Sort elements. */
public static void sort(Comparable[] a) {
    int N = a.length;
    boolean swapped;
    do {
        swapped = false;
        for (int idx = 1; idx < N; idx++) {
            if (less(a[idx], a[idx-1])) {
                exch(a, idx, idx-1);
                swapped = true;
            }
        }
        N--;   // largest remaining element is now in its final position
    } while (swapped);
}
Let’s analyze performance using Tilde approximation (like p. 181).
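As a starting point for that analysis, consider the worst case, an array in reverse order. The first pass makes N-1 comparisons, the next N-2, and so on down to 1, for a total of N(N-1)/2, which is ~ N^2/2. The sketch below checks this count experimentally; it is an instrumented variant written for this illustration (int[] instead of Comparable[], with the less and exch helpers inlined), not the code from the book.

```java
class BubbleSortCount {
    static long compares;

    // Same loop structure as the Bubble Sort above, specialized to int[]
    // and instrumented to count element comparisons.
    static void sort(int[] a) {
        int n = a.length;
        boolean swapped;
        do {
            swapped = false;
            for (int idx = 1; idx < n; idx++) {
                compares++;
                if (a[idx] < a[idx - 1]) {
                    int tmp = a[idx];
                    a[idx] = a[idx - 1];
                    a[idx - 1] = tmp;
                    swapped = true;
                }
            }
            n--;
        } while (swapped);
    }

    // Comparisons used to sort the reverse-ordered array n, n-1, ..., 1.
    static long comparesOnReversed(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = n - i;
        compares = 0;
        sort(a);
        return compares;
    }

    public static void main(String[] args) {
        for (int n = 100; n <= 800; n *= 2) {
            long c = comparesOnReversed(n);
            System.out.println("N=" + n + "  compares=" + c
                               + "  N^2/2=" + ((long) n * n / 2));
        }
    }
}
```

On input that is already sorted, the same code stops after a single pass of N-1 comparisons, so the ~ N^2/2 figure describes the worst case only.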
1.8 Version : 2015/11/16
(c) 2015, George Heineman