CS 2223 Apr 20 2023
Classical selection: Mendelssohn Symphony No. 4 in A (Italian), (1833)
Musical Selection: My Shot, Lin-Manuel Miranda et al. (2015)
Visual Selection: Diogenes Sitting in his Tub, Jean-Leon Gerome (1860)
Visual Selection: Twist and Shout, The Beatles, 1964
Jazz Selection: Minor Swing, Django Reinhardt (1937)
1 Breadth First Search
1.1 HW4 Refinements
Make sure you get the latest version of the PDF for HW4. The numbers in the Histogram were off (and they are now correct). Also, when working with GPS locations in the "lower 48", please use these conditions to determine when the GPS location could be in the United States (you still have to check the country value just to be sure):
longitude must be >= -124.41 and <= -66.9499
latitude must be >= 24.5231 and <= 46.29204
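The two conditions can be sketched as a small helper. This is only an illustration: the class name Lower48Check and the method name couldBeInLower48 are made up here and are not part of the HW4 starter code; the four bounds are copied from the conditions above.

```java
class Lower48Check {
    // Bounds copied from the HW4 refinement; names are hypothetical.
    static final double MIN_LONGITUDE = -124.41;
    static final double MAX_LONGITUDE = -66.9499;
    static final double MIN_LATITUDE = 24.5231;
    static final double MAX_LATITUDE = 46.29204;

    // True when the point falls inside the bounding box. This is a
    // necessary test only; the country value must still be checked.
    static boolean couldBeInLower48(double latitude, double longitude) {
        return longitude >= MIN_LONGITUDE && longitude <= MAX_LONGITUDE
            && latitude >= MIN_LATITUDE && latitude <= MAX_LATITUDE;
    }

    public static void main(String[] args) {
        System.out.println(couldBeInLower48(42.27, -71.81));   // prints true
        System.out.println(couldBeInLower48(61.22, -149.90));  // prints false
        // Roughly Toronto: inside the box, but not in the United States.
        System.out.println(couldBeInLower48(43.65, -79.38));   // prints true
    }
}
```

The last call shows why the country check is still needed: a point can pass the box test and still lie outside the United States.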
1.2 Details
Breadth First Search works by maintaining the state of the search in a queue. We will investigate this behavior today. Note that in the prior lecture we showed Depth First Search, which seemingly maintains less state, that is, just an array of marked vertices. However, once you recognize that dfs is a recursive function, you will realize that the function call stack holds part of the search state as well.
Keep in mind the behavior of the two types, stacks and queues, and you will see those behaviors appear in dfs and bfs. dfs will find a path (should it exist) between two vertices, but it makes no guarantees as to the length of that path. bfs, in contrast, also finds a path, but now we have a firm guarantee that no shorter path exists. Naturally, there may be other paths of equal length.
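To make the difference concrete, here is a minimal sketch that runs both searches on a four-vertex cycle and counts the edges on the path each one records from vertex 0 to vertex 3. It uses plain java.util adjacency lists instead of the Graph class from lecture, and the class and method names (DfsVsBfs, graph, pathLength) are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class DfsVsBfs {
    // Adjacency lists for an undirected graph; edges are added in the order given.
    static List<List<Integer>> graph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Recursive DFS: the call stack plays the role of the stack.
    static void dfs(List<List<Integer>> adj, int v, boolean[] marked, int[] edgeTo) {
        marked[v] = true;
        for (int w : adj.get(v)) {
            if (!marked[w]) {
                edgeTo[w] = v;
                dfs(adj, w, marked, edgeTo);
            }
        }
    }

    // BFS: an explicit queue holds the marked vertices still to be processed.
    static void bfs(List<List<Integer>> adj, int s, boolean[] marked, int[] edgeTo) {
        Queue<Integer> q = new ArrayDeque<>();
        marked[s] = true;
        q.add(s);
        while (!q.isEmpty()) {
            int v = q.remove();
            for (int w : adj.get(v)) {
                if (!marked[w]) {
                    edgeTo[w] = v;
                    marked[w] = true;
                    q.add(w);
                }
            }
        }
    }

    // Number of edges on the recorded path from s to t (t must be reachable).
    static int pathLength(int[] edgeTo, int s, int t) {
        int edges = 0;
        for (int x = t; x != s; x = edgeTo[x]) edges++;
        return edges;
    }

    static int dfsPathLength(List<List<Integer>> adj, int s, int t) {
        int[] edgeTo = new int[adj.size()];
        dfs(adj, s, new boolean[adj.size()], edgeTo);
        return pathLength(edgeTo, s, t);
    }

    static int bfsPathLength(List<List<Integer>> adj, int s, int t) {
        int[] edgeTo = new int[adj.size()];
        bfs(adj, s, new boolean[adj.size()], edgeTo);
        return pathLength(edgeTo, s, t);
    }

    public static void main(String[] args) {
        // The cycle 0-1-2-3-0; vertex 3 is a direct neighbor of 0.
        List<List<Integer>> adj = graph(4, new int[][] {{0, 1}, {1, 2}, {2, 3}, {0, 3}});
        System.out.println(dfsPathLength(adj, 0, 3));  // prints 3
        System.out.println(bfsPathLength(adj, 0, 3));  // prints 1
    }
}
```

With this adjacency order, dfs walks all the way around the cycle before it reaches vertex 3, while bfs reaches it over the single edge (0, 3). A different adjacency order could let dfs get lucky; bfs gives the short answer regardless.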
Here is a brief code snippet showing Breadth First Search. It has nearly the same structure as the Depth First Search. Additionally it stores a distTo[] value for each vertex. Initially this is set to Positive Infinity because there may in fact be no path from s to each vertex.
void bfs(Graph G, int s) {
    marked = new boolean[G.V()];
    edgeTo = new int[G.V()];
    distTo = new int[G.V()];              // allocate before the loop below writes to it
    Queue<Integer> q = new Queue<Integer>();

    // no vertex is known to be reachable yet
    for (int v = 0; v < G.V(); v++) {
        distTo[v] = Integer.MAX_VALUE;
    }

    // the source is at distance 0 from itself
    distTo[s] = 0;
    marked[s] = true;
    q.enqueue(s);

    while (!q.isEmpty()) {
        int v = q.dequeue();
        for (int w : G.adj(v)) {
            if (!marked[w]) {
                edgeTo[w] = v;                // w was reached over the edge (v, w)
                distTo[w] = distTo[v] + 1;
                marked[w] = true;
                q.enqueue(w);
            }
        }
    }
}
In addition to the marked array, which tracks the vertices that have already been seen, it maintains a distTo array that holds the shortest distance found from s to each vertex.
Finally, we introduce a new edgeTo array that can be used to recover the path that was computed. When you see the expression edgeTo[w] this means "the vertex u from which the edge (u, w) was traversed." This array stores information "in reverse order" to let you compute the path that is of shortest distance from vertex s in terms of the total number of edges traversed from s.
Note that it is distTo, not edgeTo, that must be initialized to Positive Infinity (represented by Integer.MAX_VALUE in the code) before the Breadth First Search begins.
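If you want to run the search without the course's Graph and Queue classes, the following is a minimal self-contained sketch that uses java.util collections instead. The names BfsSketch and buildGraph are invented for this example, and filling edgeTo with -1 is an extra assumption made here so that unreached vertices are easy to spot; the lecture code leaves those entries at 0.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class BfsSketch {
    // Adjacency lists for an undirected graph with n vertices.
    static List<List<Integer>> buildGraph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Runs BFS from s and returns { distTo, edgeTo }.
    static int[][] bfs(List<List<Integer>> adj, int s) {
        int n = adj.size();
        boolean[] marked = new boolean[n];
        int[] distTo = new int[n];
        int[] edgeTo = new int[n];
        Arrays.fill(distTo, Integer.MAX_VALUE);  // stands in for Positive Infinity
        Arrays.fill(edgeTo, -1);                 // assumption: -1 means "no edge recorded"

        Queue<Integer> q = new ArrayDeque<>();
        distTo[s] = 0;
        marked[s] = true;
        q.add(s);

        while (!q.isEmpty()) {
            int v = q.remove();
            for (int w : adj.get(v)) {
                if (!marked[w]) {
                    edgeTo[w] = v;
                    distTo[w] = distTo[v] + 1;
                    marked[w] = true;
                    q.add(w);
                }
            }
        }
        return new int[][] { distTo, edgeTo };
    }

    public static void main(String[] args) {
        // Vertex 5 has no edges, so it is unreachable from 0.
        List<List<Integer>> adj =
            buildGraph(6, new int[][] {{0, 1}, {0, 2}, {1, 3}, {2, 3}, {3, 4}});
        int[][] result = bfs(adj, 0);
        System.out.println(Arrays.toString(result[0]));  // [0, 1, 1, 2, 3, 2147483647]
        System.out.println(Arrays.toString(result[1]));  // [-1, 0, 0, 1, 3, -1]
    }
}
```

Vertex 5 keeps its initial distTo value, which is exactly why the array starts at Positive Infinity rather than 0.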
Consider some properties of the algorithm:
The queue only contains vertices that have been marked.
Once a vertex is marked, its distTo value records the shortest distance from s.
What about the vertices in the queue? Do they all record the same shortest distance distTo value?
1.3 Proof of correctness
Breadth First Search will locate a shortest path (there may be several paths with the same shortest length) and it will do so in time proportional to the sum (V+E).
1.3.1 Timing Considerations
While there is a nested loop, consider first that the outer while loop will execute as long as the queue is not empty. Since only unmarked vertices are added to the queue, and each is marked as it is added, this loop will never execute more than V times; it may stop much earlier if the graph is not connected.
What about the inner for loop? It iterates based on the degree of each v, that is, based on the number of edges incident to v. This loop is executed once for every vertex ultimately reachable from s, and each edge (u, w) appears in two adjacency lists (once for u and once for w), so the if !marked[w] statement will execute no more than 2E times; it may execute far fewer times if the graph is not connected.
Thus there will be no more than V enqueue/dequeue operations and the if statement will execute no more than 2E times.
Do not be confused and think that the time is V * 2E! Rather this is additive, so the performance is in time directly proportional to V+2E which we simplify as we have shown earlier as V + E since the multiplicative constant 2 doesn’t matter in the long run.
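One way to convince yourself of the additive bound is to count the operations directly. The sketch below is not part of the course code; the class BfsCost and its method names are made up for illustration. It counts dequeue operations and executions of the if test on a graph with V = 6 and E = 5.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class BfsCost {
    // Adjacency lists for an undirected graph with n vertices.
    static List<List<Integer>> graph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Runs BFS from s and returns { number of dequeues, number of if-tests }.
    static int[] countWork(List<List<Integer>> adj, int s) {
        boolean[] marked = new boolean[adj.size()];
        Queue<Integer> q = new ArrayDeque<>();
        int dequeues = 0;
        int ifTests = 0;

        marked[s] = true;
        q.add(s);
        while (!q.isEmpty()) {
            int v = q.remove();
            dequeues++;
            for (int w : adj.get(v)) {
                ifTests++;              // one test per adjacency-list entry of v
                if (!marked[w]) {
                    marked[w] = true;
                    q.add(w);
                }
            }
        }
        return new int[] { dequeues, ifTests };
    }

    public static void main(String[] args) {
        // V = 6, E = 5; vertex 5 is isolated and never enters the queue.
        List<List<Integer>> adj =
            graph(6, new int[][] {{0, 1}, {0, 2}, {1, 3}, {2, 3}, {3, 4}});
        int[] work = countWork(adj, 0);
        System.out.println(work[0]);  // prints 5, which is at most V = 6
        System.out.println(work[1]);  // prints 10, which is at most 2E = 10
    }
}
```

The total work here is 5 + 10 steps, nowhere near the 6 * 10 that a multiplicative reading of the nested loops would suggest.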
1.3.2 Correctness
How can we claim that the Breadth First Search computes a shortest path between the source vertex s and any vertex t in the graph?
We can proceed inductively by setting N to the number of marked vertices.
In the base case, N=1 and the source vertex, s, is marked and has been enqueued. distTo[s] is set to 0, which is the correct value for the shortest distance from s to s.
Assume that (by induction) we know that a graph with N marked vertices has the correct distTo value for each of these vertices. So, consider the problem of N+1 marked vertices from this earlier solution. The only way to add a newly marked vertex w is to encounter the edge (u, w) during the inner for loop. Now at this point, the previously unmarked vertex, w, is marked, and we can rely on the fact that distTo[u] is correct (based on our Inductive Assumption). Now, the shortest distance from s to w is the shortest distance from s to u plus 1 for the edge (u, w). Thus, once marked, distTo[w] is the length of the shortest path from s to w.
But, you might argue, what if you could reach a previously marked vertex in fewer steps from s?
Well, to defuse that argument, consider the state of the vertices that are contained in the queue. They are all marked. But what of the distTo values associated with each vertex in the queue? Observe that each vertex enqueued has a distTo value exactly one greater than the distTo value of the vertex that was just dequeued. If you think about this for a moment, you can recognize that the distTo values associated with the vertices in the queue appear (from left to right) in monotonically non-decreasing order.
That is, the queue may contain multiple vertices with the same associated distTo value, but at no point will a vertex with a lower associated distTo value appear to the left of a vertex with a higher distTo value.
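The ordering claim can be checked mechanically. The following sketch (the class QueueOrderCheck and its method names are invented here; it is a sanity check, not a proof) re-examines the queue after every enqueue and confirms that the distTo values run from head to tail in non-decreasing order, and that none exceeds the distTo value of the dequeued vertex by more than 1.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class QueueOrderCheck {
    // Adjacency lists for an undirected graph with n vertices.
    static List<List<Integer>> graph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // True when the distTo values, read head to tail, never decrease
    // and all lie between low and low + 1 inclusive.
    static boolean sortedWithinOne(Iterable<Integer> queue, int[] distTo, int low) {
        int previous = low;
        for (int x : queue) {
            if (distTo[x] < previous || distTo[x] > low + 1) return false;
            previous = distTo[x];
        }
        return true;
    }

    // Runs BFS from s; returns false as soon as the queue violates the ordering.
    static boolean queueAlwaysSorted(List<List<Integer>> adj, int s) {
        boolean[] marked = new boolean[adj.size()];
        int[] distTo = new int[adj.size()];
        Arrays.fill(distTo, Integer.MAX_VALUE);
        ArrayDeque<Integer> q = new ArrayDeque<>();  // iterates from head to tail

        distTo[s] = 0;
        marked[s] = true;
        q.add(s);
        while (!q.isEmpty()) {
            int v = q.remove();
            for (int w : adj.get(v)) {
                if (!marked[w]) {
                    distTo[w] = distTo[v] + 1;
                    marked[w] = true;
                    q.add(w);
                    if (!sortedWithinOne(q, distTo, distTo[v])) return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Integer>> adj = graph(7,
            new int[][] {{0, 1}, {0, 2}, {1, 3}, {2, 3}, {3, 4}, {2, 5}, {5, 6}, {4, 6}});
        System.out.println(queueAlwaysSorted(adj, 0));  // prints true
    }
}
```

Because vertices leave the queue in this order, a vertex is always first reached from a neighbor with the smallest possible distTo value, which is what rules out a later, shorter route.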
1.4 Recover Paths
Once the while loop in bfs has completed, all reachable vertices have been marked. Given any vertex that was marked, you can recover the path from the designated source vertex, s, by using the information in edgeTo. Consider the implementation of pathTo below:
public Iterable<Integer> pathTo(int v) {
    if (!marked[v]) { return null; }   // no path possible

    Stack<Integer> path = new Stack<Integer>();

    // build up a stack in reverse order. Stop once distTo[x] is 0.
    int x;
    for (x = v; distTo[x] != 0; x = edgeTo[x]) {
        path.push(x);
    }

    // Don’t forget the source vertex
    path.push(x);
    return path;
}
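Here is a self-contained sketch of the same idea built on java.util only. The class name BfsPaths and the graph helper are invented for this example. One assumption worth calling out: it uses ArrayDeque as the stack, because ArrayDeque iterates from the most recently pushed element to the oldest, so iterating over the result yields the path starting at s. java.util.Stack iterates in the opposite order and would give the path backwards.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class BfsPaths {
    private final boolean[] marked;
    private final int[] edgeTo;
    private final int[] distTo;

    // Adjacency lists for an undirected graph with n vertices.
    static List<List<Integer>> graph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Runs BFS from s and keeps the arrays needed to recover paths.
    BfsPaths(List<List<Integer>> adj, int s) {
        marked = new boolean[adj.size()];
        edgeTo = new int[adj.size()];
        distTo = new int[adj.size()];
        Arrays.fill(distTo, Integer.MAX_VALUE);

        Deque<Integer> q = new ArrayDeque<>();
        distTo[s] = 0;
        marked[s] = true;
        q.addLast(s);
        while (!q.isEmpty()) {
            int v = q.removeFirst();
            for (int w : adj.get(v)) {
                if (!marked[w]) {
                    edgeTo[w] = v;
                    distTo[w] = distTo[v] + 1;
                    marked[w] = true;
                    q.addLast(w);
                }
            }
        }
    }

    // Returns the vertices on a shortest path from s to v, or null if none exists.
    Deque<Integer> pathTo(int v) {
        if (!marked[v]) return null;
        Deque<Integer> path = new ArrayDeque<>();
        int x;
        for (x = v; distTo[x] != 0; x = edgeTo[x]) {
            path.push(x);      // push adds at the front, reversing the walk
        }
        path.push(x);          // the source vertex
        return path;
    }

    public static void main(String[] args) {
        List<List<Integer>> adj =
            graph(6, new int[][] {{0, 1}, {0, 2}, {1, 3}, {2, 3}, {3, 4}});
        BfsPaths paths = new BfsPaths(adj, 0);
        System.out.println(paths.pathTo(4));  // prints [0, 1, 3, 4]
        System.out.println(paths.pathTo(5));  // prints null
    }
}
```

The walk starts at the target and follows edgeTo back toward s, which is the "reverse order" mentioned earlier; the stack turns it around.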
1.5 In Class exercise
You will now perform a side-by-side comparison of DFS and BFS on a sample graph. Conduct both a BFS and a DFS on it from vertex 0, and stop each search once you mark vertex 7.
Along the way, you must draw a representation of the stack (for DFS) and queue (for BFS). Use the following graph:
1.6 Demonstration
I have prepared a number of search demonstrations using DFS, BFS and a third one I’ll introduce today called Guided Search.
The GuidedSearchAnimation attempts to improve the search process by selecting the marked vertex that is closest to the target point.
Note that the guidance here is flawed because distance between vertices is just part of the equation; you can only make progress by traversing edges. Nonetheless, this gives you a brief glimpse into more advanced search strategies that you will likely encounter in an AI class.
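The selection rule can be sketched in a few lines. This is not the code behind GuidedSearchAnimation; it is a guess at the general idea, with every name (GuidedSearchSketch, visitOrder, the pos array of x/y coordinates) invented for illustration. It assumes each vertex has a position in the plane, keeps the marked-but-unprocessed vertices in a priority queue ordered by straight-line distance to the target, and stops when the target is removed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class GuidedSearchSketch {
    // Adjacency lists for an undirected graph with n vertices.
    static List<List<Integer>> graph(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Straight-line distance between two points given as {x, y}.
    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    // Returns the order in which vertices are processed when searching from s to t.
    static List<Integer> visitOrder(List<List<Integer>> adj, double[][] pos, int s, int t) {
        boolean[] marked = new boolean[adj.size()];
        // The marked vertex closest to the target is always processed next.
        PriorityQueue<Integer> frontier = new PriorityQueue<>(
            Comparator.comparingDouble((Integer v) -> distance(pos[v], pos[t])));
        List<Integer> order = new ArrayList<>();

        marked[s] = true;
        frontier.add(s);
        while (!frontier.isEmpty()) {
            int v = frontier.remove();
            order.add(v);
            if (v == t) break;          // assumption: stop once the target is reached
            for (int w : adj.get(v)) {
                if (!marked[w]) {
                    marked[w] = true;
                    frontier.add(w);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Two branches leave vertex 0; only the branch through 1 heads toward target 3.
        List<List<Integer>> adj = graph(5, new int[][] {{0, 1}, {0, 2}, {1, 3}, {2, 4}});
        double[][] pos = {{0, 0}, {1, 0}, {0, 1}, {2, 0}, {0, 2}};
        System.out.println(visitOrder(adj, pos, 0, 3));  // prints [0, 1, 3]
    }
}
```

A BFS on the same graph would dequeue vertex 2 before it ever reaches vertex 3; the guided version skips it. The flaw mentioned above still applies: a vertex that is close to the target in the plane may sit on a dead end, and unlike bfs this rule makes no promise about the length of the path it finds.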
1.7 Version : 2023/04/28
(c) 2023, George T. Heineman