Lab 5 (Advanced): Infinite Trees
Note: This lab is longer than you will finish in 50 minutes. I’ve put all the exercises here so that you get a comprehensive overview of the problem, should you need to come back to a problem like this in the future. Just get as far as you can. You’ll get a good overview of the topic by finishing through section 4.
1 Motivation
Imagine that you wanted to capture the possible changes of position that a player in a game or robot in a simulation environment could take over time. A tree or graph would be a good choice: each node could capture a player or robot’s current location (and some other attributes which we will ignore here); each child or edge captures a potential next location. The following tree corresponds to a simple grid-based game in which a player can only move between adjacent squares (ie, forward, backward, left, right), and each square is represented by a pair of numbers reflecting its coordinates.
------- <4,8>------ |
/ / \ \ |
<4,9> <4,7> <3,8> <5,8> |
/ / \ \ |
----- / \ ------- |
/ / \ \ |
<3,9> <3,7> <2,8> <4,8> |
Imagine that you wanted to be able to plan future moves by searching in this tree. Can you build the entire tree up front and then search in it? No! The set of possible positions is at best large, and at worst (effectively) infinite. Instead, we need to create the tree in a demand-driven fashion: we create each node when someone needs it, remembering just enough information to let us create more nodes as necessary.
This lab teaches you how to create and program over trees with infinitely many nodes.
2 Modeling Infinite Trees
To simplify the problem, let’s work with binary trees with an integer at each node. The left child of each node is the node with data value minus 1; the right child is the node with data value plus 1. In other words:
0 |
/ \ |
-1 1 |
/ \ ... |
-2 0 |
... ... |
Let’s start with a basic definition of binary trees and see how well it fits our new problem (code in a starter file):
interface INode {} |
|
class Leaf implements INode { |
Leaf(){} |
} |
|
class Node implements INode { |
int data; |
INode left; |
INode right; |
|
|
Node (int data, INode left, INode right) { |
super(data); |
this.left = left; |
this.right = right; |
} |
} |
|
class Examples { |
INode myTree = new Node(4, |
new Node(3, new Leaf(), new Leaf()), |
new Leaf()); |
} |
Following conventional CS terminology, we will call unbuilt nodes "promises":
class NodePromise implements INode { |
int data; |
|
NodePromise(int data) { |
this.data = data; |
} |
} |
class Examples { |
INode myTree = new Node(4, |
new NodePromise(3), |
new NodePromise(5)); |
} |
Exercise: Edit the starter binary tree implementation to use NodePromise instead of Leaf. Make sure the example given above compiles and runs.
To demand that a node be created, we "force" the promise of a node to be carried out ("force" is again the conventional CS term here). In particular, our NodePromise class will have a force method that builds-out one more node of the tree:
private INode force() { |
return new Node(this.data, |
new NodePromise(this.data - 1), |
new NodePromise(this.data + 1)); |
} |
Exercise: Add the force method to your code.
Exercise: Create a new NodePromise with 0 as its data. Use calls to force to get a tree with complete nodes (not promises) for numbers 0, -1, 1, and 2.
Exercise: What should a size method compute on an infinite tree? Don’t write any code for now (that’s coming) – just take a moment to think about what output you would expect if you computed the size of a tree with NodePromise nodes.
3 When to Force Node Creation
Now that we have force, we need to decide when to use it. Our goal is to construct the tree "on demand", as a user requests access to additional nodes. However, users should not have to call force. Instead, the user should have the illusion of having an infinite tree, leaving it up to our code to force creation of nodes when needed to maintain the illusion.
Notice that Node has fields for left and right, while NodePromise does not. If a user has a variable of type INode and wants the left subtree of that node, the user cannot simply write mynode.left. Instead, the INode interface needs methods getLeft() and getRight() that will get the corresponding subtree. For Node objects, these methods just return the corresponding (existing) node. For NodePromise objects, these methods force creation of the new nodes:
interface INode { |
INode getLeft(); |
INode getRight(); |
} |
|
class Node extends AbsNode { |
private INode left; |
private INode right; |
|
// constructor omitted |
|
public INode getLeft() { return this.left; } |
public INode getRight() { return this.right; } |
} |
|
class NodePromise extends AbsNode { |
// constructor omitted |
|
private INode force() { |
return new Node(this.data, |
new NodePromise(this.data - 1), |
new NodePromise(this.data + 1)); |
} |
|
public INode getLeft() { return this.force().getLeft(); } |
public INode getRight() { return this.force().getRight(); } |
} |
Exercise: Add a getData method to nodes that returns the data at each node. To convince yourself that getLeft and getRight hide details from a user, experiment with the following expressions:
> NodePromise root = new NodePromise(0); |
> root.getLeft().getData() |
-1 |
> root.getLeft().getLeft().getData() |
-2 |
> root.getLeft().getRight().getLeft().getData() |
-1 |
> root.getLeft().getRight().getRight().getData() |
1 |
Exercise: Just from the behavior of these examples (without looking at the code), how do you know that your force method isn’t accidentally creating the entire infinite tree?
4 Computing the Size
Here is a proposed method for computing the size of an infinite tree:
abstract class AbsNode implements INode { |
int data; |
|
// constructor omitted |
|
public int getData() { return this.data; } |
|
public int size() { |
return 1 + getLeft().size() + getRight().size(); |
} |
} |
Exercise: Use this method to compute the size of one of your infinite trees. What happens and why?
Exercise: Add a size method to your infinite trees that actually terminates. What did you need to do to make that happen? Can you be sure that your program will terminate on all inputs? Write down a precise justification for why your program will always terminate.
5 Looking Ahead in the Tree
Exercise: One common operation on game trees is to look ahead a few steps to help determine which of several moves to take (assuming you had more data than just a number in each node). Write a lookAhead function on INode that takes an integer indicating how many levels of concrete tree are required, and returns a list of the possible nodes at that depth. (A real game program would then perform some computation over that list of nodes to select which move to make next.)
To get you started, here are the interface and examples:
interface INode { |
int getData(); |
INode getLeft(); |
INode getRight(); |
IList<INode> lookAhead (int depth); |
int size(); |
} |
Exercise: Where does the lookAhead method body differ between Node and NodePromise? Be sure you have abstracted over any similarities to share common code between the two classes.
Exercise: Write a series of checkExpect tests that combine lookAhead and size to confirm that lookAhead is behaving properly.
6 From Trees to Graphs through Memoization
We noted earlier that our "tree" is really a graph, in that there are multiple ways to get from the root of the tree to a node with a given value. Each node we create with a specific number as its data has the same children. Shouldn’t we save resources and use only one copy of the node every time it arises in the tree?
The way to do this is through a technique called memoization. Every time a memoized function is called, the function stores the input it was given along with the output it has computed. The next time the same function is called with the same input(s), the function simply returns the previously-computed answer rather than compute it afresh.
Hashtables are a good data structure to use in memoization. For creating infinite trees, we want a hashtable in which the keys are the data (numbers) in nodes and the values are the corresponding Node (not NodePromise) objects. The force method uses the hashtable: if a forced node already exists in the hashtable, force returns it; otherwise, force creates the new Node and stores it in the hashtable before returning it.
Exercise: Add a hashtable as described above, so that there is only one unique Node for each data number. Part of the challenge is figuring out where to put the hashtable (you need just one for the whole data structure). If you can’t figure that out, work ahead to the next section then come back and finish this up.
Exercise: Figure out how to test that your code really does produce only one unique Node per number.
7 Encapsulating Infinite Trees
When we discussed encapsulation, we talked about the importance of providing operations that access a data type without revealing the internals of that data structure. In that spirit, we want to create a class called GameTree which contains a field of type INode (for the current tree).
The constructor for GameTree should take the number to use at the root of the tree. The constructor should set its INode field to a new NodePromise with the given number as its value.
Exercise: Encapsulate infinite trees in a GameTree class as described here.
Exercise: Provide a second constructor for GameTree that takes no arguments, and initializes the INode field to a NodePromise with value 0. Having multiple constructors can provide flexibility when different contexts would use your program in different ways.
8 Summary
What you’ve seen here is a powerful programming concept known as laziness. In lazy computation, we compute only what is needed, extending the data (using force) if more cases are required. If you think about it, force is a rather different function than ones you have seen before. Most functions simply provide names for computations. When we give them arguments, they substitute those arguments into the function body and return a result. In contrast, force exists to delay a computation (namely, the construction of the tree). It doesn’t take arguments. Those of you who came through 1102 saw this concept before through lambda. The core idea is important in computer science overall, however: functions do not always abstract over behavior; they also can set up a computation to be performed later.
The main idea in this lab is that we can support infinite data by defining classes and methods to build more data lazily, as it is needed. This is a powerful tool that applies to many programming contexts.