1 Motivation
2 Modeling Infinite Trees
3 When to Force Node Creation
4 Computing the Size
5 Looking Ahead in the Tree
6 From Trees to Graphs through Memoization
7 Encapsulating Infinite Trees
8 Summary

Lab 5 (Advanced): Infinite Trees

Kathi Fisler

Note: This lab is longer than you will finish in 50 minutes. I’ve put all the exercises here so that you get a comprehensive overview of the problem, should you need to come back to a problem like this in the future. Just get as far as you can. You’ll get a good overview of the topic by finishing through section 4.

1 Motivation

Imagine that you wanted to capture the possible changes of position that a player in a game or robot in a simulation environment could take over time. A tree or graph would be a good choice: each node could capture a player or robot’s current location (and some other attributes which we will ignore here); each child or edge captures a potential next location. The following tree corresponds to a simple grid-based game in which a player can only move between adjacent squares (ie, forward, backward, left, right), and each square is represented by a pair of numbers reflecting its coordinates.

     ------- <4,8>------

    /        /   \      \

<4,9>    <4,7>   <3,8>  <5,8>

                / / \ \

           ----- /   \ -------

          /     /     \      \

      <3,9>   <3,7>  <2,8>   <4,8>

(Ignore that we have <4,8> twice in this picture for now).

Imagine that you wanted to be able to plan future moves by searching in this tree. Can you build the entire tree up front and then search in it? No! The set of possible positions is at best large, and at worst (effectively) infinite. Instead, we need to create the tree in a demand-driven fashion: we create each node when someone needs it, remembering just enough information to let us create more nodes as necessary.

This lab teaches you how to create and program over trees with infinitely many nodes.

2 Modeling Infinite Trees

To simplify the problem, let’s work with binary trees with an integer at each node. The left child of each node is the node with data value minus 1; the right child is the node with data value plus 1. In other words:

       0

      / \

    -1   1

   /  \  ...

 -2    0

...    ...

Let’s start with a basic definition of binary trees and see how well it fits our new problem (code in a starter file):

  interface INode {}

  

  class Leaf implements INode {

    Leaf(){}

  }

  

  class Node implements INode {

    int data;

    INode left;

    INode right;

  

  

    Node (int data, INode left, INode right) {

      super(data);

      this.left = left;

      this.right = right;

    }

  }

  

  class Examples {

    INode myTree = new Node(4,

                            new Node(3, new Leaf(), new Leaf()),

                            new Leaf());

  }

This isn’t a great representation, because a Leaf denotes the end of the tree. In our desired problem, the tree never ends, it just hasn’t necessarily been built yet. There is also no notion here that a Leaf has data, but not (yet) children. So this shape doesn’t capture the essence of our problem. Instead, we need classes that capture the "unfinished" nature of some nodes. We will omit leaves entirely.

Following conventional CS terminology, we will call unbuilt nodes "promises":

  class NodePromise implements INode {

    int data;

  

    NodePromise(int data) {

      this.data = data;

    }

  }

Our prior example would be written as following using NodePromise:

  class Examples {

    INode myTree = new Node(4,

                            new NodePromise(3),

                            new NodePromise(5));

  }

That is, the tree has one completed node (with data 4), and two "pending" nodes (for 3 and 5) that we might create on demand.

Exercise: Edit the starter binary tree implementation to use NodePromise instead of Leaf. Make sure the example given above compiles and runs.

To demand that a node be created, we "force" the promise of a node to be carried out ("force" is again the conventional CS term here). In particular, our NodePromise class will have a force method that builds-out one more node of the tree:

  private INode force() {

    return new Node(this.data,

                    new NodePromise(this.data - 1),

                    new NodePromise(this.data + 1));

  }

So if we were to run myTree.left.force(), we would get a Node with data 3 and NodePromise children for values 2 and 4.

Exercise: Add the force method to your code.

Exercise: Create a new NodePromise with 0 as its data. Use calls to force to get a tree with complete nodes (not promises) for numbers 0, -1, 1, and 2.

Exercise: What should a size method compute on an infinite tree? Don’t write any code for now (that’s coming) – just take a moment to think about what output you would expect if you computed the size of a tree with NodePromise nodes.

3 When to Force Node Creation

Now that we have force, we need to decide when to use it. Our goal is to construct the tree "on demand", as a user requests access to additional nodes. However, users should not have to call force. Instead, the user should have the illusion of having an infinite tree, leaving it up to our code to force creation of nodes when needed to maintain the illusion.

Notice that Node has fields for left and right, while NodePromise does not. If a user has a variable of type INode and wants the left subtree of that node, the user cannot simply write mynode.left. Instead, the INode interface needs methods getLeft() and getRight() that will get the corresponding subtree. For Node objects, these methods just return the corresponding (existing) node. For NodePromise objects, these methods force creation of the new nodes:

  interface INode {

    INode getLeft();

    INode getRight();

  }

  

  class Node extends AbsNode {

    private INode left;

    private INode right;

  

    // constructor omitted

  

    public INode getLeft() { return this.left; }

    public INode getRight() { return this.right; }

  }

  

  class NodePromise extends AbsNode {

    // constructor omitted

  

    private INode force() {

      return new Node(this.data,

                      new NodePromise(this.data - 1),

                      new NodePromise(this.data + 1));

    }

  

    public INode getLeft() { return this.force().getLeft(); }

    public INode getRight() { return this.force().getRight(); }

  }

Exercise: Add a getData method to nodes that returns the data at each node. To convince yourself that getLeft and getRight hide details from a user, experiment with the following expressions:

> NodePromise root = new NodePromise(0);

> root.getLeft().getData()

-1

> root.getLeft().getLeft().getData()

-2

> root.getLeft().getRight().getLeft().getData()

-1

> root.getLeft().getRight().getRight().getData()

1

Exercise: Just from the behavior of these examples (without looking at the code), how do you know that your force method isn’t accidentally creating the entire infinite tree?

4 Computing the Size

Here is a proposed method for computing the size of an infinite tree:

  abstract class AbsNode implements INode {

    int data;

  

    // constructor omitted

  

    public int getData() { return this.data; }

  

    public int size() {

      return 1 + getLeft().size() + getRight().size();

    }

  }

Exercise: Use this method to compute the size of one of your infinite trees. What happens and why?

Exercise: Add a size method to your infinite trees that actually terminates. What did you need to do to make that happen? Can you be sure that your program will terminate on all inputs? Write down a precise justification for why your program will always terminate.

5 Looking Ahead in the Tree

Exercise: One common operation on game trees is to look ahead a few steps to help determine which of several moves to take (assuming you had more data than just a number in each node). Write a lookAhead function on INode that takes an integer indicating how many levels of concrete tree are required, and returns a list of the possible nodes at that depth. (A real game program would then perform some computation over that list of nodes to select which move to make next.)

To get you started, here are the interface and examples:

  interface INode {

    int getData();

    INode getLeft();

    INode getRight();

    IList<INode> lookAhead (int depth);

    int size();

  }

Calling lookAhead on a node at depth 0 should return a list containing only the node (irregardless of whether that node is a NodePromise). At any depth larger than 0, lookAhead will simply append the results of calling lookAhead on each of its subtrees at depth-1. Use Java’s built-in lists for the result (the Java method corresponding to append is called addAll–it takes a list as an argument and adds all elements in that list to the list object on which you called addAll.

Exercise: Where does the lookAhead method body differ between Node and NodePromise? Be sure you have abstracted over any similarities to share common code between the two classes.

Exercise: Write a series of checkExpect tests that combine lookAhead and size to confirm that lookAhead is behaving properly.

6 From Trees to Graphs through Memoization

We noted earlier that our "tree" is really a graph, in that there are multiple ways to get from the root of the tree to a node with a given value. Each node we create with a specific number as its data has the same children. Shouldn’t we save resources and use only one copy of the node every time it arises in the tree?

The way to do this is through a technique called memoization. Every time a memoized function is called, the function stores the input it was given along with the output it has computed. The next time the same function is called with the same input(s), the function simply returns the previously-computed answer rather than compute it afresh.

Hashtables are a good data structure to use in memoization. For creating infinite trees, we want a hashtable in which the keys are the data (numbers) in nodes and the values are the corresponding Node (not NodePromise) objects. The force method uses the hashtable: if a forced node already exists in the hashtable, force returns it; otherwise, force creates the new Node and stores it in the hashtable before returning it.

Exercise: Add a hashtable as described above, so that there is only one unique Node for each data number. Part of the challenge is figuring out where to put the hashtable (you need just one for the whole data structure). If you can’t figure that out, work ahead to the next section then come back and finish this up.

Exercise: Figure out how to test that your code really does produce only one unique Node per number.

7 Encapsulating Infinite Trees

When we discussed encapsulation, we talked about the importance of providing operations that access a data type without revealing the internals of that data structure. In that spirit, we want to create a class called GameTree which contains a field of type INode (for the current tree).

The constructor for GameTree should take the number to use at the root of the tree. The constructor should set its INode field to a new NodePromise with the given number as its value.

Exercise: Encapsulate infinite trees in a GameTree class as described here.

Exercise: Provide a second constructor for GameTree that takes no arguments, and initializes the INode field to a NodePromise with value 0. Having multiple constructors can provide flexibility when different contexts would use your program in different ways.

8 Summary

What you’ve seen here is a powerful programming concept known as laziness. In lazy computation, we compute only what is needed, extending the data (using force) if more cases are required. If you think about it, force is a rather different function than ones you have seen before. Most functions simply provide names for computations. When we give them arguments, they substitute those arguments into the function body and return a result. In contrast, force exists to delay a computation (namely, the construction of the tree). It doesn’t take arguments. Those of you who came through 1102 saw this concept before through lambda. The core idea is important in computer science overall, however: functions do not always abstract over behavior; they also can set up a computation to be performed later.

The main idea in this lab is that we can support infinite data by defining classes and methods to build more data lazily, as it is needed. This is a powerful tool that applies to many programming contexts.