Removing Elements from BSTs
1 Implementing remElt with BSTs
2 Rewriting Code to Eliminate instanceof
3 Casting: Making the Types Work Out
4 Other Notables in the Java BST Implementation

These optional notes show how to implement removal of elements from BSTs in Java. There are two versions here: one that uses some constructs we haven’t covered yet (and that are a bit of a cheat, since using them here isn’t pure OO programming), and one that is a bit more complex but uses pure OO programming. The pure-OO one may be confusing if you are still getting the hang of Java.

None of this material will be tested on exams. This is purely for your interest/education.

1 Implementing remElt with BSTs

First, let’s turn the general description of the remElt algorithm into Java code. For simplicity as we look at the subtleties of Java, we will always grab the largest element in the left child when we need to remove the root of a tree with two populated subtrees. The parts that raise interesting Java points are written in all capital letters between angle brackets (these are not valid Java code).

  public IBST remElt (int elt) {
    if (elt == this.data) {
      if <BOTH CHILDREN ARE MtBSTs> {
        return new MtBST();
      } else if <LEFT IS AN MtBST> {
        return this.right;
      } else if <RIGHT IS AN MtBST> {
        return this.left;
      } else { // both children are DataBSTs
        return new DataBST(this.left.largestElt(),
                           this.left.remElt(this.left.largestElt()),
                           this.right);
      }
    } else if (elt < this.data) {
      return new DataBST(this.data,
                         this.left.remElt(elt),
                         this.right);
    } else { // elt > this.data
      return new DataBST(this.data,
                         this.left,
                         this.right.remElt(elt));
    }
  }

Before you go on: make sure you see that this code implements the BST remElt algorithm. You should be able to articulate why this code preserves the BST invariant.

Now we need to capture the all-caps test questions in Java. To write these tests, we need a way to determine whether each child tree is an MtBST or a DataBST. Understanding how to do this properly is the point of this section of the presentation.

If you have had Java before, you may have been taught that you can check whether an object was created from a given class using an operator called instanceof. Using instanceof, we would fill in the holes as follows:

  if (this.left instanceof MtBST && this.right instanceof MtBST) {
    return new MtBST();
  } else if (this.left instanceof MtBST) {
    return this.right;
  } else if (this.right instanceof MtBST) {
    return this.left;
  } else { ... }

Back when we showed how to migrate Racket programs over mixed data to Java, however, we discussed that good OO programs should not check the type of objects explicitly. Remember that one of the key points of OO languages is that they handle finding the right method based on the type of an object automatically (this is called dispatch). So while instanceof works here, it isn’t a proper solution in an OO language.
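To make the idea of dispatch concrete, here is a minimal sketch. The names ITree, Leaf, Node, and describe are hypothetical and are not part of the BST code in these notes; the point is only that the caller never asks what class an object came from.

```java
// Hypothetical sketch: ITree, Leaf, Node, and describe are illustrative
// names, not part of the course's BST implementation.
interface ITree {
  String describe();
}

class Leaf implements ITree {
  public String describe() { return "empty tree"; }
}

class Node implements ITree {
  int data;

  Node(int data) { this.data = data; }

  public String describe() { return "tree with root " + this.data; }
}

class DispatchDemo {
  public static void main(String[] args) {
    ITree[] trees = { new Leaf(), new Node(5) };
    for (ITree t : trees) {
      // No instanceof here: Java selects the describe() implementation
      // that matches the class of the object t refers to at run time.
      System.out.println(t.describe());
    }
  }
}
```

Each class supplies its own answer, so the type test that instanceof would perform is done implicitly by the method call.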

2 Rewriting Code to Eliminate instanceof

A proper OO solution requires that we capture the effect of the instanceof uses in methods; these methods will have different implementations in the MtBST class and the DataBST class that together achieve the effect of the original instanceof tests. Our goal then is to design a method that can dispatch on the children to perform the appropriate computation.

To help with that, let’s reorganize the conditional tests around the types of the children. We start with a conditional based on the type of the left child:

  if (this.left instanceof MtBST) {
    if (this.right instanceof MtBST) {
       return new MtBST();
    } else {
       return this.right;
    }
  } else { // left is a DataBST
    if (this.right instanceof MtBST) {
       return this.left;
    } else { ... }
  }

Convince yourself that this version is indeed equivalent to the first version we sketched out. Now, note that in the case that the left is an MtBST, we return the right child in either case. So we can further simplify this to:

  if (this.left instanceof MtBST) {
    return this.right;
  } else { // left is a DataBST
    if (this.right instanceof MtBST) {
       return this.left;
    } else { ... }
  }
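One way to check the equivalence claimed above is to run the original flat conditional and the simplified nested one on all four combinations of empty and non-empty children. The sketch below does that with hypothetical stand-in classes (Tree, Mt, Data); the "merged" label stands in for the elided two-children case. Note that when both children are empty, the flat version builds a fresh empty tree while the simplified one returns the (empty) right child, so the comparison is on the kind of tree returned rather than on object identity.

```java
// Hypothetical stand-ins: Tree, Mt, and Data are illustrative stubs,
// not the MtBST and DataBST classes from these notes.
interface Tree {}

class Mt implements Tree {
  public String toString() { return "Mt"; }
}

class Data implements Tree {
  String label;

  Data(String label) { this.label = label; }

  public String toString() { return "Data(" + this.label + ")"; }
}

class EquivalenceCheck {
  // The original flat conditional.
  static Tree flat(Tree left, Tree right) {
    if (left instanceof Mt && right instanceof Mt) {
      return new Mt();
    } else if (left instanceof Mt) {
      return right;
    } else if (right instanceof Mt) {
      return left;
    } else {
      return new Data("merged"); // placeholder for the elided case
    }
  }

  // The simplified conditional, organized around the left child.
  static Tree simplified(Tree left, Tree right) {
    if (left instanceof Mt) {
      return right;
    } else {
      if (right instanceof Mt) {
        return left;
      } else {
        return new Data("merged"); // placeholder for the elided case
      }
    }
  }

  // True when both versions describe the same result in all four cases.
  static boolean agreeOnAllCases() {
    Tree[] lefts = { new Mt(), new Data("L") };
    Tree[] rights = { new Mt(), new Data("R") };
    for (Tree l : lefts) {
      for (Tree r : rights) {
        if (!flat(l, r).toString().equals(simplified(l, r).toString())) {
          return false;
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(agreeOnAllCases());
  }
}
```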

Next, we turn this into a method on IBST that we will call on the left child: the answer for the if will be the body of the method in the MtBST class, and the answer for the else will be the body of the method in the DataBST class (just as we did when writing methods on animals in week 1). Let’s call the method remParent:

  // goes into the MtBST class
  IBST remParent(IBST rightSibling) {
    return rightSibling;
  }

  // goes into the DataBST class.  "this" is the left sibling
  IBST remParent(IBST rightSibling) {
    if (rightSibling instanceof MtBST) {
       return this;
    } else { ... }
  }

We would call this method from within remElt in the DataBST class, as follows:

  // remElt in the DataBST class
  public IBST remElt (int elt) {
    if (elt == this.data)
      return this.left.remParent(this.right);
    else if (elt < this.data)
      ... //code is the same after here
  }

We still have a use of instanceof in the body of remParent, but we can use the same technique. The conditional already branches on the type of a single object, so we simply introduce a new method to handle the dispatch. We will call the new method mergeToRemoveParent. It will be called on the right sibling, taking the left sibling as an argument:

  // goes into the MtBST class
  // "this" is the right sibling; leftSibling is a DataBST
  IBST mergeToRemoveParent(IBST leftSibling) {
    return leftSibling;
  }

  // goes into the DataBST class.
  // "this" is the right sibling; leftSibling is a DataBST
  IBST mergeToRemoveParent(IBST leftSibling) {
    // this is where we choose largest-in-left or smallest-in-right,
    //   branching accordingly.  Only showing largest-in-left here
    int newRoot = leftSibling.largestElt();
    return new DataBST(newRoot,
                       leftSibling.remElt(newRoot),
                       this);
  }

In the implementation of mergeToRemoveParent for the DataBST class, we have filled in the ellipses that we have carried through the example, refining the terms to match the variable names in the method.

To make this code compile, we also need to add both remParent and mergeToRemoveParent to the IBST interface, and accordingly mark all implementations of these methods as public. The full solution shows all of these details.
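Putting the pieces together, the following is a minimal sketch of an instanceof-free implementation, not the full solution referred to above. It rests on several assumptions that are not in these notes: IBST declares remElt itself with return type IBST (which sidesteps the typing issue discussed in the next section), largestElt on an empty tree throws an exception, largestOr is a hypothetical helper that lets largestElt avoid instanceof as well, and RemEltDemo with its sample tree exists only for illustration.

```java
// Sketch under stated assumptions, not the course's full solution.
// Assumption: IBST declares remElt itself, so it returns IBST directly.
interface IBST {
  IBST remElt(int elt);
  boolean hasElt(int elt);
  int size();
  int largestElt();
  int largestOr(int fallback); // hypothetical helper, not in the notes
  IBST remParent(IBST rightSibling);
  IBST mergeToRemoveParent(IBST leftSibling);
}

class MtBST implements IBST {
  public IBST remElt(int elt) { return this; }
  public boolean hasElt(int elt) { return false; }
  public int size() { return 0; }
  public int largestElt() {
    // Assumption: an empty tree has no largest element, so this is an error.
    throw new IllegalStateException("empty tree has no largest element");
  }
  public int largestOr(int fallback) { return fallback; }
  // "this" is the left sibling and is empty: the right sibling is the answer.
  public IBST remParent(IBST rightSibling) { return rightSibling; }
  // "this" is the right sibling and is empty: the left sibling is the answer.
  public IBST mergeToRemoveParent(IBST leftSibling) { return leftSibling; }
}

class DataBST implements IBST {
  int data;
  IBST left;
  IBST right;

  DataBST(int data, IBST left, IBST right) {
    this.data = data;
    this.left = left;
    this.right = right;
  }

  public IBST remElt(int elt) {
    if (elt == this.data) {
      return this.left.remParent(this.right);
    } else if (elt < this.data) {
      return new DataBST(this.data, this.left.remElt(elt), this.right);
    } else { // elt > this.data
      return new DataBST(this.data, this.left, this.right.remElt(elt));
    }
  }

  public boolean hasElt(int elt) {
    if (elt == this.data) {
      return true;
    } else if (elt < this.data) {
      return this.left.hasElt(elt);
    } else {
      return this.right.hasElt(elt);
    }
  }

  public int size() { return 1 + this.left.size() + this.right.size(); }

  // The largest element sits at the end of the chain of right children.
  public int largestElt() { return this.right.largestOr(this.data); }
  public int largestOr(int fallback) { return this.largestElt(); }

  // "this" is a non-empty left sibling: let the right sibling decide.
  public IBST remParent(IBST rightSibling) {
    return rightSibling.mergeToRemoveParent(this);
  }

  // "this" is the right sibling; both siblings are non-empty.
  public IBST mergeToRemoveParent(IBST leftSibling) {
    int newRoot = leftSibling.largestElt();
    return new DataBST(newRoot, leftSibling.remElt(newRoot), this);
  }
}

class RemEltDemo {
  static IBST leaf(int n) { return new DataBST(n, new MtBST(), new MtBST()); }

  // Root 5; left child 3 (with children 1 and 4); right child 8.
  static IBST sample() {
    return new DataBST(5, new DataBST(3, leaf(1), leaf(4)), leaf(8));
  }

  public static void main(String[] args) {
    IBST t = sample().remElt(5); // root with two populated subtrees
    System.out.println(t.hasElt(5) + " " + t.size()); // prints "false 4"
  }
}
```

Removing 5 from the sample tree exercises both dispatch steps: remParent runs on the non-empty left child, which hands off to mergeToRemoveParent on the non-empty right child, and 4 (the largest element on the left) becomes the new root.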

NOTE: There are other approaches you might take to eliminating instanceof in this code. If you want to explore them, work on the advanced option in lab this week.

3 Casting: Making the Types Work Out

Unfortunately, the code as we have it still won’t compile due to one last subtle issue. Remember that we are using BSTs to implement sets. We now have the following interfaces for Iset and IBST, and the following concrete types in DataBST:

  interface Iset {
    Iset addElt (int elt);
    Iset remElt (int elt);
    int size ();
    boolean hasElt (int elt);
  }

  interface IBST extends Iset {
    int largestElt();
    IBST remParent(IBST sibling);
    IBST mergeToRemoveParent(IBST sibling);
  }

  class DataBST implements IBST  {
    ...
    DataBST(int data, IBST left, IBST right) {
      this.data = data;
      this.left = left;
      this.right = right;
    }
  }

Now, look closely at the types of objects we are passing to the DataBST constructor within remElt in the DataBST class:

  public IBST remElt (int elt) {
    if (elt == this.data)
      return this.left.remParent(this.right);
    else if (elt < this.data)
      return new DataBST(this.data,
                         this.left.remElt(elt),
                         this.right);
    ...
  }

The second argument to DataBST here is the result of remElt. The interfaces indicate that remElt returns an object of type Iset. But the DataBST constructor expects the second input to be of type IBST. The Java compiler will reject this code on a type mismatch.

But wait – we know that we are implementing Iset through IBST in this program. The actual remElt method we are calling returns an IBST. Aren’t we then guaranteed that the types are fine when we run the code?

Yes, we are. However, the Java type system cannot confirm this automatically (designing type systems in the presence of inheritance is very tricky, precisely for cases such as this). The Java compiler has no choice but to reject this code. The Java language, then, needs to provide programmers with a way to take the responsibility for this code executing properly in practice.

Java programmers do this by claiming what type the result of remElt will have at run time. This claim is called a cast. It is written as follows:

  public IBST remElt (int elt) {
    if (elt == this.data)
      return this.left.remParent(this.right);
    else if (elt < this.data)
      return new DataBST(this.data,
                         (IBST) this.left.remElt(elt),
                         this.right);
    ...
  }

The (IBST) before the call to remElt tells the compiler "assume this object is an IBST when you compile". The run-time system, in turn, checks this claim when the program is actually running. If the actual object does not implement IBST, the cast fails with a ClassCastException as the program runs.

Once you have hierarchies of classes and interfaces, casts are sometimes necessary to make code compile. They can slightly hurt the performance of running programs (since the types are checked at run time rather than at compile time). As a Java programmer, you should use a cast only when you are confident that the objects you are casting will actually be of the indicated type.
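The sketch below illustrates both outcomes of a cast: one that holds at run time and one that does not. The interfaces are cut down to the minimum needed, and OtherSet, CastDemo, and castSucceeds are hypothetical names introduced only for this example.

```java
// Cut-down versions of the interfaces from this section.
interface Iset {
  Iset remElt(int elt);
}

interface IBST extends Iset {
  int largestElt();
}

class MtBST implements IBST {
  public Iset remElt(int elt) { return this; }
  public int largestElt() {
    // Assumption: an empty tree has no largest element.
    throw new IllegalStateException("empty tree has no largest element");
  }
}

// Hypothetical: a set implementation that is not a BST.
class OtherSet implements Iset {
  public Iset remElt(int elt) { return this; }
}

class CastDemo {
  // Reports whether the claim "s is an IBST" holds at run time.
  static boolean castSucceeds(Iset s) {
    try {
      IBST asBst = (IBST) s; // accepted by the compiler, checked at run time
      return asBst != null;
    } catch (ClassCastException e) {
      return false;          // the claim was wrong for this object
    }
  }

  public static void main(String[] args) {
    // remElt is declared to return Iset in both cases, so the compiler
    // cannot tell these two calls apart; only the run-time check can.
    System.out.println(castSucceeds(new MtBST().remElt(1)));    // true
    System.out.println(castSucceeds(new OtherSet().remElt(1))); // false
  }
}
```

Both calls have the same static type, which is why the compiler has to take the programmer's word for the cast and defer the actual check to run time.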

4 Other Notables in the Java BST Implementation

Several other details are embedded in the full BST implementation. In particular: