Interfaces and Abstract Datatypes

Kathi Fisler

We’ve covered the fundamental parts of Java programs: classes, abstract classes, and interfaces. Classes capture new concrete kinds of data. Abstract classes share fields, methods, and helper functions that are common to classes. Interfaces define types based on required methods, but omit the details of fields and how the methods are implemented.

So far, we have seen one use for interfaces: to create types for data that have choices, or variants (such as many specific animals, or different options of family tree). Interfaces have many other uses, though. The general idea of summarizing what something does (specification) rather than how it does it (implementation) is one of the most fundamental concepts in all of engineering. It underlies good programming in every language, as well as good software design at a large scale. Today, we consider another programming problem that interfaces help address.

1 Three Motivating Problems

Consider the following problems:

Store all of the URLs visited from a web browser, with the ability to check whether a specific URL has been visited.
Gather all the words that someone generated during a word game (such as "write down all words can you make from the letters d e h l l l o o r w"), with the ability to add words, find out how many words the player generated, and get all of the words for purposes of scoring.
Maintain information on who is coming to a dinner party. In addition to adding and removing names, you need to be able to compare the current number of attendees to the number of chairs in your apartment.

If we were to write classes for each of these problems, we might do something like the following:

  class visitedURLs {
    ...
    visitedURLs addURL(String newURL) {
      ...
    }

    // determine whether URL has already been visited
    boolean haveVisited(String aURL) {
      ...
    }
  }

  class Words {
    ...
    Words addWord(String newWord) {
      ...
    }

    int numWords() {
      ...
    }
  }

  class Guests {
    ...
    Guests addGuest(String name) {
      ...
    }

    boolean checkChairs(int numChairs) {
      ...
    }
  }

Hopefully, you would quickly realize that you were writing the same fundamental code several times. Hopefully, you’ve developed the instinct not to do that. So what should you do about it? So far, the answer you’ve learned might be "make an abstract class". But that doesn’t seem a good idea in this case. Abstract classes are used to share code across related data in the same domain. Different collections of strings aren’t related in the same way that different kinds of animals are. Abstract classes should only be used to share common code across classes for data that are meaningfully related in the real world.

In this case, you have really hit on the need for a data structure: a pattern of data with standard operations (such as adding members or counting members) that can be used in many contexts. In 1101/2, you studied three fundamental data structures: structures, lists, and trees. Up until now, you have focused on the different shapes of data: fixed components (structures) versus arbitrary components (lists), or single next elements (lists) versus multiple next elements (trees, with variations for fixed and arbitrary numbers of next elements). Data structures involve constraints beyond just the shape of data though. 2102 focuses on these additional constraints.

2 Classes as Data Structures

Let’s get concrete again. Our three motivating problems indicate that we need a data structure that supports several basic operations:

A notion of a collection of items
The ability to add and remove items to the collection
The ability to count items in the collection
The ability to check whether an item is in the collection

A data structure with these operations is called a Set. Ideally, we would like to have a type for sets, and then define each of visitedURLs, game answers, and guest list as a set.

You might consider making a Set class that has the general operations that you need. Something like:

  class Set {
    <Fields to hold the data>

    <Constructor>

    Set addElt(String newElt) { ... }
    Set remElt(String newElt) { ... }
    int size() { ... }
    boolean hasElt(String elt) { ... }
  }

Given this Set class, you could implement your desired classes along the following lines:

  class visitedURLs {
    Set theURLs;

    visitedURLs addURL(String newURL) {
      return this.theURLs.addElt(newURL);
    }

    // determine whether URL has already been visited
    boolean haveVisited(String aURL) {
      return this.theURLs.hasElt(aURL);
    }
  }

  class Guests {
    Set guests;

    Guests addGuest(String name) {
      return this.guests.addElt(name);
    }

    boolean checkChairs(int numChairs) {
      return numChairs >= this.guests.size();
    }
  }

Stop and think: is this a good solution? What do you like or dislike about it?

3 From Data Structures to Abstract Datatypes

Return to our three motivating problems again: they all require the same operations, but different operations matter more in these different applications:

The set of previously-visited URLs in a typical browser is huge. If the user wants to style or color visited URLs differently from unvisited ones (common), the browser needs to quickly search the set for each URL in the page it is rendering.
When scoring the word game, the ability to get the longest word a player found is important (as the long words are worth more points). Searching doesn’t happen often in this application.
The party list is likely small, so the performance of particular operations probably doesn’t matter as much.

Thus, while the collection of operations we need for each problem is similar, the detailed implementations that provide the best performance for each context could be rather different. A class, however, will fix one implementation. We would be much better off creating an interface for operations on sets, and let the author of each application choose the implementation that fits the current size of their data (note that this choice may change over time!).

3.1 An Interface for Sets

The following interface captures the required operations on sets:

  interface ISet {
    ISet addElt(String newElt) ;
    ISet remElt(String newElt) ;
    int size();
    boolean hasElt(String elt);
  }

Note that using the ISet interface, rather than the Set class has little impact on how we implement the classes for our motivating problems. We simply change the type of the field, as shown below for the visitedURLs example:

  class visitedURLs {
    ISet theURLs;  // the only change is in this line

    // Constructor omitted

    // mark a new URL as visited
    visitedURLs addURL(String newURL) {
      return this.theURLs.addElt(newURL);
    }

    // determine whether URL has already been visited
    boolean haveVisited(String aURL) {
      return this.theURLs.hasElt(aURL);
    }
  }

3.2 Abstract Datatypes, a First Definition

We have just introduced the distinction between a data structure and an abstract datatype. A data structure is a specific set of fields and algorithms for manipulating data in those fields to implement desired operations. An abstract datatype (abbreviated ADT throughout computer science) is a collection of operations (with the types of their inputs and outputs) but without fixing implementations. The term "abstract" reflects the lack of concrete implementations.

Even though you may only have a vague sense of abstract datatypes at this point, you should have an initial sense that data structures are implemented with concrete details (in Java, this means in classes), while abstract datatypes simply summarize names and types of operations (in Java, this means interfaces). The full story is more subtle, but you should have at least that intuition for now.

The structures/classes, lists, and trees that you learned about in 1101 are all data structures. So far, the only abstract datatype we’ve seen is for sets.

3.3 Different Flavors of Similar Datatypes

Now, let’s consider another motivating problem: tracking votes. We want to determine both how many votes were cast and how many votes were cast for each candidate. Votes are also just collections of words, so we should be able to represent votes as an ISet as well.

Stop and think: what should the result of the following expression be for each of the party guests and votes problems (assuming in one case Elmo has been invited to the party and in the other we are voting for Elmo)?

ISet s = new Set();
s.addElt("Elmo").addElt("Elmo").size()

We want this expression to yield different answers in these two problems. For votes, we need to record each addition of "Elmo" as a distinct element in the set. For counting party guests, we only want to count Elmo once.

This suggests that there is more to data types than just the operations they support. We also need to know something about other properties of the datatype, as reflected in interactions between the operations.

3.4 Properties and Axioms on Datatypes

Our examples (including votes) point to two different datatypes:

Sets, which are collections of elements with no duplicates.
Bags, which are collections of elements that include duplicates.

We need to build this distinction into the datatypes. But how?

One way is with an informal comment such as the one given above: we simply summarize properties of the datatype in prose.

In addition, however, we should also be clear about the impact of the properties on the datatype’s operations (such as whether adding an element should change the number of elements in the data structure). We do this by including logical statements about the interaction of operations under the properties; these statements are called axioms.

3.5 Abstract Datatypes, a More Complete Definition

The following shows the full ADT description for Sets:

-------------------------------------------
Name: Set
Operations:
  addElt : Set element -> Set
  remElt : Set element -> Set
  size : Set -> integer
  hasElt : Set element -> boolean
Properties:
  - The set has no duplicates
  - The items within the set are unordered
Axioms
  // on hasElt and addElt
  - if hasElt(S,e) then addElt(S,e)) = S
  - hasElt(addElt(S,e),e) = true
  - if hasElt(S,e) and e!=f then hasElt(addElt(S,f),e)

  // on hasElt and remElt
  - if (not hasElt(S,e)) then remElt(S,e) = S
  - hasElt(remElt(S,e),e) = false
  - if hasElt(S,e) and e!=f then hasElt(remElt(S,f),e)

  // on addElt/remElt and size
  - if (not hasElt(S,e)) then size(addElt(S,e)) = size(S) + 1
  - if hasElt(S,e) then size(remElt(S,e)) = size(S) - 1

  // on addElt and remElt
  - remElt(addElt(S,e),e) = S
-------------------------------------------

The axioms are grouped by the operations whose interactions they capture. The order of axioms is not relevant. While axioms may look like test cases, they are not. Test cases are over concrete data, while axioms have variables. Axioms are, however, useful blueprints for test cases.

Keep in mind that ADT descriptions are just documentation, not code. We write properties and axioms in a general form that can be adapted to working in any programming language.

3.5.1 The Relationship between Properties and Axioms

Axioms capture how operations reflect properties of a datatype. They do not necessarily capture all properties though. Consider the property that elements are unordered – we don’t capture that anywhere in the axioms. We can’t, because the axioms are written against the operations and the operations don’t say anything about order. That’s worth repeating more directly: some properties are implicit in the operations provided in the ADT. You cannot capture such properties in axioms.

3.6 The ADT for Bags

For reference, here is the ADT for Bags. Bags lose the duplicates property; the axioms change accordingly. Note that this ADT for bags assume that remove takes out 1 occurrence of the element, not all occurrences.

-------------------------------------------
Name: Bag
Operations:
  addElt : Bag element -> Bag
  remElt : Bag element -> Bag
  size : Bag -> integer
  hasElt : Bag element -> boolean
Properties:
  - The items within the bag are unordered
Axioms
  // on hasElt and addElt
  - hasElt(addElt(S,e),e) = true
  - if hasElt(S,e) and e!=f then hasElt(addElt(S,f),e)

  // on hasElt and remElt
  - if (not hasElt(S,e)) then remElt(S,e) = S
  - if hasElt(S,e) and e!=f then hasElt(remElt(S,f),e)

  // on addElt/remElt and size
  - size(addElt(S,e)) = size(S) + 1
  - size(remElt(S,e)) = size(S) - 1

  // on addElt and remElt
  - remElt(addElt(S,e),e) = S

  // on addElt and remElt and hasElt
  - hasElt(remElt(addElt(addElt(S,e)),e),e) = true
-------------------------------------------

4 Where Do the Pieces of ADTs go in Java?

ADTs are independent of programming language. When you program in a new language, you need to figure out how that language captures the pieces of an ADT. Java interfaces capture the name and operations of an ADT. The properties are not put into code: they go in comments or other forms of documentation. The axioms get built into the code that implements the ADT and into test cases.

Some languages provide more support for building axioms into programs than Java (just as Java provides more support for interfaces than Racket does). It is important to understand what facilities each language you work in provides for capturing these concepts. Whatever your language doesn’t capture explicitly becomes your responsibility to document and test extensively.

5 Summary

This lecture shifted us from learning Java constructs into more abstract computer science concepts. In particular, we saw:

Separating specification from implementation is one of the most fundamental concepts in engineering: there are many specific approaches to a problem, each meeting different constraints. In Computer Science, typical constraints are performance, power, or memory usage.
When we look at a programming problem, we need to think about the data we need, the operations we need to perform on that data, which operations need optimal performance for our context, and what detailed behavior operations must provide.
Java interfaces capture operations and their types. Java classes capture different implementations of operations (which implies different performance of the operations). Abstract datatypes capture required behavior on operations.
Two ADTs can have the same operations, but different behavioral requirements. The Java interfaces for these two ADTs may appear the same, but their documentation should differ. Sets and Bags are a good example of this.
To identify the axioms for an ADT, (1) consider each pair of operations and ask how they must interact to satisfy the properties; (2) for each property, think about how you might see it reflected in interactions between operations.
Data structures provide a specific implementation of an abstract data type). We will see much more of this concept in the coming lectures.

1	Three Motivating Problems
2	Classes as Data Structures
3	From Data Structures to Abstract Datatypes
4	Where Do the Pieces of ADTs go in Java?
5	Summary