1 Java Classes and Operations on Hashtables
2 Using Hash Maps
3 Defining Equality on Keys
4 Summary

Hashtables

Kathi Fisler

Our bank from the last few lectures has grown to have hundreds of thousands of customers and accounts. The system has long since outgrown the LinkedList data structure for these data; even the AVL-tree based implementation has gotten too slow for looking up customers. Ideally, the bank needs data structures that provide even faster techniques for getting from account numbers to Account objects and from names to Customers.

This lecture is about hashtables, a data structure that can perform lookups in near-constant time. A hashtable maps keys to values without traversing a data structure over all the keys. The diagram at the top right of the Wikipedia entry on hashtables illustrates the concept nicely. Every hashtable has a fixed number of "buckets" to which it maps keys. If you have more buckets than actual keys, hashtables provide a perfect match between keys and data. If you have more keys than buckets, keys sometimes collide, costing the accuracy of the data retrieved from the table. We won’t talk about hashtables in depth here, as you’ll see them in detail in Algorithms.

We will study hashtables in the context of providing a better data structure for implementing the IAcctSet interface in our banking system.

1 Java Classes and Operations on Hashtables

Java provides two data structures for hashtables: one called Hashtable and one called HashMap. HashMap is generally preferred, unless you have to deal with threads and synchronization (not a topic for this course). Hashtable is a legacy data structure from earlier versions of Java. We will therefore use the HashMap classes in Java as our implementation of the general hashtable data structure.

Hashtables are built into Java. To use them, include the line

  import java.util.HashMap;

at the top of your file.

HashMap is a generic class, defined over type variables for the key (K) and value (V). The most useful HashMap<K,V> methods are:

For the full list of HashMap methods, see the Java HashMap documentation.

2 Using HashMaps

Here’s a simple AcctSetHM class that implements IAccountSet using a HashMap:

  class AcctSetHM implements IAccountSet {

    private HashMap<Integer,Account> accounts = new HashMap<Integer,Account>();

  

    public Account findByNum(int num) {

      return accounts.get(num);

    }

  }

While this has the right spirit, it fails to properly handle cases in which the given account number is not found. The above description of the get method indicated that it returns null if the requested key isn’t in the hashtable. A null value should not leak out of findByNum. Instead, findByNum should check for it, and throw an exception if necessary:

  class AcctSetHM implements IAccountSet {

    private HashMap<Integer,Account> accounts = new HashMap<Integer,Account>();

  

    public Account findByNum(int num) throws AcctNotFoundException {

      Account acct = accounts.get(num);

      if (null == acct)

        throw new AcctNotFoundException(num);

      return acct;

    }

  }

For an example of put, we need a method that adds new accounts or new customers, neither of which we have including in the banking system to date. Where should we put a method to add new customers?

The method in BankingService might appear as follows:

  class BankingService {

    ...

  

    // create a new customer with an initial password and add to the customer set

    public Customer newCustomer (String custname) {

      Customer newC = new Customer(custname, this.genInitPwd());

      Customer oldC = this.customers.add(newC);

      return newC;

    }

  

    // generates default passwords for new customers.  Would not

    // generate the same password for all customers in a real system

    // Note: this is your first example of a private method

    private int genInitPwd() {

      return 1234;

    }

  }

For this to work, we have to add the add method to the ICustSet interface and each of its implementing classes.

  interface ICustSet {

    Customer findByName(String name) throws CustNotFoundException;

    Customer add(Customer newC);

  }

  

  class CustSetHM implements ICustSet {

    private HashMap<String,Customer> customers = new HashMap<String,Customer>();

  

    public Customer add(Customer newC) {

      return this.customers.put(newC.getUniqueID(), newC);

    }

  }

As we did when we called get in findByNum, we should augment this method to check for the null return value and throw an exception if a customer with this name already exists.

3 Defining Equality on Keys

Hashmaps store and retrieve values based on the keys. For hashmaps to work, Java must be able to tell whether data of the key type for a hashmap are the same. When we use integers or Strings as keys, Java can easily determine whether two keys are equal. What if we instead want to use objects of some class that we defined as the keys? For example, we might choose to use a hashmap to retrieve the accounts affiliated with each customer, rather than store a list of those accounts inside the customer.

If you need to create a hashmap using some class that you defined as the key, you need to tell Java two things:

You do this by adding two methods to classes that you will use as keys: equals takes another object of the same type and returns a boolean if they should be considered the same object; hashcode returns an int that must be unique (to high probability) across all the objects in the class. If you are working in Eclipse, there is a menu item that helps you generate these methods based on essential fields in your class. Items 7 and 8 in Chapter 3 of Effective Java provide a method for writing your own hashcode methods (this one chapter is available online.

4 Summary

Hashtables and hashmaps are a particular data structure that implement a more general ADT called a dictionary or map. These ADTs are used to map from keys to values. Hashtables and hashmaps are particularly fast implementations that achieves performance by converting keys to positive integers that are largely unique across the set of hashed items. Hashmaps themselves use these positive integers to directly access the locations where data is stored in memory (rather than having to search through memory for values).