Perspective and Wrap Up
We’ve covered a lot of material this term. Let’s try to put it in context.
1 Datatypes
                 1101/2     2102               |
---------------------------------------------- |
Compounds        structs    classes    (core)  |
Lists            X          X          (core)  |
Trees            X          X          (core)  |
BSTs                        X                  |
Balanced BSTs               X                  |
Heaps                       X                  |
Graphs                      X          (core)  |
Hashtables                  X                  |
We’ve marked the "core" data structures, each of which supports a fundamentally different shape of data.
2102 Themes:
A new core data structure, graphs, which you could not handle coming out of 1101. Graphs allow cyclic data.
Data structures that support the same basic shapes as core data structures, but provide better performance on certain operations. These could be replaced with core data structures, but at a computational cost. Examples: BSTs, Balanced BSTs, Heaps, Hashtables.
Using invariants to define new data structures through constraints on core data structures (i.e., capturing "balanced" on search trees).
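To make the idea of an invariant concrete, here is a minimal sketch of a binary search tree defined as a plain binary tree plus a constraint. The names Node, BstInvariant, and isBst are made up for this illustration and are not the classes we wrote in the course; a "balanced" invariant could be checked in the same style.

```java
// Hypothetical names (Node, BstInvariant, isBst); not taken from the course code.
final class Node {
    final int key;
    final Node left;   // null means "no subtree"
    final Node right;

    Node(int key, Node left, Node right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

final class BstInvariant {
    // A binary tree is a BST when every key in a left subtree is smaller than
    // the node's key and every key in a right subtree is larger.
    static boolean isBst(Node root) {
        return isBst(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }

    // Checks that all keys in the tree lie strictly between low and high.
    private static boolean isBst(Node n, long low, long high) {
        if (n == null) return true;
        if (n.key <= low || n.key >= high) return false;
        return isBst(n.left, low, n.key) && isBst(n.right, n.key, high);
    }
}
```

The tree itself is an ordinary core data structure; only the check distinguishes a BST from any other binary tree.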
2 Types and Interfaces
We introduced Java interfaces as a way to write down types (since Java requires types), but also showed them as a powerful way to separate implementation from specification. When you are writing code for others to use, give them an interface and maintain control of the implementation. This lets you maintain the code, or change the implementation over time, without affecting code written against the interface.
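As a small illustration of separating specification from implementation, consider the sketch below. The names IntSet and ListSet are assumptions made for this example, not code from the course.

```java
import java.util.ArrayList;
import java.util.List;

// The specification: clients program against this interface only.
interface IntSet {
    void add(int elt);
    boolean contains(int elt);
    int size();
}

// One possible implementation, backed by a list that never holds duplicates.
// The list is private, so clients cannot come to depend on it.
final class ListSet implements IntSet {
    private final List<Integer> elts = new ArrayList<>();

    public void add(int elt) {
        if (!elts.contains(elt)) elts.add(elt);
    }

    public boolean contains(int elt) { return elts.contains(elt); }

    public int size() { return elts.size(); }
}
```

A client that writes `IntSet s = new ListSet();` mentions the implementation in exactly one place, so swapping in, say, a BST-backed class later requires changing only that one constructor call.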
3 Controlling Data Access
1101/2 didn’t cover this material at all. We discussed the access modifiers (public, private, and so on), as well as abstract and static, as ways to constrain how data may be disseminated, created, and used.
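The sketch below shows how these keywords can work together in one small class hierarchy. Account and CheckingAccount are hypothetical names chosen for the illustration.

```java
// Hypothetical example; Account and its members are illustrative names.
abstract class Account {
    private static int nextId = 1;   // static: one counter shared by all accounts
    private final int id;            // private: only code inside Account can read or set it
    private int balance;             // private: changed only through deposit

    // protected: only subclasses (and this package) can run the constructor
    protected Account(int openingBalance) {
        this.id = nextId++;
        this.balance = openingBalance;
    }

    public int getId() { return id; }
    public int getBalance() { return balance; }

    // public: the one sanctioned way to change the balance
    public void deposit(int amount) {
        if (amount <= 0) throw new IllegalArgumentException("amount must be positive");
        balance += amount;
    }

    // abstract: every concrete kind of account must supply this
    public abstract int monthlyFee();
}

final class CheckingAccount extends Account {
    CheckingAccount(int openingBalance) { super(openingBalance); }

    @Override
    public int monthlyFee() { return 5; }
}
```

Because Account is abstract, nobody can create a bare Account; because balance is private, nobody can set it to an arbitrary value without going through deposit.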
4 Abstracting and Sharing Code
Sharing code for common computation helps you create clean code that is easier to maintain. It also helps you define libraries for other people to use (you implement the core data structure, and let people operate over your data structure through pre-defined traversal operations).
                     1101/2            2102                      |
---------------------------------------------------------------- |
over data            helper function   helper method             |
---------------------------------------------------------------- |
over function        function as arg   class containing method   |
on atomic data                         to pass as arg            |
---------------------------------------------------------------- |
over function on     function as arg   visitor                   |
data with variants                                               |
---------------------------------------------------------------- |
over types                             generics                  |
"Data with variants" here refers to any type of data that has multiple concrete classes, such as Animals (having boas and dillos) or media items (having DVDs and books).
2102 Themes:
Because Java requires all data to lie in classes, we cannot simply pass a function as an argument. We have to wrap it up in an object and pass the object instead.
Because OO distributes functions over data (unlike Racket, which kept code for all variants together), we need a more complicated technique, the visitor, to pass a function over data with variants.
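Both themes can be sketched in a few lines. The first half wraps a function on atomic data in an object; the second half uses a visitor over animals. All names (IIntTest, IsEven, IntFilter, IAnimalVisitor, IsNormalSize, and the size ranges) are assumptions for this illustration and may differ from the versions we wrote in lecture.

```java
import java.util.ArrayList;
import java.util.List;

// --- A function on atomic data, wrapped in an object so it can be passed ---
interface IIntTest { boolean test(int n); }

final class IsEven implements IIntTest {
    public boolean test(int n) { return n % 2 == 0; }
}

final class IntFilter {
    // Keeps the elements of xs for which keep.test returns true.
    static List<Integer> filter(List<Integer> xs, IIntTest keep) {
        List<Integer> result = new ArrayList<>();
        for (int x : xs) {
            if (keep.test(x)) result.add(x);
        }
        return result;
    }
}

// --- A function on data with variants: the visitor ---
// One method per variant; the generic parameter R abstracts over the result type.
interface IAnimalVisitor<R> {
    R visitBoa(Boa b);
    R visitDillo(Dillo d);
}

interface IAnimal {
    <R> R accept(IAnimalVisitor<R> v);
}

final class Boa implements IAnimal {
    final String name;
    final int length;
    Boa(String name, int length) { this.name = name; this.length = length; }
    // Each variant tells the visitor which case applies.
    public <R> R accept(IAnimalVisitor<R> v) { return v.visitBoa(this); }
}

final class Dillo implements IAnimal {
    final int length;
    final boolean isDead;
    Dillo(int length, boolean isDead) { this.length = length; this.isDead = isDead; }
    public <R> R accept(IAnimalVisitor<R> v) { return v.visitDillo(this); }
}

// The "function" being passed: one object holding the code for every variant.
// The size ranges are made up for the example.
final class IsNormalSize implements IAnimalVisitor<Boolean> {
    public Boolean visitBoa(Boa b) { return b.length >= 5 && b.length <= 10; }
    public Boolean visitDillo(Dillo d) { return d.length >= 2 && d.length <= 3; }
}
```

Note how the visitor brings the code for all variants back into one class, much as a single Racket function with a cond would, while the animal classes themselves only need the one accept method.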
5 Testing
                          1101/2    2102                   |
---------------------------------------------------------- |
Designing test cases      X         X                      |
Testing for correctness   X         X                      |
Testing functions with              write test methods     |
multiple right answers                                     |
Testing large data                  write test methods     |
Testing for invariants              use test harness       |
Testing under mutation              setup and teardown     |
2102 Themes:
While test cases still need to state both inputs and expected answers, it is infeasible to write concrete expected answers for problems with large or multiple correct outputs. In these cases, we should write methods that check for properties of the returned data.
When we have to check multiple properties of data (as when we test for invariants), it is better to write a set of methods, rather than just one method, to check the output. Test harnesses organize a set of test methods and run them on each given output.
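The following sketch shows both ideas on a function with multiple right answers: when two words tie for longest, either is acceptable, so the test checks properties of the output instead of comparing against one expected value. It uses plain Java rather than a testing library, and every name in it (Longest, IOutputCheck, CameFromInput, NothingLonger, Harness) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class Longest {
    // Function under test: returns some longest word from a non-empty list.
    // Ties may be broken either way.
    static String longestWord(List<String> words) {
        String best = words.get(0);
        for (String w : words) {
            if (w.length() > best.length()) best = w;
        }
        return best;
    }
}

// One property that a correct output must satisfy.
interface IOutputCheck {
    boolean holds(List<String> input, String output);
}

// Property 1: the answer is actually one of the input words.
final class CameFromInput implements IOutputCheck {
    public boolean holds(List<String> input, String output) {
        return input.contains(output);
    }
}

// Property 2: no input word is longer than the answer.
final class NothingLonger implements IOutputCheck {
    public boolean holds(List<String> input, String output) {
        for (String w : input) {
            if (w.length() > output.length()) return false;
        }
        return true;
    }
}

// A tiny harness: runs every registered check on one input/output pair.
final class Harness {
    private final List<IOutputCheck> checks = new ArrayList<>();

    Harness add(IOutputCheck c) {
        checks.add(c);
        return this;
    }

    boolean allHold(List<String> input, String output) {
        for (IOutputCheck c : checks) {
            if (!c.holds(input, output)) return false;
        }
        return true;
    }
}
```

With the input "boa", "dillo", "tiger", "ox", the harness accepts both "dillo" and "tiger" and rejects "boa", which is exactly the behavior a single expected-answer test cannot express.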
6 Algorithmic Techniques
                             1101/2    2102     |
----------------------------------------------- |
Terminating on cyclic data             X        |
Memoization                            X        |
2102 Themes:
Once your data can be cyclic, programs might not terminate. We need to maintain additional data (such as a record of what has already been visited) to write programs that terminate, and we need to be able to argue that the way we use and maintain that data is sufficient to guarantee termination.
On expensive computations, it sometimes makes sense to cache previous results and reuse them later. Memoization is a general technique for doing this. It applies to individual functions.
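Both techniques can be sketched briefly. The City class below searches a possibly cyclic graph while tracking visited nodes, and the Fib class memoizes a single function. The names and the choice of Fibonacci are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative names; not the course's graph classes.
final class City {
    final String name;
    final List<City> flightsTo = new ArrayList<>();

    City(String name) { this.name = name; }

    // Is there a route from this city to dest?
    boolean hasRoute(City dest) {
        return hasRoute(dest, new HashSet<>());
    }

    // Termination argument: a city is added to visited before its neighbors
    // are explored, and a city already in visited is never explored again.
    // With finitely many cities, only finitely many calls do any work.
    private boolean hasRoute(City dest, Set<City> visited) {
        if (this == dest) return true;
        if (!visited.add(this)) return false;   // add returns false if already present
        for (City next : flightsTo) {
            if (next.hasRoute(dest, visited)) return true;
        }
        return false;
    }
}

final class Fib {
    // Cache of previously computed results, keyed by the argument.
    private static final Map<Integer, Long> cache = new HashMap<>();

    // Memoized Fibonacci: each result is computed once, then reused.
    static long fib(int n) {
        if (n < 2) return n;
        Long cached = cache.get(n);
        if (cached != null) return cached;
        long result = fib(n - 1) + fib(n - 2);
        cache.put(n, result);
        return result;
    }
}
```

Without the visited set, a cycle such as Boston to Chicago to Denver and back to Boston would make a search for an unreachable city recur forever; without the cache, fib(50) would repeat the same subcomputations billions of times.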
7 Working with Mutation
Our recent segment on mutation was designed to show you how using mutation puts additional responsibility on programmers: to argue for termination, to avoid destroying infrastructure during testing, and to backtrack computations properly. The goal was not to say that mutation is evil, but rather that it is a fairly advanced topic.