Due: Monday, Feburary 20, 11:59pm via turnin (assignment
name hwk4)
Collaboration Policy:
Pairs Permitted (but see the warning two paragraphs down)
In this assignment, you will implement two garbage collectors: mark & sweep, and stop & copy. As we have seen, garbage collection involves starting from a root set of references, which reside in local variables and stack frames, and searching for reachable values on the heap. We are providing support code that gives your collector access to a running Scheme program's root set. You just have to implement the memory allocator and the garbage-collection algorithms.
Note that even though this is a pairs assignment, we do not want to see pairs that each write one collector and turn them in under both names. The contrasts behind the collectors are the point of this exercise. If the styles of your two collectors are sufficiently different so as to suggest that both partners didn't collaborate on both collectors, we reserve the right to deduct substantial points from each of you.
For this assignment, you will no longer be using the PLAI language
levels. Instead, set the language level to (module ...).
Download the support code (as tar.gz or zip) for the garbage collector. This includes several files:
require this file into your allocator.
It defines the following abstractions:
(get-root-set id ...) (SYNTAX) : produces a list
representing the current set of roots, including an entry for each
additional id you provide (explained below) (root-name root) : given a root, returns the
name of the root as a symbol. If the root corresponds to a program
variable, it will have the same name. Otherwise, it will be
tmpXXX . This procedure is only provided to help you debug
your collector.
(read-root root) : given a root, returns the
location to which it currently refers (set-root! root new-loc) : given a root and a
location, updates the root to refer to the new locationThe support code provides a trivial (sample) allocator that does not perform garbage collection (i.e., it just gives up after running out of memory). You will need to provide your own allocator modules—one for each collection algorithm—that implement the same interface. The interface is as follows:
gc:set-heap-size! : num -> () | This procedure should set the size of the heap to the given number. This is the first thing any mutator must do, so no allocation should be possible until this call is made. |
gc:alloc-flat : num U sym U bool U empty ->
loc | This procedure should allocate a flat Scheme value (number, symbol, boolean, or empty list) on the heap, returning its location (a number). The value should occupy a single heap cell, though you may use additional space to store a tag, etc. You are also welcome to pre-allocate common constants (e.g., the empty list). This procedure may need to perform a garbage-collection. If there is still insufficient space, it should raise an error. |
gc:number? : loc -> boolean | This procedure should return true iff the content of the given location is a number. |
gc:symbol? : loc -> boolean | This procedure should return true iff the content of the given location is a symbol. |
gc:boolean? : loc -> boolean | This procedure should return true iff the content of the given location is a boolean. |
gc:empty? : loc -> boolean | This procedure should return true iff the content of the given location is an empty list. |
gc:deref : loc -> num U sym U bool U
empty | Given the location of a flat Scheme value, this
procedure should return that value; if the location points instead to
a cons cell, this function should raise an error. Note
that this function may get called with the boolean value returned by a
recognizer, instead of a location. Just return the same boolean value
in that case.
|
gc:cons : loc loc -> loc | This
procedure should allocate a cons cell on the heap, whose
first and rest are the given arguments.
These fields must be stored in separate heap cells, and you may use
additional space for a tag, etc. As with alloc-flat,
this procedure may need to perform a garbage-collection, and if there
is still insufficient space it should raise an error.
|
gc:cons? : loc -> boolean |
This should return true iff the location refers to a cons cell
|
gc:first : loc -> loc | If the
given loc refers to a cons cell, this should
return the first field. Otherwise, it should raise an
error.
|
gc:rest : loc -> loc | If the
given loc refers to a cons cell, this should
return the rest field. Otherwise, it should raise an
error.
|
gc:loc->scheme-val : loc -> any |
For debugging: this procedure consumes a location and returns a Scheme value corresponding to the contents of the location. |
get-root-set, allocate its data on your allocator's heap,
and use the versions of primitives defined in your allocator. (When
the mutator refers to procedures like cons or
symbol?, the calls are automatically redirected to the
gc:cons and gc:symbol? defined in your
allocator.)
module defined in the make-mutator.ss
language. It declares a heap size of 100 cells, then defines and
applies some simple functions, printing out the results. You will
want to test against this module, probably with smaller heap sizes,
and also to write your own mutators. When you write your own mutator,
make sure the first thing you do is to set the heap size by calling
(heap-size size).
Allocator.ss and sample-mutator.ss are examples of the files you are expected to submit for this assignment. The mutator file requires (loads) the allocator file (thus letting you control which collector gets used for a set of tests). The others are support files that the other two reference as shown in the examples. You do not need to understand how the support files work to do this assignment.
get-root-set to
obtain the current set of roots. In gc:alloc-flat, you
should not list any additional identifiers. However, in
gc:cons, you must provide the arguments (the first and
rest fields), since these are roots. (If you don't list them, you
will need to treat them specially and be sure to update them with
set!).
In the mark & sweep collector, you should maintain a free list in the heap. That is, you should not use any auxiliary data structure; instead, use the available space in the heap to keep track of the free list. You may use one extra box (“register”) to point to the start of the free list. You may need to adjust your allocation accordingly to have enough space to maintain free list pointers!
You will probably want to use numbers to represent locations, but
you may use another representation if it eases your implementation.
Also, to help see what is happening, you might want to print a
representation of the heap before and after every garbage collection.
You can use the Scheme printf function:
(printf "Heap is ~a~n" Heap)
Some final words of advice:
Submit three files: (1) mark-sweep.ss, (2) stop-copy.ss, and (3) tests.ss. The first two contain the allocator modules that implement the corresponding collection algorithms. The third contains the mutator module with your test cases for the collectors. Document each test with a comment describing what behavior the test is demonstrating. Your tests file can load either of your collectorss (doesn't matter which one at the time of submission), but we will assume that you ran the test file against each collector before submitting.
In general, you will get more partial credit for one working collector rather than two barely working collectors. If you run out of time to write both, focus on one.
get-root-set seems to return duplicate roots. Is
this a bug in the support code? No, this is the intended behavior.
Roots come from walking the stack, and several frames may refer to the
same variable. Moreover, different variables may refer to the same
heap location. You need to update all of them in your stop &
copy collector.
It would be a lot easier if I could just make all objects consume the same amount of space. Would that be acceptable? No! We've been extremely easy-going in only making you handle two sizes. To reduce it to one size would make the problem trivial and unrealistic.
Can we create two vectors for the two spaces in stop and
copy?
Yes, this is fine.