Accumulators on Trees

Let's look at another example of accumulators on trees.  Recall the
binary number trees from the exam.  Let's write a program to sum up
the numbers in such a tree.  As usual, we start with the structural
solution:

;; sumtree : bintree -> number
;; sum the numbers in a binary tree
(define (sumtree abt)
  (cond [(empty? abt) 0]
	[else (+ (bintree-num abt)
		 (sumtree (bintree-left abt))
		 (sumtree (bintree-right abt)))]))
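To try sumtree on a concrete tree, here is a minimal sketch.  The
structure definition is an assumption: the selectors bintree-num,
bintree-left, and bintree-right come from the code above, but the
field order and the sample tree tree1 are made up for illustration.
It should run in the Intermediate Student language or in Racket.

```racket
;; ASSUMED data definition (field order is a guess):
;; a bintree is either
;;   - empty, or
;;   - (make-bintree number bintree bintree)
(define-struct bintree (num left right))

;; sumtree : bintree -> number
;; sum the numbers in a binary tree
(define (sumtree abt)
  (cond [(empty? abt) 0]
        [else (+ (bintree-num abt)
                 (sumtree (bintree-left abt))
                 (sumtree (bintree-right abt)))]))

;; hypothetical example tree:
;;        4
;;       / \
;;      2   9
;;     /   / \
;;    1   6   7
(define tree1
  (make-bintree 4
                (make-bintree 2 (make-bintree 1 empty empty) empty)
                (make-bintree 9
                              (make-bintree 6 empty empty)
                              (make-bintree 7 empty empty))))

(sumtree empty)  ; => 0
(sumtree tree1)  ; => 29
```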

We write the accumulator template as usual:

(define (sumtree abt)
  (local [;; accum : ...
	  (define (sum-accum atree accum)
	    (cond [(empty? atree) ...]
		  [else (sum-accum ...
			 (bintree-left atree) ...
			 (bintree-num atree) ...
			 accum ...)
			(sum-accum ...
			 (bintree-right atree) ...
			 (bintree-num atree) ...
			 accum ...)]))]
    (sum-accum abt ...)))

What should we use as the accumulator invariant?  Let's let the
accumulator represent the sum of the numbers in the part of the tree
that we've already visited.  Parts of this are easy to fill in:

(define (sumtree abt)
  (local [;; accum : sum of numbers in part of abt that sum-accum
	  ;;   has already visited
	  (define (sum-accum atree accum)
	    (cond [(empty? atree) accum]
		  [else (sum-accum ...
			 (bintree-left atree) ...
			 (bintree-num atree) ...
			 accum ...)
			(sum-accum ...
			 (bintree-right atree) ...
			 (bintree-num atree) ...
			 accum ...)]))]
    (sum-accum abt 0)))

What about the rest?  Notice that we have two trees, not one, to
process in the recursive call.  We could process each tree separately
and add the results.  That defeats the purpose of the accumulator,
though, because it still leaves a pending computation (the addition)
after the recursive calls.  We really want to pass the accumulated sum
from one subtree to the other, as follows:

(define (sumtree abt)
  (local [;; accum : sum of numbers in part of abt that sum-accum
	  ;;   has already visited
	  (define (sum-accum atree accum)
	    (cond [(empty? atree) accum]
		  [else (sum-accum 
			 (bintree-left atree) 
			 (sum-accum (bintree-right atree) 
				    (+ (bintree-num atree) 
				       accum)))]))]
    (sum-accum abt 0)))

This style, where we pass the result of calling the accumulator
function on one branch of the tree as the accumulator for the call
that processes the other branch, is the standard method for writing
accumulator programs over trees.  When you write an accumulator
program over trees, you should expect to pass the result of processing
one subtree to the call that processes the other subtree(s).  This
style of programming is sometimes called threading.
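To see the threading concretely, the sketch below hand-evaluates the
accumulator version on a three-node tree.  The define-struct line
(including its field order) and the names left-leaf, right-leaf, and
small are assumptions made for illustration.

```racket
;; ASSUMED structure definition (field order is a guess)
(define-struct bintree (num left right))

;; sumtree : bintree -> number
(define (sumtree abt)
  (local [;; accum : sum of numbers in part of abt that sum-accum
          ;;   has already visited
          (define (sum-accum atree accum)
            (cond [(empty? atree) accum]
                  [else (sum-accum
                         (bintree-left atree)
                         (sum-accum (bintree-right atree)
                                    (+ (bintree-num atree) accum)))]))]
    (sum-accum abt 0)))

;; hypothetical example:   1
;;                        / \
;;                       2   3
(define left-leaf  (make-bintree 2 empty empty))
(define right-leaf (make-bintree 3 empty empty))
(define small (make-bintree 1 left-leaf right-leaf))

;; Hand evaluation (each line equals the one before it):
;;   (sum-accum small 0)
;; = (sum-accum left-leaf (sum-accum right-leaf (+ 1 0)))
;; = (sum-accum left-leaf (sum-accum right-leaf 1))
;; = (sum-accum left-leaf (sum-accum empty (sum-accum empty (+ 3 1))))
;; = (sum-accum left-leaf 4)
;; = (sum-accum empty (sum-accum empty (+ 2 4)))
;; = 6
(sumtree small) ; => 6
```

Notice that the root's number is added first, then the whole right
subtree is summed, and only then does the running total (4) flow into
the left subtree.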

Let's look at one more program on binary trees.  We want to gather a
list of all elements in a binary tree that are larger than 5.  The
structural program looks as follows:

;; get-larger : bintree -> (listof number)
;; returns list of all numbers in tree that are larger than 5
(define (get-larger abt)
  (cond [(empty? abt) empty]
	[else
	 (local [(define inrest (append (get-larger (bintree-left abt))
					(get-larger (bintree-right abt))))]
		(cond [(> (bintree-num abt) 5)
		       (cons (bintree-num abt) inrest)]
		      [else inrest]))]))

This time, however, let's accumulate something other than the list of
numbers in other subtrees that are larger than 5.  Recall that
accumulators carry knowledge about how a program executes to other
calls to that program.  In previous examples, we've accumulated the
output of a program.  This time, let's accumulate the other trees that
are yet to be processed.  This demonstrates another way that we might
use an accumulator in a program.  As usual, we start with the template.

(define (get-larger abt)
  (local [;;accum : list of subtrees of abt remaining to be processed
	  (define (get-larger-accum atree accum)
	    (cond [(empty? atree) ...]
		  [else
		   (local ((define inrest
			     (get-larger-accum
			      (bintree-left atree) ...
			      (bintree-right atree) ...
			      (bintree-num atree) ... accum)))
		     ... inrest ...)]))]
    (get-larger-accum abt empty)))

How do we fill in the program?  The else case is fairly easy: it
follows the same strategy as in the structural program, except that it
recurs on the left subtree only and adds the right subtree to the
accumulator.  The empty case here is harder.  Once the tree is empty,
we begin to process the accumulated trees that have not yet been
processed.  The following code shows the final program:

(define (get-larger abt)
  (local [;;accum : list of subtrees of abt remaining to be processed
	  (define (get-larger-accum atree accum)
	    (cond [(empty? atree)
		   (cond [(empty? accum) empty]
			 [else (get-larger-accum
				(first accum) (rest accum))])]
		  [else
		   (local ((define inrest
			     (get-larger-accum
			      (bintree-left atree)
			      (cons (bintree-right atree) accum))))
			  (cond [(> (bintree-num atree) 5)
				 (cons (bintree-num atree) inrest)]
				[else inrest]))]))]
    (get-larger-accum abt empty)))

We show you this to emphasize a point: accumulators do not always
gather partial computations of the result of a program.  Accumulators
simply pass knowledge about one computation to another computation.
The challenge in writing accumulator-style programs is figuring out
what the accumulator represents, and what invariant holds for that
accumulator.  Developing that invariant is the key to writing good
accumulator-style programs.
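To watch the invariant at work, the sketch below runs get-larger on a
small tree and lists the arguments of each call to get-larger-accum.
The define-struct line (including its field order) and the sample
trees are assumptions made for illustration.

```racket
;; ASSUMED structure definition (field order is a guess)
(define-struct bintree (num left right))

;; get-larger : bintree -> (listof number)
(define (get-larger abt)
  (local [;; accum : list of subtrees of abt remaining to be processed
          (define (get-larger-accum atree accum)
            (cond [(empty? atree)
                   (cond [(empty? accum) empty]
                         [else (get-larger-accum
                                (first accum) (rest accum))])]
                  [else
                   (local [(define inrest
                             (get-larger-accum
                              (bintree-left atree)
                              (cons (bintree-right atree) accum)))]
                     (cond [(> (bintree-num atree) 5)
                            (cons (bintree-num atree) inrest)]
                           [else inrest]))]))]
    (get-larger-accum abt empty)))

;; hypothetical example:   8
;;                        / \
;;                       3   6
;;                      /
;;                     7
(define seven (make-bintree 7 empty empty))
(define three (make-bintree 3 seven empty))
(define six   (make-bintree 6 empty empty))
(define tree2 (make-bintree 8 three six))

;; Calls to get-larger-accum, in order (atree / accum):
;;   tree2 / ()
;;   three / (six)
;;   seven / (empty six)
;;   empty / (empty empty six)   ; left child of seven
;;   empty / (empty six)         ; right child of seven, taken from accum
;;   empty / (six)               ; right child of three, taken from accum
;;   six   / ()
;;   empty / (empty)             ; left child of six
;;   empty / ()                  ; nothing left to process: return empty
(get-larger tree2) ; => (list 8 7 6)
```

At every call, the accumulator holds exactly the subtrees that have
been set aside but not yet examined, which is the invariant stated in
the comment.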

Let's summarize.  We've now seen three different uses for an
accumulator:

 1. To introduce knowledge into our programs so that they can work
    properly (find-route).
 2. To make a program more efficient (reverse, available-days).
 3. To make programs easier to understand (more subjective).

Of these reasons, 1 is the most important to you right now.  Reason 2
is somewhat important, because you can notice the efficiency problems
even on small to medium-sized examples.  Reason 3 is largely a matter
of personal taste.  Sometimes writing accumulator-style programs is
easy (as for sumtree).  Sometimes, as in get-larger, it's a bit
harder.  What you should take from these lectures on accumulators is
that writing accumulator programs is easier if you write the
structural solution first, then follow the recipes to add accumulators
carefully to your programs.

Finally, let's look at one more problem on file systems.  Most
operating systems let you create shortcuts (or links) between files,
so that one file can appear in multiple directories.  Let's write a
program to detect whether there are any shared files in our
directories.  We'll assume that a file is shared if its name appears
more than once in the directory hierarchy.

;; shared-files? : dir -> boolean
;; determine whether some file name appears more than once in a
;;  directory hierarchy
(define (shared-files? adir) ...)

As usual, we start with a template.  We know that the accumulator must
hold the files that we've already seen, so let's also write the
accumulator invariant down now.

(define (shared-files? adir) 
  (local [;accum1 : list of files seen so far
	  (define (shared-accum a-dir accum1)
	    ...
	    (share/list-accum (dir-dirs a-dir) ...
			      (dir-files a-dir) ... accum1))
	  ; accum2 : list of files seen so far
	  (define (share/list-accum alod accum2)
	    (cond [(empty? alod) ...]
		  [else
		   ... (shared-accum (first alod) accum2) 
		   ... (share/list-accum (rest alod) ...
					 (first alod) ... accum2)]))]
   (shared-accum adir ...)))

As a first attempt, we might try to fill in the program as follows:

(define (shared-files? adir) 
  (local [;accum1 : list of files seen so far
	  (define (shared-accum a-dir accum1)
	    (or (overlap? (dir-files a-dir) accum1)
		(share/list-accum (dir-dirs a-dir)
				  (append (dir-files a-dir) accum1))))
	  ; accum2 : list of files seen so far
	  (define (share/list-accum alod accum2)
	    (cond [(empty? alod) false]
		  [else
		   (or (shared-accum (first alod) accum2)
		       (share/list-accum (rest alod)
					 (append (dir-files (first alod))
						 accum2)))]))]
    (shared-accum adir empty)))

However, this doesn't work, as evidenced by the test case below: p1
appears in both Fall99 and Huma101, so shared-files? should return
true, yet this version returns false.  The problem is that when we
traverse (first alod) in share/list-accum, the recursive call on
(rest alod) does not receive the files seen in the whole (first alod)
tree; it receives only the files in the individual directory
(first alod).  The files found deeper inside (first alod) are lost.

(define d2 (make-dir 'Home empty
		     (list (make-dir 'Papers empty
				     (list (make-dir 'Fall99 (list 'p1 'p4) empty)))
			   (make-dir 'Courses empty
				     (list (make-dir 'Huma101 (list 'p1) empty)
					   (make-dir 'Comp210 (list 'p2) empty))))))


What's the solution?  To write this program fully in
accumulator-style, we need to accumulate both the files we've seen AND
the directories remaining to process.  This way, we can accumulate
files while processing one subdirectory, then pass those along to the
next subdirectory.  We therefore need a program with two accumulators,
as shown below.

(define (shared-files? adir)
  (local [;accum-f1 : list of files seen so far
	  ;accum-t1 : list of trees left to process
	  (define (shared-accum a-dir accum-f1 accum-t1)
	    (or (overlap? (dir-files a-dir) accum-f1)
		(share/list-accum (dir-dirs a-dir)
				  (append (dir-files a-dir) accum-f1)
				  accum-t1)))
	  ;accum-f2 : list of files seen so far
	  ;accum-t2 : list of trees left to process
	  (define (share/list-accum alod accum-f2 accum-t2)
	    (cond [(empty? alod)
		   (cond [(empty? accum-t2) false]
			 [else (shared-accum (first accum-t2)
					     accum-f2
					     (rest accum-t2))])]
		  [else
		   (shared-accum (first alod) accum-f2
				 (append accum-t2 (rest alod)))]))]
   (shared-accum adir empty empty)))

Certainly, you could write this program without using accumulators
(this is a good practice problem for structural recursion on trees if
you need the practice).  We showed you this version as a challenge,
and to demonstrate that working with accumulators isn't always as easy
as what we've seen in class all week.
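If you want to run the two-accumulator version, here is a minimal
sketch.  Two pieces are assumptions, since the notes do not define
them: the dir structure (its field order is inferred from how d2 is
built, and the field name `name` is a guess) and the helper overlap?,
which we take to mean "do the two lists have an element in common?".
The directory d3 is a made-up variant of d2 with no repeated names.

```racket
;; ASSUMED structure definition: (make-dir name files dirs)
(define-struct dir (name files dirs))

;; overlap? : (listof symbol) (listof symbol) -> boolean
;; HYPOTHETICAL helper: does any element of l1 also appear in l2?
(define (overlap? l1 l2)
  (cond [(empty? l1) false]
        [(member (first l1) l2) true]
        [else (overlap? (rest l1) l2)]))

;; shared-files? : dir -> boolean
(define (shared-files? adir)
  (local [;; accum-f1 : list of files seen so far
          ;; accum-t1 : list of trees left to process
          (define (shared-accum a-dir accum-f1 accum-t1)
            (or (overlap? (dir-files a-dir) accum-f1)
                (share/list-accum (dir-dirs a-dir)
                                  (append (dir-files a-dir) accum-f1)
                                  accum-t1)))
          ;; accum-f2 : list of files seen so far
          ;; accum-t2 : list of trees left to process
          (define (share/list-accum alod accum-f2 accum-t2)
            (cond [(empty? alod)
                   (cond [(empty? accum-t2) false]
                         [else (shared-accum (first accum-t2)
                                             accum-f2
                                             (rest accum-t2))])]
                  [else
                   (shared-accum (first alod) accum-f2
                                 (append accum-t2 (rest alod)))]))]
    (shared-accum adir empty empty)))

;; d2 is the test case from the notes: p1 appears twice
(define d2
  (make-dir 'Home empty
            (list (make-dir 'Papers empty
                            (list (make-dir 'Fall99 (list 'p1 'p4) empty)))
                  (make-dir 'Courses empty
                            (list (make-dir 'Huma101 (list 'p1) empty)
                                  (make-dir 'Comp210 (list 'p2) empty))))))

;; d3 is hypothetical: same shape, but every file name is distinct
(define d3
  (make-dir 'Home empty
            (list (make-dir 'Papers empty
                            (list (make-dir 'Fall99 (list 'p1 'p4) empty)))
                  (make-dir 'Courses empty
                            (list (make-dir 'Huma101 (list 'p3) empty)
                                  (make-dir 'Comp210 (list 'p2) empty))))))

(shared-files? d2) ; => true
(shared-files? d3) ; => false
```

One thing to keep in mind with this sketch: each directory's files are
compared only against files seen earlier, so a name repeated inside a
single directory's own file list would not be reported.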