Lecture 25: Accumulating data

So far, our list programs have always built the final answer by
augmenting the answer that comes back from the recursion on the rest
of the list.  There is another style of writing recursive programs in
which we "accumulate" the final result in an extra input, then return
the accumulated input when we reach the end of the list.

For example, instead of writing

;; sum : list-of-number -> number
;; produces sum of numbers in the list
(define (sum alon)
  (cond [(empty? alon) 0]
        [(cons? alon) (+ (first alon) (sum (rest alon)))]))

we could write

;; sum-accum : list-of-number number -> number
;; produces sum of numbers in the list, added to total
(define (sum-accum alon total)
  (cond [(empty? alon) total]
        [(cons? alon) (sum-accum (rest alon) (+ (first alon) total))]))

To use sum-accum, we should initially send 0 as the value for total.
Ideally, we should create a helper that does that for us (so we match
the contract of the original sum).

;; sum2 : list-of-number -> number
;; produces sum of numbers in the list
(define (sum2 alon)
  (sum-accum alon 0))

Notice that all of the code from the original sum is still here, just
shuffled around a little bit.  The main changes are the answer in the
empty case and that we augment the accumulation input (here total)
instead of the result of the recursive call on the rest.
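To see where the work happens, it can help to hand-evaluate both
versions on a small list.  The sketch below repeats the definitions
from above so that it runs on its own (the #lang racket line is an
assumption about how you run it; leave it out in a teaching language),
and the comments trace the calls.  In sum, the additions are still
pending when the recursion reaches empty; in sum-accum, each addition
is finished before the next call is made.

```racket
#lang racket

;; sum : list-of-number -> number
;; produces sum of numbers in the list
(define (sum alon)
  (cond [(empty? alon) 0]
        [(cons? alon) (+ (first alon) (sum (rest alon)))]))

;; sum-accum : list-of-number number -> number
;; produces sum of numbers in the list, added to total
(define (sum-accum alon total)
  (cond [(empty? alon) total]
        [(cons? alon) (sum-accum (rest alon) (+ (first alon) total))]))

;; sum2 : list-of-number -> number
;; produces sum of numbers in the list
(define (sum2 alon)
  (sum-accum alon 0))

;; Hand evaluation of (sum (list 1 2 3)): additions pile up until empty
;;   (+ 1 (sum (list 2 3)))
;;   (+ 1 (+ 2 (sum (list 3))))
;;   (+ 1 (+ 2 (+ 3 (sum empty))))
;;   (+ 1 (+ 2 (+ 3 0)))
;;   6

;; Hand evaluation of (sum2 (list 1 2 3)): total carries the answer so far
;;   (sum-accum (list 1 2 3) 0)
;;   (sum-accum (list 2 3) 1)
;;   (sum-accum (list 3) 3)
;;   (sum-accum empty 6)
;;   6

(sum (list 1 2 3))   ; both expressions produce 6
(sum2 (list 1 2 3))
```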

Why would we want to do this?  It can be more efficient (we'll see an
example shortly), and it matches a style of programming that is common 
in other languages (for those of you planning to take another CS
course). 

Let's convert another example from earlier in the term to accumulator
style:

;; long-words : list-of-string -> list-of-string
;; produce list of strings from input list containing at least 5 characters
(define (long-words alos)
  (long-words-accum alos empty))

;; long-words-accum : list-of-string list-of-string -> list-of-string
;; accumulate long strings from first list in second list
(define (long-words-accum alos list-of-long)
  (cond [(empty? alos) list-of-long]
        [(cons? alos) 
         (cond [(>= (string-length (first alos)) 5)
                (long-words-accum (rest alos) (cons (first alos) list-of-long))]
               [else (long-words-accum (rest alos) list-of-long)])]))

If we run this, we see that the words come out in the reverse order
from where they were in the input list.  Why?  Because now we cons them on
BEFORE we process the rest of the list rather than AFTER (as we used
to do).  This is sometimes useful.
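A trace of the accumulator makes the reversal visible.  The sketch
below repeats long-words so that it runs on its own; the sample words
are made up for illustration, the test for "long" follows the purpose
statement (at least 5 characters), and the #lang racket line is an
assumption about how you run it.

```racket
#lang racket

;; long-words-accum : list-of-string list-of-string -> list-of-string
;; accumulate strings of at least 5 characters from first list in second list
(define (long-words-accum alos list-of-long)
  (cond [(empty? alos) list-of-long]
        [(cons? alos)
         (cond [(>= (string-length (first alos)) 5)
                (long-words-accum (rest alos) (cons (first alos) list-of-long))]
               [else (long-words-accum (rest alos) list-of-long)])]))

;; long-words : list-of-string -> list-of-string
;; produce list of strings from input list containing at least 5 characters
(define (long-words alos)
  (long-words-accum alos empty))

;; Trace of the accumulator on a sample list:
;;   (long-words-accum (list "accumulate" "the" "recursion" "helper") empty)
;;   (long-words-accum (list "the" "recursion" "helper") (list "accumulate"))
;;   (long-words-accum (list "recursion" "helper") (list "accumulate"))
;;   (long-words-accum (list "helper") (list "recursion" "accumulate"))
;;   (long-words-accum empty (list "helper" "recursion" "accumulate"))

(long-words (list "accumulate" "the" "recursion" "helper"))
;; produces (list "helper" "recursion" "accumulate")
```

Each long word is consed onto the front of the accumulator as soon as
it is seen, so the last long word in the input ends up first.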

Write an accumulator-style function reverse that consumes a list of
strings and reverses it.

;; reverse : list-of-string -> list-of-string
;; produce list of string with strings in opposite order from given list
(define (reverse alos)
  (rev-accum alos empty))

;; rev-accum : list-of-string list-of-string -> list-of-string
;; accumulates reverse of first list in second list
(define (rev-accum alos revalos)
  (cond [(empty? alos) revalos]
        [(cons? alos) (rev-accum (rest alos) (cons (first alos) revalos))]))

How would we have written this in non-accumulator style?  [Those who
finish the accumulator version early can work on this one.]

;; reverse2 : list-of-string -> list-of-string
;; produce list of string with strings in opposite order from given list
(define (reverse2 alos)
  (cond [(empty? alos) empty]
        [(cons? alos) (append (reverse2 (rest alos))
                              (list (first alos)))]))

Both versions give the same answer, but reverse2 is much more
expensive than reverse.  Why?  Because reverse "visits" each word
once, putting it onto the accumulator.  In contrast, reverse2 visits
each word once (as first), but append also has to walk over every word
in the reversed rest of the list to drop that first in at the end, so
each word gets walked over again once for every word that comes before
it.  In other words, if we counted stepper steps, the accumulator
version would take roughly as many big steps as the length of the
list, while the other version would need roughly the square of the
length of the list.  This is a significant difference computationally
(n steps versus n-squared steps): if you used a large enough list,
you'd actually see the difference in speed between these two versions.
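One rough way to check the n versus n-squared claim without the
stepper is to count steps directly.  The sketch below uses two
hypothetical helper functions (they are not part of the lecture's
code) that count the way the paragraph above does: one step per word
for the accumulator version, and for the append version one step per
word plus one step for every word that append has to walk past.  The
#lang racket line is an assumption about how you run it.

```racket
#lang racket

;; rev-accum-steps : list-of-string -> number
;; counts the steps of the accumulator version: one cons per word
(define (rev-accum-steps alos)
  (cond [(empty? alos) 0]
        [(cons? alos) (+ 1 (rev-accum-steps (rest alos)))]))

;; reverse2-steps : list-of-string -> number
;; counts the steps of the append version: one visit to the first word,
;; plus a walk by append over the (reversed) rest of the list
(define (reverse2-steps alos)
  (cond [(empty? alos) 0]
        [(cons? alos) (+ 1
                         (length (rest alos))
                         (reverse2-steps (rest alos)))]))

(define four-words (list "a" "b" "c" "d"))
(define eight-words (list "a" "b" "c" "d" "e" "f" "g" "h"))

(rev-accum-steps four-words)   ; 4
(reverse2-steps four-words)    ; 10 = 4 + (3 + 2 + 1 + 0)
(rev-accum-steps eight-words)  ; 8
(reverse2-steps eight-words)   ; 36 = 8 + (7 + 6 + ... + 0)
```

Doubling the length of the list doubles the first count but nearly
quadruples the second, which is the n-squared growth showing up.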

What have we seen about accumulator programs?

- They consist of two functions: a simple main function that calls a
  recursive helper.  The recursive helper takes an extra input, which is 
  the answer computed so far.

- The empty? case now returns the computed answer (the accumulator).

- The initial value for the answer accumulator input is the answer in
  the base case of our old-style functions.
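The three observations above can be applied mechanically to a new
function.  As a sketch, here is a product function (a made-up example,
not one from earlier in the term) in the old style and in accumulator
style; note that the accumulator starts at 1, the old base-case
answer.  The #lang racket line is an assumption about how you run it.

```racket
#lang racket

;; product : list-of-number -> number
;; produces product of numbers in the list (old style)
(define (product alon)
  (cond [(empty? alon) 1]
        [(cons? alon) (* (first alon) (product (rest alon)))]))

;; product-accum : list-of-number number -> number
;; produces product of numbers in the list, multiplied by so-far
(define (product-accum alon so-far)
  (cond [(empty? alon) so-far]   ; return the accumulated answer
        [(cons? alon)
         (product-accum (rest alon) (* (first alon) so-far))]))

;; product2 : list-of-number -> number
;; main function: starts the accumulator at the old base-case answer, 1
(define (product2 alon)
  (product-accum alon 1))

(product (list 2 3 4))   ; 24
(product2 (list 2 3 4))  ; 24
(product2 empty)         ; 1
```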

------------------------------------------------------------------

As a final exercise on this, let's rewrite our count-accessible
program on web pages in accumulator style.  Start with a program that
accumulates the urls that have been visited:

;; count-accumulate : webpage list-of-string -> list-of-string
;; produces visited extended with the URLs reachable from given page
(define (count-accumulate apage visited)
  (cond [(in-list? (webpage-url apage) visited) visited]
        [else (count-accumulate-links
               (webpage-links apage) (cons (webpage-url apage) visited))]))

;; count-accumulate-links : list-of-webpage list-of-string -> list-of-string
;; produces visited extended with the URLs reachable from the pages in a list
(define (count-accumulate-links alowp visited)
  (cond [(empty? alowp) visited]
        [(cons? alowp)
         (count-accumulate-links
          (rest alowp) (count-accumulate (first alowp) visited))]))

Finally, let the main function compute the number accessible by taking
the length of the accumulated list of URLs:

;; count-accessible2 : webpage -> number
;; produces number of accessible web pages from given page
(define (count-accessible2 apage)
  (length (count-accumulate apage empty)))

This is more challenging than what you'd be asked to do on the exam,
but it is a good exercise to see if you understand how accumulation
works on tree-like data.
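To try the accumulator version out, here is a sketch that runs on its
own.  It rests on several assumptions: the webpage structure is taken
to have exactly the two fields url and links, in-list? is written out
as a simple hypothetical helper, the four sample pages are made up,
and the #lang racket line reflects one way of running the code.  Two
of the sample pages link to the same page, so the visited list is what
keeps that page from being counted twice.

```racket
#lang racket

;; ASSUMPTION: a webpage is (make-webpage string list-of-webpage)
(define-struct webpage (url links))

;; in-list? : string list-of-string -> boolean
;; determines whether the string appears in the list (hypothetical helper)
(define (in-list? astr alos)
  (cond [(empty? alos) false]
        [(cons? alos) (or (string=? astr (first alos))
                          (in-list? astr (rest alos)))]))

;; count-accumulate : webpage list-of-string -> list-of-string
;; produces visited extended with the URLs reachable from given page
(define (count-accumulate apage visited)
  (cond [(in-list? (webpage-url apage) visited) visited]
        [else (count-accumulate-links
               (webpage-links apage) (cons (webpage-url apage) visited))]))

;; count-accumulate-links : list-of-webpage list-of-string -> list-of-string
;; produces visited extended with the URLs reachable from the pages in a list
(define (count-accumulate-links alowp visited)
  (cond [(empty? alowp) visited]
        [(cons? alowp)
         (count-accumulate-links
          (rest alowp) (count-accumulate (first alowp) visited))]))

;; count-accessible2 : webpage -> number
;; produces number of accessible web pages from given page
(define (count-accessible2 apage)
  (length (count-accumulate apage empty)))

;; Made-up sample pages: a links to b and c, and both b and c link to d
(define page-d (make-webpage "d.html" empty))
(define page-b (make-webpage "b.html" (list page-d)))
(define page-c (make-webpage "c.html" (list page-d)))
(define page-a (make-webpage "a.html" (list page-b page-c)))

(count-accumulate page-a empty)
;; produces (list "c.html" "d.html" "b.html" "a.html")

(count-accessible2 page-a)  ; 4: d.html is counted once, and a.html itself is included
```

As written, the starting page's own URL goes into visited, so it is
part of the count.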