CS 2135: Programming Language Concepts
Notes on Delayed Substitution

This page summarizes the lecture on delayed substitution.

Motivation

Our original interpreter (eval) handles function calls by substituting arguments for parameters in the body of the function. Consider the following Curly expression:

{{proc {x}
   {{proc {y}
      {{proc {z}
         {return x + y + z}}
       3}}
    4}}
 5}

How many times will subst traverse {return x + y + z} while evaluating this expression? Three -- once for each function call. How many times will {return x + y + z} be evaluated? Once -- when the final substitution (for z) is finished.
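
The counts above can be checked with a minimal sketch of a substitution-based interpreter. This is not the course's original interpreter: the struct definitions are inferred from the make-... constructors used later in these notes, numbers stand for themselves, x + y + z is written as two nested plus nodes, and the two counters are hypothetical additions for illustration only.

```racket
#lang racket
;; Assumed abstract syntax, inferred from the make-... constructors used in
;; these notes. Numbers stand for themselves.
(define-struct var (name))
(define-struct plus (left right))
(define-struct proc (param body))
(define-struct apply (func arg))

;; Hypothetical counters (not part of the course interpreter): how many plus
;; nodes subst walks over, and how many plus nodes eval actually evaluates.
(define subst-plus-visits 0)
(define eval-plus-visits 0)

;; subst : expr symbol value -> expr
;; Replaces free occurrences of name in expr with val.
(define (subst expr name val)
  (cond [(number? expr) expr]
        [(var? expr) (if (symbol=? (var-name expr) name) val expr)]
        [(plus? expr)
         (set! subst-plus-visits (add1 subst-plus-visits))
         (make-plus (subst (plus-left expr) name val)
                    (subst (plus-right expr) name val))]
        [(proc? expr)
         ;; an inner parameter with the same name shadows the outer one
         (if (symbol=? (proc-param expr) name)
             expr
             (make-proc (proc-param expr)
                        (subst (proc-body expr) name val)))]
        [(apply? expr)
         (make-apply (subst (apply-func expr) name val)
                     (subst (apply-arg expr) name val))]))

;; eval : expr -> value
(define (eval expr)
  (cond [(number? expr) expr]
        [(var? expr) (error 'eval "unbound variable: ~a" (var-name expr))]
        [(plus? expr)
         (set! eval-plus-visits (add1 eval-plus-visits))
         (+ (eval (plus-left expr)) (eval (plus-right expr)))]
        [(proc? expr) expr]
        [(apply? expr)
         (let ([f (eval (apply-func expr))]
               [arg (eval (apply-arg expr))])
           (eval (subst (proc-body f) (proc-param f) arg)))]))

;; The Curly expression above, with x + y + z as two nested plus nodes.
(define body
  (make-plus (make-plus (make-var 'x) (make-var 'y)) (make-var 'z)))
(define example
  (make-apply
   (make-proc 'x (make-apply
                  (make-proc 'y (make-apply (make-proc 'z body) 3))
                  4))
   5))
(define result (eval example))
```

Under these assumptions, result is 12. The body contains two plus nodes, so the three traversals by subst show up as six visits in subst-plus-visits, while eval-plus-visits ends at two, which is one evaluation of the body.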

This suggests an inefficiency in our interpreter: we substitute more often than we evaluate. We could improve on this situation by delaying substitutions until a variable is actually encountered during evaluation. Implementing this requires an additional data structure that associates variables with their values:

(define-struct dsub (var value))

It also requires that we change the contract on eval to take a list of substitutions as an argument.

eval : expr list[dsub] -> value

A list of dsubs is called an environment.

Handling dsubs requires two changes to eval: the var? case must look up values in the environment, and somewhere we must add new dsubs to the environment. The original interpreter code performed subst in the apply? case, so it makes sense that we would create new dsubs in this case. Specifically, the apply? case must add a new dsub to the environment before processing the body.

Making these two changes, but leaving the rest of the interpreter alone, yields our first version of a dsub interpreter.
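
A minimal sketch of what such a first version might look like follows. It is an illustration, not the course's actual code: the struct definitions are inferred from the constructors used in these notes, numbers stand for themselves, and the bodies of empty-env, extend-env and lookup-env are assumptions. Each dsub stores a var struct, as in the trace under Questions.

```racket
#lang racket
;; Assumed abstract syntax; numbers stand for themselves.
(define-struct var (name))
(define-struct plus (left right))
(define-struct proc (param body))
(define-struct apply (func arg))
(define-struct dsub (var value))

;; Assumed environment helpers: an environment is a list of dsubs.
(define (empty-env) empty)
(define (extend-env name val env)
  (cons (make-dsub (make-var name) val) env))
(define (lookup-env name env)
  (cond [(empty? env) (error 'lookup-env "unbound variable: ~a" name)]
        [(symbol=? (var-name (dsub-var (first env))) name)
         (dsub-value (first env))]
        [else (lookup-env name (rest env))]))

;; eval : expr list[dsub] -> value
(define (eval expr env)
  (cond [(number? expr) expr]
        ;; change 1: variables are looked up instead of signalling an error
        [(var? expr) (lookup-env (var-name expr) env)]
        [(plus? expr)
         (+ (eval (plus-left expr) env) (eval (plus-right expr) env))]
        [(proc? expr) expr]
        ;; change 2: the call adds a dsub instead of calling subst,
        ;; extending whatever environment is current at the call
        [(apply? expr)
         (let ([f (eval (apply-func expr) env)]
               [arg (eval (apply-arg expr) env)])
           (eval (proc-body f)
                 (extend-env (proc-param f) arg env)))]))

;; A small call: the proc adds 1 to its argument.
(define small
  (eval (make-apply (make-proc 'x (make-plus (make-var 'x) 1)) 2)
        (empty-env)))
```

In this sketch, small is 3: the body x + 1 is evaluated once, with x found in the environment rather than substituted into the body.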

This new interpreter should yield the same answers as the original interpreter. Consider the following expression:

(make-apply
 (make-proc 'x (make-apply
                (make-proc 'f (make-apply
                               (make-proc 'x (make-apply (make-var 'f) 10))
                               5))
                (make-proc 'y (make-plus (make-var 'x) (make-var 'y)))))
 3)

The interpreter with subst yields 13, but the new interpreter returns 15. What's wrong?

The dsub interpreter looks up variable values in the current environment. This means that when a variable is encountered, its value is taken to be the most recent value for that variable. But this is dynamic scoping! Our interpreter is supposed to implement static scoping. Somehow, we've changed the scoping rules with the use of delayed substitutions.

The problem is evident in the code of the two interpreters. Subst goes inside the body of procs to replace variable values, but the dsub interpreter doesn't enter the body of a proc until the proc is called. To properly delay substitution inside procs, we need to remember the environment that was in effect when we encountered the proc and use it to perform substitution in the body of the proc. Thus, we need a new data structure that associates procs and environments. That data structure is called a closure:

(define-struct closure (proc env))

Eval must change in two places to properly use closures: the proc? case must return a closure instead of a proc, and the apply? case will now get a closure, rather than a proc, as the apply-func value. See the corrected interpreter code for the details. Note that do-apply no longer takes the current env as the environment, as the env it extends comes from inside the closure.
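
The two changes can be sketched as follows. This is a hedged illustration rather than the corrected interpreter code itself: the struct definitions, the environment helpers, and the exact signature of do-apply are assumptions, and numbers stand for themselves.

```racket
#lang racket
;; Assumed abstract syntax; numbers stand for themselves.
(define-struct var (name))
(define-struct plus (left right))
(define-struct proc (param body))
(define-struct apply (func arg))
(define-struct dsub (var value))
(define-struct closure (proc env))

;; Assumed environment helpers: an environment is a list of dsubs.
(define (empty-env) empty)
(define (extend-env name val env)
  (cons (make-dsub (make-var name) val) env))
(define (lookup-env name env)
  (cond [(empty? env) (error 'lookup-env "unbound variable: ~a" name)]
        [(symbol=? (var-name (dsub-var (first env))) name)
         (dsub-value (first env))]
        [else (lookup-env name (rest env))]))

;; eval : expr list[dsub] -> value
(define (eval expr env)
  (cond [(number? expr) expr]
        [(var? expr) (lookup-env (var-name expr) env)]
        [(plus? expr)
         (+ (eval (plus-left expr) env) (eval (plus-right expr) env))]
        ;; change 1: remember the environment in effect at the proc
        [(proc? expr) (make-closure expr env)]
        ;; change 2: the function position now evaluates to a closure
        [(apply? expr)
         (do-apply (eval (apply-func expr) env)
                   (eval (apply-arg expr) env))]))

;; do-apply : closure value -> value (assumed signature)
;; Extends the environment saved in the closure, not the caller's.
(define (do-apply clos arg)
  (let ([p (closure-proc clos)])
    (eval (proc-body p)
          (extend-env (proc-param p) arg (closure-env clos)))))

;; The expression that exposed the scoping problem.
(define scoping-example
  (make-apply
   (make-proc 'x (make-apply
                  (make-proc 'f (make-apply
                                 (make-proc 'x (make-apply (make-var 'f) 10))
                                 5))
                  (make-proc 'y (make-plus (make-var 'x) (make-var 'y)))))
   3))
(define answer (eval scoping-example (empty-env)))
```

In this sketch, answer is 13, matching the subst interpreter: the closure for {proc {y} ...} was built while x was 3, so the later binding of x to 5 is not visible inside its body.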

Closures are one of the most important concepts in this course. Whenever you have the ability to nest function or (object-oriented) class definitions, you need closures to get static scoping. When a language claims to provide "first-class functions" or "first-class classes" (meaning functions and classes that can be returned from and passed to other functions, stored in data structures, etc.), make sure it implements a form of closures. Otherwise, you will have to construct the closures by hand to achieve static scoping.

Questions

Do we have to worry about multiple uses of the same parameter name?

The implementations of extend-env and lookup-env can (and must) handle this issue. Consider the following program:
```
(make-apply 
 (make-proc 'x (make-apply 
                (make-proc 'x (make-var 'x))
                3))
 4)          
```
This program uses x twice, and should return 3 (the value of the innermost x).

Let's trace a call to eval on this program:
```
(eval (make-apply 
       (make-proc 'x (make-apply 
                      (make-proc 'x (make-var 'x))
                      3))
       4)
      (empty-env))

=>

(eval (make-apply 
       (make-proc 'x (make-var 'x))
       3)
      (list (make-dsub (make-var 'x) 4)))

=>

(eval (make-var 'x)
      (list (make-dsub (make-var 'x) 3) (make-dsub (make-var 'x) 4)))
```
Notice how the value of x that we really want is earlier in the environment (since we implemented extend-env with cons). Thus, as long as lookup-env returns the first value it finds for a variable, we will get the most recent binding, as desired. So, extend-env and lookup-env have to cooperate to implement multiple bindings for the same variable properly, but it isn't hard to do.
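
One possible pair of definitions that cooperates in this way is sketched below. The function bodies are assumptions, since these notes do not show the course's own definitions; the dsub contents follow the trace above, with each dsub holding a var struct.

```racket
#lang racket
(define-struct var (name))
(define-struct dsub (var value))

;; empty-env : -> list[dsub]
(define (empty-env) empty)

;; extend-env : symbol value list[dsub] -> list[dsub]
;; cons puts the newest binding at the front of the environment.
(define (extend-env name val env)
  (cons (make-dsub (make-var name) val) env))

;; lookup-env : symbol list[dsub] -> value
;; Returns the first (most recent) value bound to name.
(define (lookup-env name env)
  (cond [(empty? env) (error 'lookup-env "unbound variable: ~a" name)]
        [(symbol=? (var-name (dsub-var (first env))) name)
         (dsub-value (first env))]
        [else (lookup-env name (rest env))]))

;; The environment from the last step of the trace: x is bound to 4 first,
;; then to 3.
(define traced-env (extend-env 'x 3 (extend-env 'x 4 (empty-env))))
(define innermost-x (lookup-env 'x traced-env))
```

Here innermost-x is 3, because lookup-env stops at the first matching dsub and never reaches the older binding of x to 4.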

This page maintained by Kathi Fisler
Department of Computer Science Worcester Polytechnic Institute