/The following is a reformatting of the script from which the 2007 talk was
delivered. The left-hand column shows when to advance to the next slide (or,
in one solitary case, when to back up to the previous slide; "+" to advance,
"-" to back up). The slide titles are interspersed with the dialog on the
right, in brackets [], as a sanity check on position in the talk. Some
subtleties of delivery are suggested by "underlining" (mild emphasis, delimited
by underscores) or "boldfacing" (strong emphasis, delimited by asterisks); and
by putting vertical space before some slide-advances but not others.
--- John N. Shutt,
September 21, 2015/
. [slide: title]
I've developed a family of calculi called *vau calculi* for formal reasoning
about fexprs. That symbol at the beginning of the title is how I've chosen
to typeset a lower-case letter vau. It's a pre-classical Greek letter
related to Roman F, as in "FEXPR".
+ [slide: outline]
The _factual conclusion_ of this talk is that it's _possible_ to reason
formally about fexprs. But the _unifying theme_ of the talk is a
reassessment of which parts of Lisp are modeled by which parts of
lambda-calculus.
+
I'll start by explaining what fexprs are, and suggest why they may be
important.
+
I'll explain why the _traditional_ way of modeling Lisp with lambda-calculus
makes it impossible to usefully model fexprs;
+
and then I'll show a _non_-traditional way of modeling Lisp, with a
lambda-_like_ calculus, that *can* usefully model fexprs.
+
I'll say a few words about the implied non-traditional relationship between
lambda of Lisp and lambda of lambda-calculus.
+
And then I'll wrap up.
+ [slide: terminology]
First some terminology for Lisp evaluation.
+
A pair that has been _committed_ to evaluation, but not yet evaluated, is a
*combination*.
+
The first element of a combination is the *operator*; and in the usual case
that the combination is a list, the rest of the list elements are called
*operands*.
+
When the operands are all evaluated, the resulting values are called
*arguments*.
All these terms so far are from the Wizard Book, Abelson and Sussman, so
they're pretty standard, at least for Scheme. There's no *general* Scheme
term for the result of evaluating an operator, because in Scheme, not all
operators are evaluated: special form operators are not.
+
So let's say that a _combiner_ is the action 'designated' by an operator,
whether by the operator evaluating to it, or by the operator being a reserved
word for it.
We can partition combiners into two kinds: those that act on their
arguments, which is what Scheme procedures do,
+
and those that act on their *operands*, which special forms do.
+
A combiner that acts on its operands is
+
*operative*, and, by slightly skewed symmetry, a combiner that acts on its
arguments is
+
*applicative*. (As in 'apply'.)
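[The applicative/operative split can be sketched operationally. The following toy Python evaluator is a hypothetical illustration, not part of the talk: a combiner that acts on arguments (evaluated operands) versus one that acts on the operands themselves.]

```python
# Toy sketch (hypothetical encoding and combiner table, for illustration):
# a combination is a list whose first element is the operator.
def evaluate(expr):
    if isinstance(expr, list):                    # a combination
        operator, operands = expr[0], expr[1:]
        combiner = combiners[operator]
        if combiner["applicative"]:
            arguments = [evaluate(o) for o in operands]   # evaluate operands
            return combiner["act"](arguments)             # act on arguments
        return combiner["act"](operands)          # operative: operands as-is
    return expr                                   # a self-evaluating datum

combiners = {
    "+":     {"applicative": True,  "act": sum},
    "quote": {"applicative": False, "act": lambda opnds: opnds[0]},
}

assert evaluate(["+", 2, ["+", 1, 2]]) == 5           # applicative
assert evaluate(["quote", ["+", 1, 2]]) == ["+", 1, 2]  # operative
```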
Now. Fexprs have two defining characteristics.
+ [slide: fexprs]
They're _first-class operatives_ --- so they can be passed as arguments (or
operands), returned as values of general expressions, and so on. In a Lisp
that supports first-class operatives, the interpreter can't rely on a fixed
set of reserved operators to identify all operative combinations; so it can't
know, in general, whether to evaluate the operands until after it evaluates
the operator, to see whether the resulting combiner is applicative rather
than operative.
Also, to qualify as fexprs,
+
the first-class operatives have to act by evaluating their bodies. Ordinary
Lisp applicatives work this way; so fexpr constructors naturally tend to look
similar to lambda, the Lisp constructor of applicatives.
In contrast, _macros_ *don't* work this way. A macro doesn't _evaluate_
its body; it *expands* its body, which is a qualitatively different kind of
transformation from Lisp evaluation, one that acts indirectly on computation
through a logically separate phase of processing.
Fexprs are a more direct and integrated alternative to macros in Lisp.
+ They existed in Lisp before macros were added to the language in the early
1960s, and coexisted with macros until about 1980. At *that* time, fexprs
were abandoned in favor of macros, despite the straightforwardness advantage
to fexprs, because fexprs were _severely_ ill-behaved. I believe that the
_worst_ ill-behavedness of fexprs _then_ was largely spill-over from the
ill-behavedness of dynamic scope, which was itself banished from Lisp a few
years later. Much more than that ---and this speaks to why fexprs may be
important--- I also believe that fexprs, *when handled properly* in a
language design, are very near to some sort of theoretical maximum of
abstractive power:
+
they subsume Scheme's first-class procedures, which in turn are known to
subsume Actors, and thus, broadly, the Object-Oriented Paradigm; and they're
more powerful than macros, easier to use, and _potentially_, less
error-prone, combining _surgical changes_ to the language evaluator with the
_elegance_ of first-class procedures. In attempting to realize this
abstractive potential, one of the first things to do is to clear
fexprs of their most damning ill-behavedness charge, that it's impossible to
construct a nontrivial formal semantic theory of fexprs.
Which brings me back to calculi, and why lambda-calculus _hasn't_ been used
for reasoning about fexprs.
+ [slide: lambda-calculus]
Here's a fairly standard development of lambda-calculus.
+ A term is either
+ a variable,
+ or a constant,
+ or a combination,
+ or a lambda abstraction.
There's just one reduction schema,
+ called the beta rule, that says how to apply a lambda-abstraction to an
argument. It simply substitutes
+ the argument
+ for the parameter
+ into the body of the abstraction.
+ The reduction step relation of the calculus is the *compatible closure* of
the reductions specified by the schema.
+ Compatibility _means_, by definition,
+ that if you can reduce T1 when it's a standalone term, you can also reduce
it when it occurs as a subterm in any larger context C.
Reduction also has the useful property that it _implies_
+ *contextual equivalence*; that is,
+ T1, _reducible_ to T2, is *interchangeable* with T2 in _any_ context,
without changing _any_ observable result.
Contextual equivalence is an important part of the _point_ of the exercise.
If we can show, for example, that two Lisp expressions are modeled by
contextually equivalent calculus terms, then an optimizing compiler can
replace one Lisp expression with the other. And reduction implies contextual
equivalence; so we can use reduction to reach these useful conclusions.
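[The beta rule and its compatible closure can be sketched as follows; a toy Python encoding, not part of the talk, with naive capture-unsafe substitution, which suffices for the closed terms shown.]

```python
# Hypothetical tuple encoding: ("var", x), ("lam", x, body), ("app", f, a).
def subst(term, var, val):
    kind = term[0]
    if kind == "var":
        return val if term[1] == var else term
    if kind == "lam":
        _, p, body = term
        return term if p == var else ("lam", p, subst(body, var, val))
    if kind == "app":
        return ("app", subst(term[1], var, val), subst(term[2], var, val))
    return term                       # constants are unaffected

def step(term):
    """One reduction step anywhere in the term, or None if none applies."""
    if term[0] == "app":
        f, a = term[1], term[2]
        if f[0] == "lam":             # beta redex at the root
            return subst(f[2], f[1], a)
        r = step(f)                   # compatibility: reduce inside subterms
        if r is not None:
            return ("app", r, a)
        r = step(a)
        if r is not None:
            return ("app", f, r)
    if term[0] == "lam":
        r = step(term[2])
        if r is not None:
            return ("lam", term[1], r)
    return None

# the redex reduces standalone, and also as a subterm of a larger context C
redex = ("app", ("lam", "x", ("var", "x")), ("const", "c"))
assert step(redex) == ("const", "c")
assert step(("app", ("var", "f"), redex)) == ("app", ("var", "f"), ("const", "c"))
```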
When McCarthy invented Lisp, about 1960, he wanted a device
+ for writing an anonymous function as an algebraic expression, and he chose to
call it "lambda", after the notation in lambda-calculus. And this has led to
the *perception* that lambda of Lisp is a notation for lambda of
lambda-calculus. So a Lisp expression like
+
this, compiles into lambda-calculus something like
+
this. I say 'compiles'; what I mean precisely is that the rewriting
behavior of this lambda-calculus term is taken to be a mathematical model
of what happens when a Lisp interpreter evaluates
- this Lisp expression.
+
The 'compilation' maps Lisp lambda to calculus lambda, and otherwise,
Lisp _pairs_ to calculus _combinations_, and Lisp _symbols_ to calculus
_variables_. So all Lisp expressions are implicitly to be evaluated, and
term reduction is _just_ a mathematical model of expression evaluation. And
this *choice* of mathematical model causes a problem with introducing fexprs
_into the modeling calculus_. For anyone familiar with Mitch Wand's 1998
paper "The Theory of Fexprs is Trivial", that paper is about this problem.
+
To demonstrate the problem, minimally, suppose the lambda-calculus is
extended to include an operative QUOTE, that causes its operand not to be
evaluated ---a behavior that would be readily available if we had fexprs---
+
and consider when two terms are equivalent in _this_ context. We need
reduction to imply contextual equivalence, so that we can _deduce_ useful
equivalences. But our choice of mathematical model equates reduction with
evaluation; and if T1 *evaluates* to T2, that certainly *doesn't* imply that
'quote T1' _evaluates_ to 'quote T2',
+
_nor_ that 'quote T1' is _interchangeable_ with 'quote T2'.
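[The collision can be made concrete with a toy evaluator, a hypothetical Python sketch not from the talk, in which reduction is read as evaluation: T1 evaluates to T2's value, yet quote tells them apart.]

```python
# Hypothetical minimal evaluator with a QUOTE operative.
def evaluate(expr):
    if isinstance(expr, list) and expr[0] == "quote":
        return expr[1]                     # operand returned unevaluated
    if isinstance(expr, list) and expr[0] == "+":
        return evaluate(expr[1]) + evaluate(expr[2])
    return expr                            # numbers are self-evaluating

t1 = ["+", 2, 3]
t2 = 5
assert evaluate(t1) == evaluate(t2)        # T1 "reduces" (evaluates) to T2...
assert evaluate(["quote", t1]) != evaluate(["quote", t2])   # ...but quote
                                           # distinguishes them observably
```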
+
My solution is to construct a calculus around a strategy of *explicit
evaluation*.
+
That is, evaluation is a *special case* of term reduction, that has to be
catalyzed by some explicit local syntax. So
+ 'T1' is not the same thing as
+ 'evaluate T1', and
+ 'T1 reduces to T2' is not the same thing as
+ 'T1 evaluates to T2'. An operative such as QUOTE
+
doesn't have to *suppress* reduction of its operands,
+
because reduction of its operands can only reduce them to normal forms _for
the operands_; those normal forms will not then be evaluated, so long as the
operative doesn't generate explicit instructions to do so. *Applicatives*
have to generate explicit instructions, to evaluate their operands.
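[The explicit-evaluation strategy can be sketched like this; a hypothetical Python encoding, not part of the talk, in which a reduction step fires only at an explicit eval node, so a pair not under eval is passive data and stays that way.]

```python
# Hypothetical encoding: ("eval", T) is an explicit request to evaluate T.
def step(t):
    if isinstance(t, tuple):
        if t[0] == "eval" and isinstance(t[1], int):
            return t[1]                       # a self-evaluating term
        for i, sub in enumerate(t):           # compatible closure: recurse
            r = step(sub)
            if r != sub:
                return t[:i] + (r,) + t[i + 1:]
    return t                                  # no redex: the term is inert

assert step(("eval", 5)) == 5                 # evaluation is a reduction step
operand = ("pair", "+", ("pair", 2, ("pair", 3, "nil")))
assert step(operand) == operand               # no "eval" node: nothing to do
```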
To bring this technique to bear on the study of Lisp, we alter our choice
of mathematical model. Remember, we used to model _interpretation_ of a Lisp
expression,
+
such as this, by the behavior of a lambda-calculus term constructed by
mapping Lisp lambda to calculus lambda and so on, like
+ this; but in the new approach, we'll model interpretation of the Lisp
expression *explicitly*, like
+ this --- for
+ suitable environment E. The behavior of this term, in our new calculus,
should tell us _just_ what is supposed to happen when a Lisp interpreter
evaluates this expression. In essence, the old approach
+ was trying to model the execution of a pre-compiled program, and got into
trouble modeling fexprs because fexprs can really only be partially compiled;
+ whereas the new approach explicitly includes the partial-compilation in the
model.
Notice the environment in this notation.
+
Environments are necessary for practical use of fexprs, so a calculus for
usefully reasoning about fexprs in practice has to include environments.
Vau-calculus terms have lots of environment subterms, like the one here;
whereas lambda-calculus doesn't have environments at all. The connection
between vau-calculi and lambda-calculi, though, is startlingly concrete, and
I overlooked this for several years of work with vau-calculi --- until summer
of last year, when I happened to construct a vau-calculus *without*
environments.
So
+ [slide: vau-x-calculus]
here is a vau calculus without environments. This will be several times
bigger than lambda-calculus; remember that this calculus has to contain, in
effect, a complete Lisp interpreter --- minus, in this case, symbols and
environments.
First, the syntax.
+ A *self-evaluating* term
+ is either a constant,
+ a vau abstraction ---that's an operative---,
+ or either of two _parameter-less_ operative abstractions, which we'll use to
parse operand lists; we'll see how these work in a moment.
+ A *term*
+ is either a variable,
+ a self-evaluating term,
+ a pair; nil, the empty list, is a constant, which is already covered under
self-evaluating terms, so we don't have to list nil separately;
+ an applicative; that's just a wrapper to induce operand evaluation, around
another stand-alone term;
+ a request to evaluate another term,
+ or a request to call a combiner with an operand list.
Now the schemata.
There is no schema for evaluating a variable, because a request to evaluate a
variable,
+ like this, isn't _supposed_ to be reducible; it's just a commitment to
perform an evaluation later, once the variable has been substituted for.
+
+ A self-evaluating term evaluates to itself.
+ To evaluate a pair, evaluate the operator and call the resulting combiner
with the unevaluated operand list; we _don't_ commit to evaluating the
operands at this point, because the combiner might later turn out to be
operative.
+ To evaluate an applicative, evaluate its interior. There's nothing deep in
this; it just means applicatives are unencapsulated compound data structures,
rather than abstractions.
+ A vau-zero abstraction parses nil, by dropping it and moving on;
+ and a vau-two abstraction parses a pair, by currying.
+ The heavy logic is in the last two schemata, which handle calls to
applicatives and vau abstractions.
+ An applicative is just a wrapper to induce operand evaluation, around
an underlying combiner to act on the arguments; so to call an applicative,
+ remove the wrapper, and commit to evaluating the operands.
And finally,
+ to combine a vau abstraction with an operand,
+ substitute the operand for the parameter into the body of the abstraction.
This last schema should look familiar. It's called the beta rule, and it is
*exactly* the sole schema of ordinary lambda-calculus --- just
+ expressed with different superficial syntax.
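[Several of these schemata can be sketched operationally in Python; a hypothetical tuple encoding, not from the talk, with naive substitution, and with the vau-zero and vau-two list-parsing schemata omitted for brevity.]

```python
# Hypothetical tags: "eval", "combine", "pair", "wrap" (applicative), "vau".
def subst(t, var, val):
    if t == ("var", var):
        return val
    if isinstance(t, tuple):
        return tuple(subst(x, var, val) for x in t)
    return t

def step(t):
    """Apply one schema at the root of the term, or return None."""
    if not isinstance(t, tuple):
        return None
    if t[0] == "eval":
        body = t[1]
        if isinstance(body, int):                     # a self-evaluating term
            return body                               # evaluates to itself
        if isinstance(body, tuple) and body[0] == "pair":
            _, op, opnds = body                       # evaluate the operator;
            return ("combine", ("eval", op), opnds)   # operands left alone
    if t[0] == "combine" and isinstance(t[1], tuple):
        if t[1][0] == "wrap":                         # applicative: unwrap,
            return ("combine", t[1][1], ("eval", t[2]))  # commit to evaluation
        if t[1][0] == "vau":                          # the beta rule
            _, x, body = t[1]
            return subst(body, x, t[2])
    return None

assert step(("eval", 5)) == 5
assert step(("eval", ("pair", "f", "nil"))) == ("combine", ("eval", "f"), "nil")
assert step(("combine", ("vau", "x", ("var", "x")), ("pair", 1, 2))) == ("pair", 1, 2)
```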
Most of what's important about vau-x-calculus comes down to this. The
subset of vau-x-calculus that uses only variables, constants, combines, and
vau abstractions
+ is exactly lambda-calculus. Once you realize that, it's easy to prove
that vau-x is Church-Rosser and such, and that vau-x is, technically, a
*conservative extension* of lambda-calculus --- the latter being a corollary
of the stronger result, _in this case_,
+ that reduction of lambda-calculus terms is just the same in vau-x-calculus
as it is in lambda-calculus.
In the big picture,
+
vau abstractions are operatives,
+
and they're also just a different notation for lambda abstractions,
therefore lambda abstractions are operatives. The syntax for *specifying*
operand evaluations isn't even in the lambda-calculus subset; there'd be no
use for it, because the only things that evaluate nontrivially are symbols
and pairs, and lambda-calculus doesn't have either. Lambda-calculus consists
of terms in which all symbols have already been converted to variables, and
all pairs have already been converted to combinations.
+
So lambda-calculus is a calculus *of* _operatives_, _not_ applicatives.
We got here by first adopting a perspective in which Lisp lambda isn't
modeled directly by calculus lambda; and now we've found that calculus lambda
constructs operatives, that don't evaluate their operands, whereas Lisp
lambda constructs applicatives, that do evaluate their operands. Why, then,
does it make sense to name the Lisp constructor of applicatives after the
calculus constructor of operatives?
+ [slide: simulation]
Well, it turns out, it's because of the way evaluating source code differs
from executing a pre-compiled image.
+
Take a typical term in lambda-calculus. With suitable delta-rules for
plus and times, this reduces to an integer in three steps.
Plus two three is five,
+
substitute five into the body,
+
and times five five is twenty-five.
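[The three steps can be replayed in a toy reducer; a hypothetical Python encoding, not from the talk, with delta rules for plus and times, taking innermost redexes first so the steps land in the order just described.]

```python
# Hypothetical encoding as in a plain lambda-calculus sketch,
# extended with ("+", a, b) and ("*", a, b) delta redexes.
def subst(t, var, val):
    if t == ("var", var):
        return val
    if isinstance(t, tuple):
        return tuple(subst(x, var, val) for x in t)
    return t

def step(t):
    if isinstance(t, tuple):
        for i, sub in enumerate(t):              # innermost redexes first
            r = step(sub)
            if r != sub:
                return t[:i] + (r,) + t[i + 1:]
        if t[0] in ("+", "*") and isinstance(t[1], int) and isinstance(t[2], int):
            return t[1] + t[2] if t[0] == "+" else t[1] * t[2]   # delta rules
        if t[0] == "app" and isinstance(t[1], tuple) and t[1][0] == "lam":
            return subst(t[1][2], t[1][1], t[2])                 # beta rule
    return t

term = ("app", ("lam", "x", ("*", ("var", "x"), ("var", "x"))), ("+", 2, 3))
for _ in range(3):      # plus, then beta substitution, then times
    term = step(term)
assert term == 25
```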
+
Now,
+ this is an _active_ term; in our vau-x notation it would be written like
+ this; but if you were writing Lisp source code, you'd write it
+ like this, and this isn't an active term. It's a passive data structure.
You don't expect the data structure to reduce to twenty-five; you expect
+
*evaluating* the data structure, in a standard environment,
+ to reduce to twenty-five.
Well, as the first step of the reduction,
+
eval of a pair reduces to combine of eval of the operator with the operand
list.
+
So.
The combiner subterm
+ is a call to lambda, which reduces, in three steps,
+ to this. This combiner does all the
+ evaluation,
+ substitution,
+ and parsing needed --- whereas, in the original _lambda-calculus_ expression,
+
there _wasn't_ any evaluation or parsing needed, because those things had
already been done during compilation from Lisp to lambda-calculus. Lisp
lambda expressions have to evaluate to applicatives so that they can
*simulate*, through source code evaluation, the much simpler behavior of
calculus lambda.
+ (Continuing the reduction:)
Out at the top level, we're calling an applicative, so unwrap it and
evaluate the operand.
+
And by the way, even though fexprs are the motivation for all of this work,
+ *this* is the only bona fide fexpr in this talk: it *doesn't* evaluate its
operand (which in this case is the _result_ of an evaluation), and it
+ *does* evaluate its body. The moral is, that Lisp lambda constructs fexprs
with applicative wrappers around them.
Getting back to the reduction,
+ the evaluated body
can't be reduced to an integer, but it can be reduced part way, by five
reduction steps. That's compatibility at work.
+
Vau-two curries the argument list,
+
and vau substitutes five for X into its body.
+
The inner redex, combine times five five, reduces to 25
+
and vau-zero matches the rest of the argument list.
+
If you count it up, that's twenty vau-calculus reduction steps to simulate
three lambda-calculus reduction steps. That doesn't necessarily say
anything, one way or another, about the efficiency of *actual computation*,
which would involve the sizes of the steps, and which steps are performed by
a compiler prior to runtime; but because a calculus like this is a _tool for
formal reasoning_, the larger number of steps *is* a reflection of the fact
that there's a lot more to reason about --- the whole evaluation process, in
fact.
+ [slide: conclusion]
So.
+ In the traditional use of lambda-calculus to model Lisp, since evaluation is
implicit,
+ evaluation and reduction are indistinguishable; therefore,
+ fexprs can only prevent operand evaluation by preventing operand reduction,
which prevents reduction from implying contextual equivalence.
+ Explicit evaluation
+ makes term evaluation a _special case_ of term reduction;
+ so an operand can be reduced without being evaluated, and fexprs can be
reconciled with nontrivial theory.
+ The resulting vau-calculus
+ still contains lambda-calculus embedded within it (more or less disguised,
depending on whether you include environments).
+ And the nature of the embedding suggests that lambda-calculus is naturally
understood *as* a calculus of operatives. It was never necessary to add
operatives to lambda-calculus, because they were already there; what needed
to be added was *applicatives*, and with them, evaluation and parsing.
+ And on a practical note, here's the URL of my web page on the Kernel
programming language, which supports fexprs.