/The following is a reformatting of the script from which the 2007 talk was delivered. The left-hand column shows when to advance to the next slide (or, in one solitary case, when to back up to the previous slide; "+" to advance, "-" to back up). The slide titles are interspersed with the dialog on the right, in brackets [], as a sanity check on position in the talk. Some subtleties of delivery are suggested by "underlining" (mild emphasis, delimited by underscores) or "boldfacing" (strong emphasis, delimited by asterisks); and by putting vertical space before some slide-advances but not others. --- John N. Shutt, September 21, 2015/ . [slide: title] I've developed a family of calculi called *vau calculi* for formal reasoning about fexprs. That symbol at the beginning of the title is how I've chosen to typeset a lower-case letter vau. It's a pre-classical Greek letter related to Roman F, as in "FEXPR". + [slide: outline] The _factual conclusion_ of this talk is that it's _possible_ to reason formally about fexprs. But the _unifying theme_ of the talk is a reassessment of which parts of Lisp are modeled by which parts of lambda-calculus. + I'll start by explaining what fexprs are, and suggest why they may be important. + I'll explain why the _traditional_ way of modeling Lisp with lambda-calculus makes it impossible to usefully model fexprs; + and then I'll show a _non_-traditional way of modeling Lisp, with a lambda-_like_ calculus, that *can* usefully model fexprs. + I'll say a few words about the implied non-traditional relationship between lambda of Lisp and lambda of lambda-calculus. + And then I'll wrap up. + [slide: terminology] First some terminology for Lisp evaluation. + A pair that has been _committed_ to evaluation, but not yet evaluated, is a *combination*. + The first element of a combination is the *operator*; and in the usual case that the combination is a list, the rest of the list elements are called *operands*. + When the operands are all evaluated, the resulting values are called *arguments*. All these terms so far are from the Wizard Book, Abelson and Sussman, so they're pretty standard, at least for Scheme. There's no *general* Scheme term for the result of evaluating an operator, because in Scheme, not all operators are evaluated: special form operators are not. + So let's say that a _combiner_ is the action 'designated' by an operator, whether by the operator evaluating to it, or by the operator being a reserved word for it. We can partition combiners into two kinds: those that act on their arguments, which is what Scheme procedures do, + and those that act on their *operands*, which special forms do. + A combiner that acts on its operands is + *operative*, and, by slightly skewed symmetry, a combiner that acts on its arguments is + *applicative*. (As in 'apply'.) Now. Fexprs have two defining characteristics. + [slide: fexprs] They're _first-class operatives_ --- so they can be passed as arguments (or operands), returned as values of general expressions, and so on. In a Lisp that supports first-class operatives, the interpreter can't rely on a fixed set of reserved operators to identify all operative combinations; so it can't know, in general, whether to evaluate the operands until after it evaluates the operator, to see whether the resulting combiner is applicative rather than operative. Also, to qualify as fexprs, + the first-class operatives have to act by evaluating their bodies. 
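Here is a minimal sketch of that distinction in Kernel-style notation (the combiner names are illustrative only, not taken from the talk): an applicative's body sees the arguments, while a fexpr's body sees the operands themselves, together with the caller's environment.

    ; An applicative: its operands are evaluated, and its body sees the arguments.
    ($define! double
      ($lambda (x) (* x 2)))

    ; A fexpr: a first-class operative; it receives its operands unevaluated
    ; (and the caller's environment, ignored here), and acts by evaluating its body.
    ($define! $operand-of
      ($vau (x) #ignore
        x))

    ; A fexpr that decides for itself which operands to evaluate, and in what environment.
    ($define! $or2
      ($vau (a b) env
        ($if (eval a env)
             #t
             (eval b env))))

    (double (+ 1 2))        ; operand evaluated first: yields 6
    ($operand-of (+ 1 2))   ; operand passed through unevaluated: yields the list (+ 1 2)

All three combiners act by evaluating their bodies; they differ only in whether, and when, their operands get evaluated.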
Ordinary Lisp applicatives work this way; so fexpr constructors naturally tend to look similar to lambda, the Lisp constructor of applicatives. In contrast, _macros_ *don't* work this way. A macro doesn't _evaluate_ its body; it *expands* its body, which is a qualitatively different kind of transformation from Lisp evaluation, one that acts indirectly on computation through a logically separate phase of processing. Fexprs are a more direct and integrated alternative to macros in Lisp. + They existed in Lisp before macros were added to the language in the early 1960s, and coexisted with macros until about 1980. At *that* time, fexprs were abandoned in favor of macros, despite the straightforwardness advantage of fexprs, because fexprs were _severely_ ill-behaved. I believe that the _worst_ ill-behavedness of fexprs _then_ was largely spill-over from the ill-behavedness of dynamic scope, which was itself banished from Lisp a few years later. Much more than that ---and this speaks to why fexprs may be important--- I also believe that fexprs, *when handled properly* in a language design, are very near to some sort of theoretical maximum of abstractive power: + they subsume Scheme's first-class procedures, which in turn are known to subsume Actors, and thus, broadly, the Object-Oriented Paradigm; and they're more powerful than macros, easier to use, and _potentially_, less error-prone, combining _surgical changes_ to the language evaluator with the _elegance_ of first-class procedures. In attempting to realize this abstractive potential, one of the several first things to do is to clear fexprs of their most damning ill-behavedness charge, that it's impossible to construct a nontrivial formal semantic theory of fexprs. Which brings me back to calculi, and why lambda-calculus _hasn't_ been used for reasoning about fexprs. + [slide: lambda-calculus] Here's a fairly standard development of lambda-calculus. + A term is either + a variable, + or a constant, + or a combination, + or a lambda abstraction. There's just one reduction schema, + called the beta rule, that says how to apply a lambda-abstraction to an argument. It simply substitutes + the argument + for the parameter + into the body of the abstraction. + The reduction step relation of the calculus is the *compatible closure* of the reductions specified by the schema. + Compatibility _means_, by definition, + that if you can reduce T1 when it's a standalone term, you can also reduce it when it occurs as a subterm in any larger context C. Reduction also has the useful property that it _implies_ + *contextual equivalence*; that is, + T1, _reducible_ to T2, is *interchangeable* with T2 in _any_ context, without changing _any_ observable result. Contextual equivalence is an important part of the _point_ of the exercise. If we can show, for example, that two Lisp expressions are modeled by contextually equivalent calculus terms, then an optimizing compiler can replace one Lisp expression with the other. And reduction implies contextual equivalence; so we can use reduction to reach these useful conclusions. When McCarthy invented Lisp, about 1960, he wanted a device + for writing an anonymous function as an algebraic expression, and he chose to call it "lambda", after the notation in lambda-calculus. And this has led to the *perception* that lambda of Lisp is a notation for lambda of lambda-calculus. So a Lisp expression like + this, compiles into lambda-calculus something like + this.
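The slides themselves are not reproduced in this transcript, but the development being described here is the standard one; in rough notation:

    Terms:           T ::= x                   (variables)
                         | c                   (constants)
                         | (T1 T2)             (combinations)
                         | (lambda x . T)      (lambda abstractions)

    Beta rule:       ((lambda x . T1) T2)  ->  T1[x := T2]
                     (substitute the argument T2 for the parameter x in the body T1)

    Compatibility:   if T1 -> T2, then C[T1] -> C[T2], for any context C.

And judging from the worked example later in the talk, the Lisp expression and its traditional 'compilation' are along these lines:

    Lisp source:             ((lambda (x) (* x x)) (+ 2 3))
    lambda-calculus model:   ((lambda x . (* x x)) (+ 2 3))

with +, *, and the numerals treated as constants equipped with suitable delta-rules.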
I say 'compiles'; what I mean precisely is that the rewriting behavior of this lambda-calculus term is taken to be a mathematical model of what happens when a Lisp interpreter evaluates - this Lisp expression. + The 'compilation' maps Lisp lambda to calculus lambda, and otherwise, Lisp _pairs_ to calculus _combinations_, and Lisp _symbols_ to calculus _variables_. So all Lisp expressions are implicitly to be evaluated, and term reduction is _just_ a mathematical model of expression evaluation. And this *choice* of mathematical model causes a problem with introducing fexprs _into the modeling calculus_. For anyone familiar with Mitch Wand's 1998 paper "The Theory of Fexprs is Trivial", that paper is about this problem. + To demonstrate the problem, minimally, suppose the lambda-calculus is extended to include an operative QUOTE, which causes its operand not to be evaluated ---a behavior that would be readily available if we had fexprs--- + and consider when two terms are equivalent in _this_ context. We need reduction to imply contextual equivalence, so that we can _deduce_ useful equivalences. But our choice of mathematical model equates reduction with evaluation; and if T1 *evaluates* to T2, that certainly *doesn't* imply that 'quote T1' _evaluates_ to 'quote T2', + _nor_ that 'quote T1' is _interchangeable_ with 'quote T2'. + My solution is to construct a calculus around a strategy of *explicit evaluation*. + That is, evaluation is a *special case* of term reduction, one that has to be catalyzed by some explicit local syntax. So + 'T1' is not the same thing as + 'evaluate T1', and + 'T1 reduces to T2' is not the same thing as + 'T1 evaluates to T2'. An operative such as QUOTE + doesn't have to *suppress* reduction of its operands, + because reduction of its operands can only reduce them to normal forms _for the operands_; those normal forms will not then be evaluated, so long as the operative doesn't generate explicit instructions to do so. *Applicatives* have to generate explicit instructions, to evaluate their operands. To bring this technique to bear on the study of Lisp, we alter our choice of mathematical model. Remember, we used to model _interpretation_ of a Lisp expression, + such as this, by the behavior of a lambda-calculus term constructed by mapping Lisp lambda to calculus lambda and so on, like + this; but in the new approach, we'll model interpretation of the Lisp expression *explicitly*, like + this --- for + suitable environment E. The behavior of this term, in our new calculus, should tell us _just_ what is supposed to happen when a Lisp interpreter evaluates this expression. In essence, the old approach + was trying to model the execution of a pre-compiled program, and got into trouble modeling fexprs because fexprs can really only be partially compiled; + whereas the new approach explicitly includes the partial compilation in the model. Notice the environment in this notation. + Environments are necessary for practical use of fexprs, so a calculus for usefully reasoning about fexprs in practice has to include environments. Vau-calculus terms have lots of environment subterms, like the one here; whereas lambda-calculus doesn't have environments at all. The connection between vau-calculi and lambda-calculi, though, is startlingly concrete, and I overlooked this for several years of work with vau-calculi --- until summer of last year, when I happened to construct a vau-calculus *without* environments. So + [slide: vau-x-calculus] here is a vau calculus without environments.
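In approximate notation (the slides' own notation is not reproduced in this transcript; E stands for a suitable environment), the contrast just described looks like this:

    Implicit evaluation (the traditional model): reduction is evaluation, so
        (+ 2 3)  ->  5
    and compatibility would then force
        (quote (+ 2 3))  ->  (quote 5)
    even though the two are not interchangeable --- they are different data ---
    so reduction can no longer imply contextual equivalence.

    Explicit evaluation: evaluation is marked by local syntax, so
        [eval (+ 2 3) E]          reduces, in several steps, to  5
        (quote (+ 2 3))           is just a passive data structure, already in normal form
        [eval (quote (+ 2 3)) E]  reduces to  (+ 2 3), QUOTE handing back its operand unevaluated.

The environment-free calculus announced above spells out this discipline as a full set of reduction schemata (there, eval carries no environment argument).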
This will be several times bigger than lambda-calculus; remember that this calculus has to contain, in effect, a complete Lisp interpreter --- minus, in this case, symbols and environments. First, the syntax. + A *self-evaluating* term + is either a constant, + a vau abstraction ---that's an operative---, + or either of two _parameter-less_ operative abstractions, which we'll use to parse operand lists; we'll see how these work in a moment. + A *term* + is either a variable, + a self-evaluating term, + a pair; nil, the empty list, is a constant, which is already covered under self-evaluating terms, so we don't have to list nil separately; + an applicative; that's just a wrapper to induce operand evaluation, around another stand-alone term; + a request to evaluate another term, + or a request to call a combiner with an operand list. Now the schemata. There is no schema for evaluating a variable, because a request to evaluate a variable, + like this, isn't _supposed_ to be reducible; it's just a commitment to perform an evaluation later, once the variable has been substituted for. + + A self-evaluating term evaluates to itself. + To evaluate a pair, evaluate the operator and call the resulting combiner with the unevaluated operand list; we _don't_ commit to evaluating the operands at this point, because the combiner might later turn out to be operative. + To evaluate an applicative, evaluate its interior. There's nothing deep in this; it just means applicatives are unencapsulated compound data structures, rather than abstractions. + A vau-zero abstraction parses nil, by dropping it and moving on; + and a vau-two abstraction parses a pair, by currying. + The heavy logic is in the last two schemata, which handle calls to applicatives and vau abstractions. + An applicative is just a wrapper to induce operand evaluation, around an underlying combiner to act on the arguments; so to call an applicative, + remove the wrapper, and commit to evaluating the operands. And finally, + to combine a vau abstraction with an operand, + substitute the operand for the parameter into the body of the abstraction. This last schema should look familiar. It's called the beta rule, and it is *exactly* the sole schema of ordinary lambda-calculus --- just + expressed with different superficial syntax. Most of what's important about vau-x-calculus comes down to this. The subset of vau-x-calculus that uses only variables, constants, combines, and vau abstractions + is exactly lambda-calculus. Once you realize that, it's easy to prove that vau-x is Church-Rosser and such, and that vau-x is, technically, a *conservative extension* of lambda-calculus --- the latter being a corollary of the stronger result, _in this case_, + that reduction of lambda-calculus terms is just the same in vau-x-calculus as it is in lambda-calculus. In the big picture, + vau abstractions are operatives, + and they're also just a different notation for lambda abstractions, therefore lambda abstractions are operatives. The syntax for *specifying* operand evaluations isn't even in the lambda-calculus subset; there'd be no use for it, because the only things that evaluate nontrivially are symbols and pairs, and lambda-calculus doesn't have either. Lambda-calculus consists of terms in which all symbols have already been converted to variables, and all pairs have already been converted to combinations. + So lambda-calculus is a calculus *of* _operatives_, _not_ applicatives. 
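Gathered into one place, in approximate notation (paraphrased from the description above, not copied from the slides; in this environment-free calculus, eval takes no environment argument), the grammar and schemata just walked through are roughly:

    Self-evaluating terms:
        S ::= c                    constants (including nil)
            | (vau x . T)          vau abstraction (an operative)
            | (vau0 . T)           parameter-less operative: parses nil
            | (vau2 . T)           parameter-less operative: parses a pair, by currying

    Terms:
        T ::= x                    variables
            | S                    self-evaluating terms
            | (T1 . T2)            pairs
            | [wrap T]             applicatives: a wrapper around another term
            | [eval T]             a request to evaluate a term
            | [combine T1 T2]      a request to call a combiner on an operand list

    Schemata (there is none for [eval x]; that waits for substitution):
        [eval S]                        ->  S
        [eval (T1 . T2)]                ->  [combine [eval T1] T2]
        [eval [wrap T]]                 ->  [wrap [eval T]]
        [combine (vau0 . T) nil]        ->  T
        [combine (vau2 . T) (T1 . T2)]  ->  [combine [combine T T1] T2]
        [combine [wrap T] (T1 ... Tn)]  ->  [combine T ([eval T1] ... [eval Tn])]
        [combine (vau x . T1) T2]       ->  T1[x := T2]                    (beta)

In the subset built only from variables, constants, combines, and vau abstractions, [combine (vau x . T1) T2] is just ((lambda x . T1) T2) in different clothing, which is why the last schema is the beta rule, and why that subset is exactly lambda-calculus.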
We got here by first adopting a perspective in which Lisp lambda isn't modeled directly by calculus lambda; and now we've found that calculus lambda constructs operatives, which don't evaluate their operands, whereas Lisp lambda constructs applicatives, which do evaluate their operands. Why, then, does it make sense to name the Lisp constructor of applicatives after the calculus constructor of operatives? + [slide: simulation] Well, it turns out, it's because of the way evaluating source code differs from executing a pre-compiled image. + Take a typical term in lambda-calculus. With suitable delta-rules for plus and times, this reduces to an integer in three steps. Plus two three is five, + substitute five into the body, + and times five five is twenty-five. + Now, + this is an _active_ term; in our vau-x notation it would be written like + this; but if you were writing Lisp source code, you'd write it + like this, and this isn't an active term. It's a passive data structure. You don't expect the data structure to reduce to twenty-five; you expect + *evaluating* the data structure, in a standard environment, + to reduce to twenty-five. Well, as the first step of the reduction, + eval of a pair reduces to combine of eval of the operator with the operand list. + So. The combiner subterm + is a call to lambda, which reduces, in three steps, + to this. This combiner does all the + evaluation, + substitution, + and parsing needed --- whereas, in the original _lambda-calculus_ expression, + there _wasn't_ any evaluation or parsing needed, because those things had already been done during compilation from Lisp to lambda-calculus. Lisp lambda expressions have to evaluate to applicatives so that they can *simulate*, through source code evaluation, the much simpler behavior of calculus lambda. + (Continuing the reduction:) Out at the top level, we're calling an applicative, so unwrap it and evaluate the operand. + And by the way, even though fexprs are the motivation for all of this work, + *this* is the only bona fide fexpr in this talk: it *doesn't* evaluate its operand (which in this case is the _result_ of an evaluation), and it + *does* evaluate its body. The moral is, that Lisp lambda constructs fexprs with applicative wrappers around them. Getting back to the reduction, + the evaluated body + can't be reduced to an integer, but it can be reduced part way, by five reduction steps. That's compatibility at work. + Vau-two curries the argument list, + and vau substitutes five for X into its body. + The inner redex, combine times five five, reduces to twenty-five + and vau-zero matches the rest of the argument list. + If you count it up, that's twenty vau-calculus reduction steps to simulate three lambda-calculus reduction steps. That doesn't necessarily say anything, one way or another, about the efficiency of *actual computation*, which would involve the sizes of the steps, and which steps are performed by a compiler prior to runtime; but because a calculus like this is a _tool for formal reasoning_, the larger number of steps *is* a reflection of the fact that there's a lot more to reason about --- the whole evaluation process, in fact. + [slide: conclusion] So. + In the traditional use of lambda-calculus to model Lisp, since evaluation is implicit, + evaluation and reduction are indistinguishable; therefore, + fexprs can only prevent operand evaluation by preventing operand reduction, which prevents reduction from implying contextual equivalence.
+ Explicit evaluation + makes term evaluation a _special case_ of term reduction; + so an operand can be reduced without being evaluated, and fexprs can be reconciled with nontrivial theory. + The resulting vau-calculus + still contains lambda-calculus embedded within it (more or less disguised, depending on whether you include environments). + And the nature of the embedding suggests that lambda-calculus is naturally understood *as* a calculus of operatives. It was never necessary to add operatives to lambda-calculus, because they were already there; what needed to be added was *applicatives*, and with them, evaluation and parsing. + And on a practical note, here's the URL of my web page on the Kernel programming language, which supports fexprs.