/The following is a reformatting of the script from which the 2007 talk was delivered. The left-hand column shows when to advance to the next slide (or, in one solitary case, when to back up to the previous slide; "+" to advance, "-" to back up). The slide titles are interspersed with the dialog on the right, in brackets [], as a sanity check on position in the talk. Some subtleties of delivery are suggested by "underlining" (mild emphasis, delimited by underscores) or "boldfacing" (strong emphasis, delimited by asterisks); and by putting vertical space before some slide-advances but not others. --- John N. Shutt, September 21, 2015/ . [slide: title] I've developed a family of calculi called *vau calculi* for formal reasoning about fexprs. That symbol at the beginning of the title is how I've chosen to typeset a lower-case letter vau. It's a pre-classical Greek letter related to Roman F, as in "FEXPR". + [slide: outline] The _factual conclusion_ of this talk is that it's _possible_ to reason formally about fexprs. But the _unifying theme_ of the talk is a reassessment of which parts of Lisp are modeled by which parts of lambda-calculus. + I'll start by explaining what fexprs are, and suggest why they may be important. + I'll explain why the _traditional_ way of modeling Lisp with lambda-calculus makes it impossible to usefully model fexprs; + and then I'll show a _non_-traditional way of modeling Lisp, with a lambda-_like_ calculus, that *can* usefully model fexprs. + I'll say a few words about the implied non-traditional relationship between lambda of Lisp and lambda of lambda-calculus. + And then I'll wrap up. + [slide: terminology] First some terminology for Lisp evaluation. + A pair that has been _committed_ to evaluation, but not yet evaluated, is a *combination*. + The first element of a combination is the *operator*; and in the usual case that the combination is a list, the rest of the list elements are called *operands*. + When the operands are all evaluated, the resulting values are called *arguments*. All these terms so far are from the Wizard Book, Abelson and Sussman, so they're pretty standard, at least for Scheme. There's no *general* Scheme term for the result of evaluating an operator, because in Scheme, not all operators are evaluated: special form operators are not. + So let's say that a _combiner_ is the action 'designated' by an operator, whether by the operator evaluating to it, or by the operator being a reserved word for it. We can partition combiners into two kinds: those that act on their arguments, which is what Scheme procedures do, + and those that act on their *operands*, which special forms do. + A combiner that acts on its operands is + *operative*, and, by slightly skewed symmetry, a combiner that acts on its arguments is + *applicative*. (As in 'apply'.) Now. Fexprs have two defining characteristics. + [slide: fexprs] They're _first-class operatives_ --- so they can be passed as arguments (or operands), returned as values of general expressions, and so on. In a Lisp that supports first-class operatives, the interpreter can't rely on a fixed set of reserved operators to identify all operative combinations; so it can't know, in general, whether to evaluate the operands until after it evaluates the operator, to see whether the resulting combiner is applicative rather than operative. Also, to qualify as fexprs, + the first-class operatives have to act by evaluating their bodies. 
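Here is a minimal sketch of that distinction in Kernel-style notation (the combiner names are illustrative only, not taken from the talk): an applicative's body sees the arguments, while a fexpr's body sees the operands themselves, together with the caller's environment.

    ; An applicative: its operands are evaluated, and its body sees the arguments.
    ($define! double
      ($lambda (x) (* x 2)))

    ; A fexpr: a first-class operative; it receives its operands unevaluated
    ; (and the caller's environment, ignored here), and acts by evaluating its body.
    ($define! $operand-of
      ($vau (x) #ignore
        x))

    ; A fexpr that decides for itself which operands to evaluate, and in what environment.
    ($define! $or2
      ($vau (a b) env
        ($if (eval a env)
             #t
             (eval b env))))

    (double (+ 1 2))        ; operand evaluated first: yields 6
    ($operand-of (+ 1 2))   ; operand passed through unevaluated: yields the list (+ 1 2)

All three combiners act by evaluating their bodies; they differ only in whether, and when, their operands get evaluated.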
Ordinary Lisp applicatives work this way; so fexpr constructors naturally tend to look similar to lambda, the Lisp constructor of applicatives. In contrast, _macros_ *don't* work this way. A macro doesn't _evaluate_ its body; it *expands* its body, which is a qualitatively different kind of transformation from Lisp evaluation, one that acts indirectly on computation through a logically separate phase of processing. Fexprs are a more direct and integrated alternative to macros in Lisp. + They existed in Lisp before macros were added to the language in the early 1960s, and coexisted with macros until about 1980. At *that* time, fexprs were abandoned in favor of macros, despite the straightforwardness advantage of fexprs, because fexprs were _severely_ ill-behaved. I believe that the _worst_ ill-behavedness of fexprs _then_ was largely spill-over from the ill-behavedness of dynamic scope, which was itself banished from Lisp a few years later. Much more than that ---and this speaks to why fexprs may be important--- I also believe that fexprs, *when handled properly* in a language design, are very near to some sort of theoretical maximum of abstractive power: + they subsume Scheme's first-class procedures, which in turn are known to subsume Actors, and thus, broadly, the Object-Oriented Paradigm; and they're more powerful than macros, easier to use, and _potentially_, less error-prone, combining _surgical changes_ to the language evaluator with the _elegance_ of first-class procedures. In attempting to realize this abstractive potential, one of the several first things to do is to clear fexprs of their most damning ill-behavedness charge, that it's impossible to construct a nontrivial formal semantic theory of fexprs. Which brings me back to calculi, and why lambda-calculus _hasn't_ been used for reasoning about fexprs. + [slide: lambda-calculus] Here's a fairly standard development of lambda-calculus. + A term is either + a variable, + or a constant, + or a combination, + or a lambda abstraction. There's just one reduction schema, + called the beta rule, that says how to apply a lambda-abstraction to an argument. It simply substitutes + the argument + for the parameter + into the body of the abstraction. + The reduction step relation of the calculus is the *compatible closure* of the reductions specified by the schema. + Compatibility _means_, by definition, + that if you can reduce T1 when it's a standalone term, you can also reduce it when it occurs as a subterm in any larger context C. Reduction also has the useful property that it _implies_ + *contextual equivalence*; that is, + T1, _reducible_ to T2, is *interchangeable* with T2 in _any_ context, without changing _any_ observable result. Contextual equivalence is an important part of the _point_ of the exercise. If we can show, for example, that two Lisp expressions are modeled by contextually equivalent calculus terms, then an optimizing compiler can replace one Lisp expression with the other. And reduction implies contextual equivalence; so we can use reduction to reach these useful conclusions. When McCarthy invented Lisp, about 1960, he wanted a device + for writing an anonymous function as an algebraic expression, and he chose to call it "lambda", after the notation in lambda-calculus. And this has led to the *perception* that lambda of Lisp is a notation for lambda of lambda-calculus. So a Lisp expression like + this, compiles into lambda-calculus something like + this.
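The slides themselves are not reproduced in this transcript, but the development being described here is the standard one; in rough notation:

    Terms:           T ::= x                   (variables)
                         | c                   (constants)
                         | (T1 T2)             (combinations)
                         | (lambda x . T)      (lambda abstractions)

    Beta rule:       ((lambda x . T1) T2)  ->  T1[x := T2]
                     (substitute the argument T2 for the parameter x in the body T1)

    Compatibility:   if T1 -> T2, then C[T1] -> C[T2], for any context C.

And judging from the worked example later in the talk, the Lisp expression and its traditional 'compilation' are along these lines:

    Lisp source:             ((lambda (x) (* x x)) (+ 2 3))
    lambda-calculus model:   ((lambda x . (* x x)) (+ 2 3))

with +, *, and the numerals treated as constants equipped with suitable delta-rules.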
I say 'compiles'; what I mean precisely is that the rewriting behavior of this lambda-calculus term is taken to be a mathematical model of what happens when a Lisp interpreter evaluates - this Lisp expression. + The 'compilation' maps Lisp lambda to calculus lambda, and otherwise, Lisp _pairs_ to calculus _combinations_, and Lisp _symbols_ to calculus _variables_. So all Lisp expressions are implicitly to be evaluated, and term reduction is _just_ a mathematical model of expression evaluation. And this *choice* of mathematical model causes a problem with introducing fexprs _into the modeling calculus_. For anyone familiar with Mitch Wand's 1998 paper "The Theory of Fexprs is Trivial", that paper is about this problem. + To demonstrate the problem, minimally, suppose the lambda-calculus is extended to include an operative QUOTE, which causes its operand not to be evaluated ---a behavior that would be readily available if we had fexprs--- + and consider when two terms are equivalent in _this_ context. We need reduction to imply contextual equivalence, so that we can _deduce_ useful equivalences. But our choice of mathematical model equates reduction with evaluation; and if T1 *evaluates* to T2, that certainly *doesn't* imply that 'quote T1' _evaluates_ to 'quote T2', + _nor_ that 'quote T1' is _interchangeable_ with 'quote T2'. + My solution is to construct a calculus around a strategy of *explicit evaluation*. + That is, evaluation is a *special case* of term reduction, one that has to be catalyzed by some explicit local syntax. So + 'T1' is not the same thing as + 'evaluate T1', and + 'T1 reduces to T2' is not the same thing as + 'T1 evaluates to T2'. An operative such as QUOTE + doesn't have to *suppress* reduction of its operands, + because reduction of its operands can only reduce them to normal forms _for the operands_; those normal forms will not then be evaluated, so long as the operative doesn't generate explicit instructions to do so. *Applicatives* have to generate explicit instructions, to evaluate their operands. To bring this technique to bear on the study of Lisp, we alter our choice of mathematical model. Remember, we used to model _interpretation_ of a Lisp expression, + such as this, by the behavior of a lambda-calculus term constructed by mapping Lisp lambda to calculus lambda and so on, like + this; but in the new approach, we'll model interpretation of the Lisp expression *explicitly*, like + this --- for + suitable environment E. The behavior of this term, in our new calculus, should tell us _just_ what is supposed to happen when a Lisp interpreter evaluates this expression. In essence, the old approach + was trying to model the execution of a pre-compiled program, and got into trouble modeling fexprs because fexprs can really only be partially compiled; + whereas the new approach explicitly includes the partial compilation in the model. Notice the environment in this notation. + Environments are necessary for practical use of fexprs, so a calculus for usefully reasoning about fexprs in practice has to include environments. Vau-calculus terms have lots of environment subterms, like the one here; whereas lambda-calculus doesn't have environments at all. The connection between vau-calculi and lambda-calculi, though, is startlingly concrete, and I overlooked this for several years of work with vau-calculi --- until summer of last year, when I happened to construct a vau-calculus *without* environments. So + [slide: vau-x-calculus] here is a vau calculus without environments.
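In approximate notation (the slides' own notation is not reproduced in this transcript; E stands for a suitable environment), the contrast just described looks like this:

    Implicit evaluation (the traditional model): reduction is evaluation, so
        (+ 2 3)  ->  5
    and compatibility would then force
        (quote (+ 2 3))  ->  (quote 5)
    even though the two are not interchangeable --- they are different data ---
    so reduction can no longer imply contextual equivalence.

    Explicit evaluation: evaluation is marked by local syntax, so
        [eval (+ 2 3) E]          reduces, in several steps, to  5
        (quote (+ 2 3))           is just a passive data structure, already in normal form
        [eval (quote (+ 2 3)) E]  reduces to  (+ 2 3), QUOTE handing back its operand unevaluated.

The environment-free calculus announced above spells out this discipline as a full set of reduction schemata (there, eval carries no environment argument).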
This will be several times bigger than lambda-calculus; remember that this calculus has to contain, in effect, a complete Lisp interpreter --- minus, in this case, symbols and environments. First, the syntax. + A *self-evaluating* term + is either a constant, + a vau abstraction ---that's an operative---, + or either of two _parameter-less_ operative abstractions, which we'll use to parse operand lists; we'll see how these work in a moment. + A *term* + is either a variable, + a self-evaluating term, + a pair; nil, the empty list, is a constant, which is already covered under self-evaluating terms, so we don't have to list nil separately; + an applicative; that's just a wrapper to induce operand evaluation, around another stand-alone term; + a request to evaluate another term, + or a request to call a combiner with an operand list. Now the schemata. There is no schema for evaluating a variable, because a request to evaluate a variable, + like this, isn't _supposed_ to be reducible; it's just a commitment to perform an evaluation later, once the variable has been substituted for. + + A self-evaluating term evaluates to itself. + To evaluate a pair, evaluate the operator and call the resulting combiner with the unevaluated operand list; we _don't_ commit to evaluating the operands at this point, because the combiner might later turn out to be operative. + To evaluate an applicative, evaluate its interior. There's nothing deep in this; it just means applicatives are unencapsulated compound data structures, rather than abstractions. + A vau-zero abstraction parses nil, by dropping it and moving on; + and a vau-two abstraction parses a pair, by currying. + The heavy logic is in the last two schemata, which handle calls to applicatives and vau abstractions. + An applicative is just a wrapper to induce operand evaluation, around an underlying combiner to act on the arguments; so to call an applicative, + remove the wrapper, and commit to evaluating the operands. And finally, + to combine a vau abstraction with an operand, + substitute the operand for the parameter into the body of the abstraction. This last schema should look familiar. It's called the beta rule, and it is *exactly* the sole schema of ordinary lambda-calculus --- just + expressed with different superficial syntax. Most of what's important about vau-x-calculus comes down to this. The subset of vau-x-calculus that uses only variables, constants, combines, and vau abstractions + is exactly lambda-calculus. Once you realize that, it's easy to prove that vau-x is Church-Rosser and such, and that vau-x is, technically, a *conservative extension* of lambda-calculus --- the latter being a corollary of the stronger result, _in this case_, + that reduction of lambda-calculus terms is just the same in vau-x-calculus as it is in lambda-calculus. In the big picture, + vau abstractions are operatives, + and they're also just a different notation for lambda abstractions, therefore lambda abstractions are operatives. The syntax for *specifying* operand evaluations isn't even in the lambda-calculus subset; there'd be no use for it, because the only things that evaluate nontrivially are symbols and pairs, and lambda-calculus doesn't have either. Lambda-calculus consists of terms in which all symbols have already been converted to variables, and all pairs have already been converted to combinations. + So lambda-calculus is a calculus *of* _operatives_, _not_ applicatives. 
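Gathered into one place, in approximate notation (paraphrased from the description above, not copied from the slides; in this environment-free calculus, eval takes no environment argument), the grammar and schemata just walked through are roughly:

    Self-evaluating terms:
        S ::= c                    constants (including nil)
            | (vau x . T)          vau abstraction (an operative)
            | (vau0 . T)           parameter-less operative: parses nil
            | (vau2 . T)           parameter-less operative: parses a pair, by currying

    Terms:
        T ::= x                    variables
            | S                    self-evaluating terms
            | (T1 . T2)            pairs
            | [wrap T]             applicatives: a wrapper around another term
            | [eval T]             a request to evaluate a term
            | [combine T1 T2]      a request to call a combiner on an operand list

    Schemata (there is none for [eval x]; that waits for substitution):
        [eval S]                        ->  S
        [eval (T1 . T2)]                ->  [combine [eval T1] T2]
        [eval [wrap T]]                 ->  [wrap [eval T]]
        [combine (vau0 . T) nil]        ->  T
        [combine (vau2 . T) (T1 . T2)]  ->  [combine [combine T T1] T2]
        [combine [wrap T] (T1 ... Tn)]  ->  [combine T ([eval T1] ... [eval Tn])]
        [combine (vau x . T1) T2]       ->  T1[x := T2]                    (beta)

In the subset built only from variables, constants, combines, and vau abstractions, [combine (vau x . T1) T2] is just ((lambda x . T1) T2) in different clothing, which is why the last schema is the beta rule, and why that subset is exactly lambda-calculus.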
We got here by first adopting a perspective in which Lisp lambda isn't modeled directly by calculus lambda; and now we've found that calculus lambda constructs operatives, which don't evaluate their operands, whereas Lisp lambda constructs applicatives, which do evaluate their operands. Why, then, does it make sense to name the Lisp constructor of applicatives after the calculus constructor of operatives? + [slide: simulation] Well, it turns out, it's because of the way evaluating source code differs from executing a pre-compiled image. + Take a typical term in lambda-calculus. With suitable delta-rules for plus and times, this reduces to an integer in three steps. Plus two three is five, + substitute five into the body, + and times five five is twenty-five. + Now, + this is an _active_ term; in our vau-x notation it would be written like + this; but if you were writing Lisp source code, you'd write it + like this, and this isn't an active term. It's a passive data structure. You don't expect the data structure to reduce to twenty-five; you expect + *evaluating* the data structure, in a standard environment, + to reduce to twenty-five. Well, as the first step of the reduction, + eval of a pair reduces to combine of eval of the operator with the operand list. + So. The combiner subterm + is a call to lambda, which reduces, in three steps, + to this. This combiner does all the + evaluation, + substitution, + and parsing needed --- whereas, in the original _lambda-calculus_ expression, + there _wasn't_ any evaluation or parsing needed, because those things had already been done during compilation from Lisp to lambda-calculus. Lisp lambda expressions have to evaluate to applicatives so that they can *simulate*, through source code evaluation, the much simpler behavior of calculus lambda. + (Continuing the reduction:) Out at the top level, we're calling an applicative, so unwrap it and evaluate the operand. + And by the way, even though fexprs are the motivation for all of this work, + *this* is the only bona fide fexpr in this talk: it *doesn't* evaluate its operand (which in this case is the _result_ of an evaluation), and it + *does* evaluate its body. The moral is, that Lisp lambda constructs fexprs with applicative wrappers around them. Getting back to the reduction, + the evaluated body + can't be reduced to an integer, but it can be reduced part way, by five reduction steps. That's compatibility at work. + Vau-two curries the argument list, + and vau substitutes five for X into its body. + The inner redex, combine times five five, reduces to twenty-five + and vau-zero matches the rest of the argument list. + If you count it up, that's twenty vau-calculus reduction steps to simulate three lambda-calculus reduction steps. That doesn't necessarily say anything, one way or another, about the efficiency of *actual computation*, which would involve the sizes of the steps, and which steps are performed by a compiler prior to runtime; but because a calculus like this is a _tool for formal reasoning_, the larger number of steps *is* a reflection of the fact that there's a lot more to reason about --- the whole evaluation process, in fact. + [slide: conclusion] So. + In the traditional use of lambda-calculus to model Lisp, since evaluation is implicit, + evaluation and reduction are indistinguishable; therefore, + fexprs can only prevent operand evaluation by preventing operand reduction, which prevents reduction from implying contextual equivalence.
+ Explicit evaluation + makes term evaluation a _special case_ of term reduction; + so an operand can be reduced without being evaluated, and fexprs can be reconciled with nontrivial theory. + The resulting vau-calculus + still contains lambda-calculus embedded within it (more or less disguised, depending on whether you include environments). + And the nature of the embedding suggests that lambda-calculus is naturally understood *as* a calculus of operatives. It was never necessary to add operatives to lambda-calculus, because they were already there; what needed to be added was *applicatives*, and with them, evaluation and parsing. + And on a practical note, here's the URL of my web page on the Kernel programming language, which supports fexprs.