This page is Copyright John N. Shutt 1996.
Last modified:  06-Nov-07.


In the second century B.C., the Greek philosopher Dionysius Thrax wrote a book about the structure of the Greek language.  His book — the Techne — classified words into eight parts of speech.  This is an example of a descriptive grammar, in which correct sentences are described by making statements about them.  For the next two thousand years, virtually all grammars were descriptive and traced their lineage directly back to the Techne.

Traditional descriptive grammars lacked mathematical rigor.  In the 1950s, Noam Chomsky proposed a generative approach to grammar, in which correct sentences are generated by an abstract engine.  Chomsky proposed a single abstract engine for all languages; each individual language is then defined by a set of rules that tell the engine how to generate all sentences of that language, and no others.  This model is much more amenable than its predecessors to formal treatment.  Since then, nearly all formal work in syntax — and particularly programming language syntax, with which the current thesis is concerned — has traced its lineage directly back to Chomsky.

Chomsky distinguished several classes of grammars, depending on the generality of form of the grammar rules.  The most general forms of Chomsky grammars (types 0 and 1) are powerful, but opaque and difficult to work with.  The more restricted (types 2 and 3), especially the context-free grammars or CFGs (type 2), are more lucid and easier to use, but not powerful enough to describe most programming languages.
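What makes a context-free grammar lucid is that each rule rewrites a single nonterminal regardless of its surroundings, so sentences can be derived by simple recursive substitution.  The following sketch illustrates this with a small invented grammar for arithmetic expressions (the grammar and all names here are illustrative, not taken from the thesis):

```python
import random

# A tiny context-free grammar for arithmetic expressions.
# Each rule rewrites one nonterminal, independent of context.
GRAMMAR = {
    "Expr": [["Term"], ["Expr", "+", "Term"]],
    "Term": [["num"], ["(", "Expr", ")"]],
}

def generate(symbol="Expr", depth=0):
    """Randomly derive a sentence of terminals from the grammar."""
    if symbol not in GRAMMAR:  # terminal symbol: emit it as-is
        return [symbol]
    # Past a small depth, always take the first (shortest) rule,
    # so the random derivation is guaranteed to terminate.
    options = GRAMMAR[symbol] if depth < 3 else [GRAMMAR[symbol][0]]
    out = []
    for s in random.choice(options):
        out.extend(generate(s, depth + 1))
    return out

print(" ".join(generate()))
```

Every sentence this engine emits is, by construction, a sentence of the language; that is the generative view Chomsky introduced, with the grammar as the rule set handed to a fixed abstract engine.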

Therefore, numerous extensions of the CFG model have been developed.  Most of these extensions retain a CFG kernel, and augment it with a distinct facility that handles context-dependence.  The most popular of these extensions are variations on a model introduced by Donald Knuth in the late 1960s, called attribute grammars.  Unfortunately, in all the most powerful CFG extensions, including attribute grammars, the clarity of the CFG kernel is undermined by the power of the distinct extending facility.
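The flavor of the attribute-grammar arrangement can be suggested in a few lines: each CFG rule is paired with equations that compute attributes of parse-tree nodes, and context-dependent conditions such as type agreement live in those equations rather than in the rules themselves.  The sketch below is a loose illustration of a synthesized attribute, with all names invented for the example:

```python
# Sketch: a synthesized "type" attribute computed over a parse tree.
# The CFG rule  Expr -> Expr + Term  carries an attribute equation
# requiring that both operands have the same type.

def attr_type(node):
    """Compute the synthesized type attribute of a parse-tree node."""
    kind = node[0]
    if kind == "num":
        return "int"
    if kind == "add":  # rule: Expr -> Expr + Term
        left, right = attr_type(node[1]), attr_type(node[2])
        if left != right:
            raise TypeError("operand types differ")
        return left
    raise ValueError(f"unknown node kind: {kind}")

tree = ("add", ("num",), ("add", ("num",), ("num",)))
print(attr_type(tree))  # prints "int"
```

Note how the context-dependent check is carried entirely by the attribute equations, a separate facility layered over the context-free skeleton — which is precisely the division of labor whose opacity the text remarks on.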

A concept with some intuitive appeal is that of grammar adaptability.  The basic principle is that declarations (and information hiding) effectively modify the grammar of the language.  For example, when an integer variable k is declared, this adds a grammar rule to the effect that k is a valid integer expression.  An adaptable grammar model implements this principle by providing a formal means for modifying the set of rules.
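The adaptability principle can be made concrete in a few lines: processing a declaration literally changes the rule set used to recognize the rest of the program.  The representation below is invented purely for illustration and is not the formal model developed in the thesis:

```python
# Start with rules that admit only numeric literals as integer expressions.
rules = {"IntExpr": {"num"}}

def declare_int(name):
    """Declaring an integer variable adds a rule: name is an IntExpr."""
    rules["IntExpr"].add(name)

def is_int_expr(token):
    """Check whether the current rule set derives token as an IntExpr."""
    return token in rules["IntExpr"]

print(is_int_expr("k"))  # False: k has not been declared
declare_int("k")         # the declaration "int k;" modifies the grammar
print(is_int_expr("k"))  # True: the grammar now derives k as an IntExpr
```

Information hiding works the same way in reverse: leaving the scope of the declaration would remove the added rule again.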

Since the adaptability principle is an intuitively natural way to understand programming languages, including their context-dependent features, it is not unreasonable to hope that adaptable grammars could provide a comparably natural way to define the same programming languages formally.  A number of adaptable grammar models have been proposed over the past twenty-odd years, and some of them are Turing-powerful (i.e., as powerful as general Chomsky grammars); but most of them lack the descriptive clarity of attribute grammars, let alone CFGs.  For descriptive purposes, the most successful to date appears to be a model recently proposed by Henning Christiansen.

Even under Christiansen's model, there are still unresolved difficulties with the adaptable grammar approach.  Some modern language features have so far resisted elegant description.  Also, because Christiansen's model strongly resembles attribute grammars, it inherits some of their drawbacks.

The purpose of this thesis is to develop a Turing-powerful adaptable grammar model that preserves, as much as possible, the elegance and simplicity of CFGs.
