The Lex Metalanguage

The LEX metalanguage input consists of three parts: (1) definitions, which give names to sequences of characters; (2) translation rules, which define tokens and allow a number representing the token class to be associated with each; and (3) user-written procedures in C, which are associated with the regular expressions. Adjacent sections are separated by a double percent sign, "%%". The input has the following form:

    Definitions
    %%
    Rules
    %%
    User-Written Procedures
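A concrete input file in this three-part layout might look like the sketch below. It is only an illustration, not the Example 4 referred to in these notes: the token class numbers (ID, NUMBER, ASSIGN), the names letter and digit, and the procedure install_name are assumptions made up for the sketch, and the name table is reduced to a print statement.

```lex
%{
/* C declarations copied verbatim into the generated scanner. */
#include <stdio.h>

/* Hypothetical token class numbers, chosen only for this sketch. */
#define ID     1
#define NUMBER 2
#define ASSIGN 3

void install_name(const char *text);   /* defined in the third section */
%}

letter  [A-Za-z]
digit   [0-9]

%%

{letter}({letter}|{digit})*   { install_name(yytext); return ID; }
{digit}+                      { return NUMBER; }
"="                           { return ASSIGN; }
[ \t\n]+                      { /* white space: no token is returned */ }

%%

/* User-written procedure: stands in for entering a name in a name table. */
void install_name(const char *text)
{
    printf("installing %s\n", text);
}

/* Tell the scanner there is no further input after end of file. */
int yywrap(void)
{
    return 1;
}

/* Small driver: print the class number of each token read from stdin. */
int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("token class %d\n", tok);
    return 0;
}
```

Assuming the file is saved under a name such as scan.l (a hypothetical name), running lex (or flex) on it produces lex.yy.c, which can then be compiled with an ordinary C compiler. Because the metalanguage is case-sensitive, the letter definition has to list both the upper-case and lower-case ranges.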
The Definition section consists of a sequence of Names, each followed by an Expression to be given that name. The Rules consist of a sequence of regular expressions, each followed by an (optional) action which returns a number for that regular expression and performs other actions if desired, such as installing the characters in the token into a name table. The User-Written Procedures are the code for the C functions invoked as actions in the rules. Thus, the form for each section is:

    Name1 Expression1
    %%
    RegularExpression1 Action1
    %%
    C function1

A section may be empty, but the "%%" is still needed. Like UNIX itself, the metalanguage is case-sensitive; that is, "A" and "a" are read as different characters.

Example 4 shows a LEX description for our language consisting of assignment statements. There are other ways of expressing the tokens described here.

Send questions and comments to: Karen Lemone