4.0 Introduction

4.1 Metalanguage

4.2 LL(1) Parser Driver

4.3 Generating LL(1) Parsing Tables

4.4 Recursive Descent Parsing

4.5 Error Handling

4.6 Summary

Web References

Exercises

4.4 Recursive Descent Parsing

LL(1) parsing is often called "table-driven recursive descent". Recursive descent parsing uses recursive procedures to model the parse tree to be constructed. For each nonterminal in the grammar, a procedure, which parses a nonterminal, is constructed. Each of these procedures may read input, match terminal symbols or call other procedures to read input and match terminals in the right-hand side of a production.

Consider the Assignment Statement Grammar from Example 1:



     Program --> Statements
     Statements --> Statement Statements'
     Statements' -->  | Statements
     Statement --> AsstStmt ;
     AsstStmt --> Id = Expression
     Expression --> Term Expression '
     Expression' --> + Term Expression'
                     | 
      Term     --> Factor Term'
     Term'   --> * Factor Term'
                 | 
     Factor --> (Expression)
            | Id
           | Lit

In recursive descent parsing, the grammar is thought of as a program with each production as a (recursive) procedure. Thus, there would be a procedure corresponding to the nonterminal Program. It would process the right-hand side of the production. Since the right-hand side of the production is itself a nonterminal, the body of the procedure corresponding to Program will consist of a call to the procedure corresponding to Statements. Roughly, the procedures are as follows:

        Procedure Program
        BEGIN { Program }
                Call Statements     {Statements}
                PRINT ("Program found")
        END { P }

The procedure corresponding to Statements also consists of calls corresponding to the nonterminals on the right-hand side of the production:

        Procedure Statements
        BEGIN { Statements }
                Call Statement 
                Call Statements' 
                PRINT ("Statements found")
        END { Statements }

Skipping to the procedure corresponding to AsstStatement:

        Procedure AsstStatement
        BEGIN { AsstStatement }
                If NextToken = "Id" THEN
                        BEGIN
                        PRINT ( " Id found " )
                        Advance past it
                        If NextToken = " =  " THEN
                                BEGIN  { IF }
                                        PRINT  (" = found")
                                        Advance past it
                                        Call E
                                END  { IF }
                        END
                PRINT  ( "AsstStatement found" )
        END  { AsstStatement }

Procedures Expression and Expression' would be:

        Procedure  Expression
        BEGIN  { Expression }
                Call  Term
                Call  Expression'
                PRINT ("Expression found")
        End  { Expression }

    

Procedure Expression' BEGIN { Expression' } IF NextSymbol in input = " + " THEN BEGIN { IF } PRINT ( "+ found" ) Advance past it Call Term Call Expression PRINT ("Expression' found") END { IF } END { Expression' }

This procedure works fine for correct programs since it presumes that if "+" is not found, then the production is E' . .

Procedures Term and Term' are similar since the grammar productions for them are similar. Procedure Factor could be written:

        Procedure Factor
        BEGIN { Factor }
        CASE NextSymbol is
"(" :   PRINT ( "( found " )
        Advance past it
        Call Expression
        If NextSymbol =")" THEN
        BEGIN { IF }
                PRINT ( " ) found")
                Advance past it
                PRINT ( "Factor found")
        END { IF }
        ELSE
                error
"Id":           PRINT ("Id found")
                Advance past it
                PRINT (" Factor found ")
Otherwise:      error
        END { Factor }

Here, NextSymbol is the token produced by the lexical analyzer. To advance past it, a call to get the next token might be made. The PRINT statements will print out the reverse of a left derivation and then the parse tree can be drawn by hand or a procedure can be written to stack up the information and print it out as a real left-most derivation, indenting to make the levels of the "tree" clearer.

Recursive descent parsing has the same restrictions as the general LL(1) parsing method described earlier. Recursive descent parsing is the method of choice when the language is LL(1), when there is no tool available, and when a quick compiler is needed. Turbo Pascal is a recursive descent parser.

Send questions and comments to: Karen Lemone