Compilers: Module 4: Top-Down Parsing

Problems #1,2,3,6
 1.  For the expression grammar
 
      E -- > E + T | T 
      T -- > T * F | F 
      F -- > (E) | Id 
 
      find (a) FIRST (T), FIRST (T * Id) and (b) FOLLOW (E).
 
   ***** SOLUTION *****:
   (a) FIRST (T) is the set of all leftmost terminals derivable
           from the productions of T.
           Thus FIRST (T) = { Id, ( }
       FIRST (T*Id) = ( Id, ( }
   (b) FOLLOW (E) is the set of all leftmost terminals that can
           follow E.
           Thus FOLLOW (E) = { +, ) }
 
 
 2.  Show that the following grammar is not LL(1)
 
      A -- > d A 
      A -- > d B 
      A -- > f 
      B -- > g
 
   ***** SOLUTION *****:
   An LL(1) grammar must satisfy two conditions.  The first condition
           is: if a variable has more than one possible production,
           none of the derivations from those productions must have the
           same members in the 'FIRST' set. Otherwise (since this is 
           a single lookahead parse) the parser would not know which 
           production to use.
   Thus, since A -- > d A | d B, and
           FIRST (d A) = {d}
           FIRST (d B) = {d}
           this grammar is not LL(1), since there's common member 'd'.
 
 
 3.  Show that the following grammar is not LL(1).
 
      S -- > X d 
      X -- > C 
      X -- > B a 
      C -- >  
      B -- > d
 
   ***** SOLUTION *****:
   From the solution to exercise 2, I've stated the first of two
           conditions that must be satisfied for a grammar to be 
           LL(1).  The second condition is: for all variables, V, 
           with multiple productions, V -- > A, V --> B, given one
           of the productions of V goes to  the FIRST 
           set of the derivation other production must not have 
           any common members with the FOLLOW set of V.
   Thus, since X -- > C | B a  and  C --> 
           the other derivation is X -- > B a, so
           FIRST (B a) = {d}
           FOLLOW (X)  = {d}
           this grammar is not LL(1), since there's a common member 'd'
 
 
 6.  Given the grammar:
 
      S -- > XX 
      X -- >xX 
      X -- > y
 
      (a) Show that it is LL(1). 
      (b) Create a parsing table. 
      (c) Parse the string xyyx .
      (d) Draw the parse tree.
 
   ***** SOLUTION *****
   (a) To show that the grammar is LL(1), for all variables, V, 
           with multiple productions, V -- > A, V --> B,
           (1) the set FIRST (A) must not have any common members 
               with FIRST (B)
           (2) if one of the productions of V goes to 
               the FIRST set of the derivation other production
               must not have any common members with the FOLLOW
               set of V.
 
           So for the above grammar
              Condition 1:
               Since X -- > xX | y
                 FIRST (xX) = {x}
                 FIRST (y)  = {y}  (condition satisfied)
              Condition 2:
               Since there are no epsilon productions in this 
               grammar, this condition is also satisfied.
 
           This grammar *is* LL(1).
 
   (b)    |     x     |     y     |
       ===*===========*===========|
        S |  S -- > XX |  S --> XX |
       ---+-----------+-----------|
        X |  X -- > xX |  X --> y  |
 
   (c) Start with S in the stack and the input string in
       the input.  We then look at the leftmost symbol in the 
       stack and compare it with the leftmost symbol in the 
       input.  If both are the same terminal symbol we remove
       it from the stack and the input.  If the stack contains
       a variable (on the leftmost side), we use the table above
       to figure out what production is needed, and derive
       the variable in the stack.
 
       This process repeats until there is nothing left on
       both the stack and the input -- this is the accepting 
       state.  A syntax error occurs when 
            (1) leftmost terminal symbols do not match,
 		 (e.g. Stack: xX$, Input: yX$)
            (2) the stack is empty and the input string still
                has input, or vice versa,
                  (e.g. Stack: $, Input: xxy$,
                     or Stack: XX$, Input: $)
            (3) there are no productions in the table for a
                pair of leftmost symbols.
                   (e.g. Stack X$, Input: z$ (see table))
 
        Stack    Input    Production
        =====    =====    ==========
           S$     xyy$     S -- > XX
          XX$     xyy$     X -- > xX
         xXX$     xyy$     
          XX$      yy$     X -- > y
          yX$      yy$    
           X$       y$     X -- > y
           y$       y$    
            $        $    
       
       This input string is accepted...
 
   (d)  S
        |
        |----------\
        |          |
        X          X
        |          |
        |----\     |
        |    |     |
        x    X     y
             |
             |
             |
             y
Back To Jimmy's Page