Compilers: Module 4: Top-Down Parsing
Problems #1,2,3,6
1. For the expression grammar E -- > E + T | T T -- > T * F | F F -- > (E) | Id find (a) FIRST (T), FIRST (T * Id) and (b) FOLLOW (E). ***** SOLUTION *****: (a) FIRST (T) is the set of all leftmost terminals derivable from the productions of T. Thus FIRST (T) = { Id, ( } FIRST (T*Id) = ( Id, ( } (b) FOLLOW (E) is the set of all leftmost terminals that can follow E. Thus FOLLOW (E) = { +, ) } 2. Show that the following grammar is not LL(1) A -- > d A A -- > d B A -- > f B -- > g ***** SOLUTION *****: An LL(1) grammar must satisfy two conditions. The first condition is: if a variable has more than one possible production, none of the derivations from those productions must have the same members in the 'FIRST' set. Otherwise (since this is a single lookahead parse) the parser would not know which production to use. Thus, since A -- > d A | d B, and FIRST (d A) = {d} FIRST (d B) = {d} this grammar is not LL(1), since there's common member 'd'. 3. Show that the following grammar is not LL(1). S -- > X d X -- > C X -- > B a C -- >B -- > d ***** SOLUTION *****: From the solution to exercise 2, I've stated the first of two conditions that must be satisfied for a grammar to be LL(1). The second condition is: for all variables, V, with multiple productions, V -- > A, V --> B, given one of the productions of V goes to the FIRST set of the derivation other production must not have any common members with the FOLLOW set of V. Thus, since X -- > C | B a and C --> the other derivation is X -- > B a, so FIRST (B a) = {d} FOLLOW (X) = {d} this grammar is not LL(1), since there's a common member 'd' 6. Given the grammar: S -- > XX X -- >xX X -- > y (a) Show that it is LL(1). (b) Create a parsing table. (c) Parse the string xyyx . (d) Draw the parse tree. ***** SOLUTION ***** (a) To show that the grammar is LL(1), for all variables, V, with multiple productions, V -- > A, V --> B, (1) the set FIRST (A) must not have any common members with FIRST (B) (2) if one of the productions of V goes to the FIRST set of the derivation other production must not have any common members with the FOLLOW set of V. So for the above grammar Condition 1: Since X -- > xX | y FIRST (xX) = {x} FIRST (y) = {y} (condition satisfied) Condition 2: Since there are no epsilon productions in this grammar, this condition is also satisfied. This grammar *is* LL(1). (b) | x | y | ===*===========*===========| S | S -- > XX | S --> XX | ---+-----------+-----------| X | X -- > xX | X --> y | (c) Start with S in the stack and the input string in the input. We then look at the leftmost symbol in the stack and compare it with the leftmost symbol in the input. If both are the same terminal symbol we remove it from the stack and the input. If the stack contains a variable (on the leftmost side), we use the table above to figure out what production is needed, and derive the variable in the stack. This process repeats until there is nothing left on both the stack and the input -- this is the accepting state. A syntax error occurs when (1) leftmost terminal symbols do not match, (e.g. Stack: xX$, Input: yX$) (2) the stack is empty and the input string still has input, or vice versa, (e.g. Stack: $, Input: xxy$, or Stack: XX$, Input: $) (3) there are no productions in the table for a pair of leftmost symbols. (e.g. Stack X$, Input: z$ (see table)) Stack Input Production ===== ===== ========== S$ xyy$ S -- > XX XX$ xyy$ X -- > xX xXX$ xyy$ XX$ yy$ X -- > y yX$ yy$ X$ y$ X -- > y y$ y$ $ $ This input string is accepted... (d) S | |----------\ | | X X | | |----\ | | | | x X y | | | y