At any stage of the parse, we will have the following configuration:
Stack Input s0x1s1...xmsm aiai+1...$
where the s's are states, the x's are sequences of terminals or nonterminals, and the a's are input symbols. This is somewhat like a finite-state machine where the state on top (the right here) of the stack contains the "accumulation of information" about the parse until this point. We just have to look at the top of the stack and the symbol coming in to know what to do.
We can construct such a finite-state machine from the productions in the grammar where each state is a set of Items.
We create the table using the grammar for expressions above. The reader is asked in Exercise 2 to extend this to recognize a sequence of assignment statements.
States
The LR table is created by considering different "states" of a parse. Each state consists of a description of similar parse states. These similar parse states are denoted by marked productions called items.
An item is a production with a position marker, e.g.,
EE · + T
which indicates the state of the parse where we have seen a string derivable from E and are looking for a string derivable from + T.
Items are grouped and each group becomes a state which represents a condition of the parse. We will state the algorithm and then show how it can be applied to the grammar of Example 3.
Constructing States via Item groups
(1) IF S is the Start symbol, put S' · S into a Start State called State 0.
(2) Closure: IF A x · X
is in the state, THEN add X
·
to
the state for every product X
·
in the grammar.
(3) Look for an item of form A x · z
where z is a single terminal
or nonterminal and build a new state from A --> xz ·
( Include in the
new states all items with · z in the original state.)
(4) Continue until no new states can be created. (A state is new if it is not identical to an old state.)
Step 1
State 0
E' · E
Step 2
E · E fits the model A
x · X
, with x,
=
, and x = E.
E
T and E
E + T are productions whose left-hand sides are E;
thus E
· T and E
· E + T are added to State 0.
State 0
E' · E
E · E + T
E · T
Reapplying Step 2 to E · T adds
T · T * F
T · F
and reapplying Step 2 To T · F adds
F · (E)
F · Id
State 0 is thus:
State 0
E" · E
E · E + T
E · T
T · T * F
F · (E)
F · Id
Using Step 3, there are two items in State 0 with an E after the dot. E'
· E fits the model A
x · z
with x and
=
, z = E. Thus, we
build a new state, putting E'
E · into it. Since E
· E + T
also has E after ·, we add E
E · + T. Step 2 doesn't apply, and we
are finished with State 1.
E' E ·
E E · + T
Continuing, the following states are created.
State 2 State 3 State 4 State 5 State 6 EThese are called LR(0) items because no lookahead was considered when creating them.T · T
F · F
(· E) F
Id · E
E + · T T
T · * F E
· E + T T
· T * F E
· T T
· F T
· T * F F
· (E) T
· F F
· Id F
· (E) F
· Id State 7 State 8 State 9 State 10 State 11 T
T * · F F
( E · ) E
E + T · T
T * F · F
(E) · F
· (E) E
E · + T T
T · * F F
· Id
We could use these to build an LR(0) parsing table, but for this example, there will be multiply defined entries since the grammar is not LR(0) see (Exercise 10) These can also be considered SLR items and we will use them to build an SLR(1) table, using one symbol lookahead.
Construction of an SLR(1) Parsing Table
(2) IF A
· is in state n, THEN enter Ri at Table [n,a] WHERE i is
the production i : A
and a is in FOLLOW (A).
(3) IF S' S · is in State n, THEN enter "Accept" at Table [n, $].
(4) IF A x · B
is in State m, AND A
x B ·
is in State n,
THEN enter n at Table [m, B].
Eso Table[1,+] =S6E · + T is in State 1 E
E + · T is in State 6
Step 2
In State 3 we have T F ·
The terminals that can follow T in any sentential form are +, *, ), and $ (see Chapter 4)
So Table[3,+] = Table[3,*] = Table[3,)] = Table[3,$] = R4 where 4 is the
number of production T F.
Step 4
E · E + T is in State 0, while E
E · + T is in State 1, so
Table[0,E]=1