In this lecture we study two more parsing algorithms.
Predictive parsing is a top-down parsing approach in which the stack contains the symbols to parse. Actions consist in either popping a terminal off the stack when that coincides with the next input token, or in predicting a production for the non-terminal that is on top of the stack and replacing that non-terminal with the symbols on the right-hand side of the predicted production. We discuss how to derive an LL(1) automaton from a grammar based on the FIRST and FOLLOW set for that grammar.
Generalized LR parsing is a generalization of the LR parsing algorithm that we studied last week. The LR parsing algorithm is deterministic. At each point there is a single shift or reduce action that can take place. That means, that an LR parser cannot handle shift/reduce conflicts in the parse table. A Generalized-LR parser handles such conflicts by forking the stack and continuing to parse the input with two parsers in parallel. This may lead to multiple parsers to be active simultaneously. When two parsers end up in the same state again, their stacks are merged.
There are many more parsing algorithms than we can study in this course.
In the references a couple of pointers to other sources about parsing algorithms.
These include ALL(*)
, the algorithm that is at the basis of the popular ANTLR parser generator,
parsing expression grammars (PEGs) another popular parsing approach,
and parsing combinators in Haskell and Scala.
The PDF of the slides of the lecture with builds.