Classes

Production
Grammar
AbstractSyntaxTree
LL1Parser
LRItem
LRAutomatonState
LRParser
LR0Parser
SLR1Parser
LR1Parser
LALR1Parser

Production

Represents a single production

Member	Type	Comment
`lhs`	`str`	LHS of the production
`rhs`	`list`	RHS of the production stored as a list

The class also implements functions for pretty printing and checking equality of 2 productions

File : grammar.py

Grammar

Represents the grammar and stores key information related to it

Member	Type	Comment
`prods`	`list`	List of all productions (each element is an instance of `Production` class). This list also contains the augmented production $S' \rightarrow S$$ at index 0
`terminals`	`list`	List of all terminal symbols in the grammar
`nonterminals`	`dict`	Dictionary with every nonterminal in the grammar as keys. For every nonterminal $X$, `first` = $FIRST(X)$, `follow` = $FOLLOW(X)$, `prods_lhs` = list of productions in which $X$ appears on LHS, `prods_rhs` = list of productions where $X$ appears on RHS
`build_first`	`function`	Function that builds $FIRST(X)$ for every nonterminal $X$ in grammar
`build_follow`	`function`	Function that builds $FOLLOW(X)$ for every nonterminal $X$ in the grammar

File : grammar.py

AbstractSyntaxTree

A simple $n$-ary class implementation with extra fields for parsing and visualization

Member	Type	Comment
`root`	`str`	Root of the tree
`desc`	`list`	List of descendant trees
`prod_id`	`int`	What production ID does this AST correspond to ?
`node_id`	`int`	Graphviz node ID corresponding to root

File : abstractsyntaxtree.py

LL1Parser

Represents LL(1) parser. Note that currently, the parser can detect only the obvious left recursion (such as $E \rightarrow E + T$) but not implied (such as $A \rightarrow B C$, $B \rightarrow A C$) so please make sure that the grammar used for construction of LL1Parser is valid LL(1)

Member	Type	Comment
`grammar`	`Grammar`	Instance of grammar for which the parser is to be built
`parsing_table`	`pandas.DataFrame`	Parsing table for the parser. Each cell in this dataframe will contain either a special value (`YACV_ERROR`/`YACV_ACCEPT`) or list of actions
`is_ll1`	`bool`	Boolean which tells whether the grammar is a valid LL(1) grammar or not. This is checked after building the parsing table by looking for cells which have more than one actions in them
`build_parsing_table`	`function`	Function that builds LL(1) parsing table using $FIRST$ and $FOLLOW$ sets. After the parsing table is built, it will also set/unset the `is_ll1` accordingly
`parse`	`function`	Takes in a string (list of tokens) and attempts to parse it using the LL(1) parsing table. The function will raise appropriate errors if it fails to parse the string. On successful parsing, the resultant tree will be returned as an `AbstractSyntaxTree` instance
`visualize_syntaxtree`	`function`	Takes in a string (list of tokens) and attempts to visualize the syntax tree generated after parsing. This function relies on `parse` function to parse the string first. If the parsing is successful, the function will convert the generated `AbstractSyntaxTree` into a Graphviz graph and return it

File : ll1.py

LRItem

Represents a single LR item

Member	Type	Comment
`production`	`Production`	Production corresponding to this item
`dot_pos`	`int`	Position of the dot wrt production. `dot_pos=0` corresponds to $A \rightarrow • B$
`reduce`	`bool`	Is this item a reduce item ?
`update_reduce`	`function`	Update `reduce` based on `dot_pos` and `production`

File : lr.py

LRAutomatonState

Represents a single state in LR automaton

Member	Type	Comment
`items`	`list`	List of `LRItem`s
`preferred_action`	`str`	Preferred action (`'s'` or `'r'`) in case of conflict. Default, `'s'` = SHIFT
`shift_items`	`list`	List of items for which next action will be SHIFT
`reduce_items`	`list`	List of items for which next action will be REDUCE
`accept`	`bool`	Is the current state an accepting state ? Default = `False`
`conflict`	`bool`	Does this state have a conflict ?
`sr`	`bool`	Does this state have a SHIFT-REDUCE conflict ?
`rr`	`bool`	Does this state have a REDUCE-REDUCE conflict ?
`update_shift_reduce_items`	`function`	Bifurcate `items` into `shift_items` and `reduce_items`
`update_conflicts`	`function`	Update `conflict`, `sr` and `rr` based on `shift_items` and `reduce_items`. Do note that this function is not a litmus test for whether the grammar is valid LR grammar or not. Different parsers will choose to resolve conflicts differently resulting in slightly different parsing tables. Always refer to parsing table for determining whether the given grammar is valid or not

File : lr.py

LRParser

Generic LR parsing class. This is high level implementation of parsing and visualization algorithms. All LR parsing algorithms will inherit this class. Do note that, LR(1) parsers may take significantly longer time for large/ambiguious grammars. It is recommended to use yacv for smaller grammars only.

Member	Type	Comment
`grammar`	`Grammar`	Grammar for the LR parser
`is_valid`	`bool`	Is the given grammar valid under chosen LR parsing algorithm ?
`automaton_states`	`list`	List of `LRAutomatonState`s which are part of LR automaton of this parser
`automaton_transitions`	`dict`	Dictionary describing state transitions for LR automaton
`automaton_built`	`bool`	Is the LR automaton ready for this parser ? Default = `False`
`parsing_table`	`pandas.DataFrame`	LR parsing table. Each cell in this dataframe will contain either a special value (`YACV_ERROR`/`YACV_ACCEPT`) or list of actions
`closure`	`function`	Takes in a single `LRItem` or list of `LRItem`s and returns their closure as list of `LRItem`s
`build_automaton_from_init`	`function`	Takes in the inital `LRAutomatonState` and builds the LR automaton from it. After this function is complete `automaton_states` and `automaton_transitions` will be populated properly
`parse`	`function`	Takes in a string (list of tokens) and attemps to parse it using the LR parsing table. The function will raise appropriate errors if it fails to parse the string. Do note that, because of the nature of LR parsing, these error messages may not be very intuitive. On successful parsing, an `AbstractSyntaxTree` corresponding to the parsed string will be returned
`visualize_syntaxtree`	`function`	Takes in a string (list of tokens) and attempts to visualize the syntax tree generated after parsing. If the parsing is successful the function will convert the generated `AbstractSyntaxTree` into a Graphviz graph and return it
`visualize_automaton`	`function`	Returns a Graphviz graph corresponding to the LR automaton for the parser

File : lr.py

LR0Parser

Inherits LRParser

LR(0) parser uses canonical set of LR(0) items. To create this set, build_automaton_from_init(x) is called where $x = closure([S' \rightarrow • S])$ where $S$ is the starting symbol of the grammar. From this set of items, the LR(0) parsing table is created by build_parsing_table()

File : lr.py

SLR1Parser

Inherits LR0Parser

SLR(1) parser uses canonical set of LR(0) items as well. The only difference between SLR(1) parser and LR(0) parser is the method for building the parsing table

File : lr.py

LR1Parser

Inherits LRParser

LR(1) parser or canonical LR(1) parser uses canonical set of LR(1) items. To create this set, build_automaton_from_init(x) is called where $x = closure([S' \rightarrow • S, \$])$ where $S$ is the starting symbol and lookahead $ is represented after “comma”. From this set of items, the LR(1) parsing table is created by build_parsing_table()

File : lr.py

LALR1Parser

Inherits LR1Parser

LALR(1) parser is implemented by first taking canonical set of LR(1) items and then compacting it by merging together states with common kernel.

File : lr.py

Classes

Table of contents

Production

Grammar

AbstractSyntaxTree

LL1Parser

LRItem

LRAutomatonState

LRParser

LR0Parser

SLR1Parser

LR1Parser

LALR1Parser