Link Search Menu Expand Document

Classes

Table of contents

  1. Production
  2. Grammar
  3. AbstractSyntaxTree
  4. LL1Parser
  5. LRItem
  6. LRAutomatonState
  7. LRParser
  8. LR0Parser
  9. SLR1Parser
  10. LR1Parser
  11. LALR1Parser

Production

Represents a single production

Member Type Comment
lhs str LHS of the production
rhs list RHS of the production stored as a list

The class also implements functions for pretty printing and checking equality of 2 productions

File : grammar.py

Grammar

Represents the grammar and stores key information related to it

Member Type Comment
prods list List of all productions (each element is an instance of Production class). This list also contains the augmented production \(S' \rightarrow S\)$ at index 0
terminals list List of all terminal symbols in the grammar
nonterminals dict Dictionary with every nonterminal in the grammar as keys. For every nonterminal \(X\), first = \(FIRST(X)\), follow = \(FOLLOW(X)\), prods_lhs = list of productions in which \(X\) appears on LHS, prods_rhs = list of productions where \(X\) appears on RHS
build_first function Function that builds \(FIRST(X)\) for every nonterminal \(X\) in grammar
build_follow function Function that builds \(FOLLOW(X)\) for every nonterminal \(X\) in the grammar

File : grammar.py

AbstractSyntaxTree

A simple \(n\)-ary class implementation with extra fields for parsing and visualization

Member Type Comment
root str Root of the tree
desc list List of descendant trees
prod_id int What production ID does this AST correspond to ?
node_id int Graphviz node ID corresponding to root

File : abstractsyntaxtree.py

LL1Parser

Represents LL(1) parser. Note that currently, the parser can detect only the obvious left recursion (such as \(E \rightarrow E + T\)) but not implied (such as \(A \rightarrow B C\), \(B \rightarrow A C\)) so please make sure that the grammar used for construction of LL1Parser is valid LL(1)

Member Type Comment
grammar Grammar Instance of grammar for which the parser is to be built
parsing_table pandas.DataFrame Parsing table for the parser. Each cell in this dataframe will contain either a special value (YACV_ERROR/YACV_ACCEPT) or list of actions
is_ll1 bool Boolean which tells whether the grammar is a valid LL(1) grammar or not. This is checked after building the parsing table by looking for cells which have more than one actions in them
build_parsing_table function Function that builds LL(1) parsing table using \(FIRST\) and \(FOLLOW\) sets. After the parsing table is built, it will also set/unset the is_ll1 accordingly
parse function Takes in a string (list of tokens) and attempts to parse it using the LL(1) parsing table. The function will raise appropriate errors if it fails to parse the string. On successful parsing, the resultant tree will be returned as an AbstractSyntaxTree instance
visualize_syntaxtree function Takes in a string (list of tokens) and attempts to visualize the syntax tree generated after parsing. This function relies on parse function to parse the string first. If the parsing is successful, the function will convert the generated AbstractSyntaxTree into a Graphviz graph and return it

File : ll1.py

LRItem

Represents a single LR item

Member Type Comment
production Production Production corresponding to this item
dot_pos int Position of the dot wrt production. dot_pos=0 corresponds to \(A \rightarrow • B\)
reduce bool Is this item a reduce item ?
update_reduce function Update reduce based on dot_pos and production

File : lr.py

LRAutomatonState

Represents a single state in LR automaton

Member Type Comment
items list List of LRItems
preferred_action str Preferred action ('s' or 'r') in case of conflict. Default, 's' = SHIFT
shift_items list List of items for which next action will be SHIFT
reduce_items list List of items for which next action will be REDUCE
accept bool Is the current state an accepting state ? Default = False
conflict bool Does this state have a conflict ?
sr bool Does this state have a SHIFT-REDUCE conflict ?
rr bool Does this state have a REDUCE-REDUCE conflict ?
update_shift_reduce_items function Bifurcate items into shift_items and reduce_items
update_conflicts function Update conflict, sr and rr based on shift_items and reduce_items. Do note that this function is not a litmus test for whether the grammar is valid LR grammar or not. Different parsers will choose to resolve conflicts differently resulting in slightly different parsing tables. Always refer to parsing table for determining whether the given grammar is valid or not

File : lr.py

LRParser

Generic LR parsing class. This is high level implementation of parsing and visualization algorithms. All LR parsing algorithms will inherit this class. Do note that, LR(1) parsers may take significantly longer time for large/ambiguious grammars. It is recommended to use yacv for smaller grammars only.

Member Type Comment
grammar Grammar Grammar for the LR parser
is_valid bool Is the given grammar valid under chosen LR parsing algorithm ?
automaton_states list List of LRAutomatonStates which are part of LR automaton of this parser
automaton_transitions dict Dictionary describing state transitions for LR automaton
automaton_built bool Is the LR automaton ready for this parser ? Default = False
parsing_table pandas.DataFrame LR parsing table. Each cell in this dataframe will contain either a special value (YACV_ERROR/YACV_ACCEPT) or list of actions
closure function Takes in a single LRItem or list of LRItems and returns their closure as list of LRItems
build_automaton_from_init function Takes in the inital LRAutomatonState and builds the LR automaton from it. After this function is complete automaton_states and automaton_transitions will be populated properly
parse function Takes in a string (list of tokens) and attemps to parse it using the LR parsing table. The function will raise appropriate errors if it fails to parse the string. Do note that, because of the nature of LR parsing, these error messages may not be very intuitive. On successful parsing, an AbstractSyntaxTree corresponding to the parsed string will be returned
visualize_syntaxtree function Takes in a string (list of tokens) and attempts to visualize the syntax tree generated after parsing. If the parsing is successful the function will convert the generated AbstractSyntaxTree into a Graphviz graph and return it
visualize_automaton function Returns a Graphviz graph corresponding to the LR automaton for the parser

File : lr.py

LR0Parser

Inherits LRParser

LR(0) parser uses canonical set of LR(0) items. To create this set, build_automaton_from_init(x) is called where \(x = closure([S' \rightarrow • S])\) where \(S\) is the starting symbol of the grammar. From this set of items, the LR(0) parsing table is created by build_parsing_table()

File : lr.py

SLR1Parser

Inherits LR0Parser

SLR(1) parser uses canonical set of LR(0) items as well. The only difference between SLR(1) parser and LR(0) parser is the method for building the parsing table

File : lr.py

LR1Parser

Inherits LRParser

LR(1) parser or canonical LR(1) parser uses canonical set of LR(1) items. To create this set, build_automaton_from_init(x) is called where \(x = closure([S' \rightarrow • S, \$])\) where \(S\) is the starting symbol and lookahead $ is represented after “comma”. From this set of items, the LR(1) parsing table is created by build_parsing_table()

File : lr.py

LALR1Parser

Inherits LR1Parser

LALR(1) parser is implemented by first taking canonical set of LR(1) items and then compacting it by merging together states with common kernel.

File : lr.py


Copyright © 2021 Ashutosh Sathe. Distributed by MIT license.