vtf txb display syntaxtree
Basic objects for the design of an abstract syntax tree (AST). These are derived in highlightertree to create the highlighters.
class syntaxtree.SyntaxTree
Parser base object for designing an abstract syntax tree.
The branches of the syntax tree are defined by SyntaxBranch objects and are attached to the main root branch; the start/node, termination, and leaves of a branch (and of the root branch) are each created as a RegularExpression -- SyntaxLeaf-factory pair.
Leaves defined in globals (SyntaxGlobals) apply independently of the currently active branches.

    from re import compile

    #        _ _ _ _ _ _ _ _
    #       |               |
    #       B - l - l - g - l ... E
    #      /
    # R - l - g - l - l ... i

    ast = SyntaxTree()
    B = SyntaxBranch(node_pattern=compile("\\("), stop_pattern=compile("\\)"))
    B.adopt_self()
    B.add_leaf(compile("regex"), lambda parent, pattern, match, relstart: SyntaxLeaf(parent, match, relstart))
    ...
    ast.root.add_branch(B)
    ast.globals.add(compile("regex"), lambda parent, pattern, match, relstart: SyntaxLeaf(parent, match, relstart))

    from re import compile

    #        _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    #       |                                 |
    #       B1 - l - l - l - E                |
    #      /                                 /
    # R - l - l - l - l - l - l ... i       B3 - l - l - l - l - l - E
    #      \                               / |
    #       B2 - l - l - l ... E             |
    #        \   _ _ _ _ _ _ _ _ _ _ _ _ _ _ |

    ast = SyntaxTree()
    B1 = SyntaxBranch(node_pattern=compile('"'), stop_pattern=compile('"'))
    B2 = SyntaxBranch(...
    B3 = SyntaxBranch(...
    ast.root.add_branch(B1)
    ast.root.add_branch(B2)
    B2.add_branch(B3)
    B3.add_branch(B1)
    B3.add_branch(B2)
    B*.add_leaf(...
    ...

When using the special characters of regular expressions that refer to the beginning or end of a string, such as "^" or "\\Z", it must be noted that the row is sliced during parsing. The following illustration sketches the parsing process.

    from re import compile

    square_bracket_branch = SyntaxBranch(node_pattern=compile("\\[node]"), stop_pattern=compile("\\[end]"))
    curly_bracket_branch = SyntaxBranch(node_pattern=compile("\\{node}"), stop_pattern=compile("\\{end}"))
    square_bracket_branch.add_branch(curly_bracket_branch)

    #   node_leaf     node_leaf           end_leaf              end_leaf
    #    |    |  leafs  |    |    leafs    |   |      leafs      |   |
    #    |    |[ - - - -|    |{ - - - - - }|   |- - - - - - - - ]|   |
    #    |    |         |    |             |   |                 |   |
    "... [node] foo bar {node} foo bar ... {end} ... foo bar ... [end] ..."
    ...
    "[node]"                                                    # node found
    " foo bar {node} foo bar ... {end} ... foo bar ... [end]"   # search for a sub-node
    "{node}"                                                    # sub-node found
    " foo bar "                                                 # apply leaf configurations to the remaining string
    " foo bar ... {end} ... foo bar ... [end]"                  # search for the end of a branch
    "{end}"                                                     # end of a branch found
    " foo bar ... "                                             # apply leaf configurations to the remaining string
    ...

Applicable leaves, branch-node leaves, and ending leaves of a branch are appended to a passed list (as SyntaxLeaf objects) during parsing; the active branches are passed to the parsing process as a list, which is extended when another branch begins and truncated when a branch ends.
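The effect of slicing on anchors can be reproduced with the standard re module alone. The following is a minimal sketch, not the library's parsing code; the row, the pattern, and the offset relstart are made up for illustration.

```python
import re

row = "key = value"
at_start = re.compile(r"^value")

# On the complete row, "^" only matches at index 0, so nothing is found.
whole = at_start.search(row)

# Once "key = " has been consumed and the row is sliced, the slice is a
# new string, and "^" matches at its beginning.
relstart = 6  # assumed offset of the slice within the row
sliced = at_start.search(row[relstart:])

# For comparison: passing a start position instead of slicing keeps "^"
# bound to the real beginning of the string, so there is no match.
kept = at_start.search(row, relstart)

# The position in the full row is the relative start plus the match start.
total_start = relstart + sliced.start()
```

A pattern such as "^value" can therefore match wherever a slice happens to begin, which is worth keeping in mind when defining node, stop, or leaf patterns.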
A SyntaxLeaf contains the matched re.Match object, the originating SyntaxBranch, and the relative starting point in the row, since the row is sliced during the parse process. The properties with the total_* prefix return the actual position in the row.
In addition, if the leaf represents the beginning of a branch, the node attribute is not None but holds the beginning SyntaxBranch object.
The parse process is performed row by row using the methods with the map_* prefix; alternatively, branch_growing processes only as much as is needed to capture the branch bifurcations. The methods map_leafs and branch_growing are interfaces to the actual methods, which are implemented as recursions; these underlying methods cannot be overridden in subclasses.
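How the list of active branches carries state from one row to the next can be illustrated with a toy stand-in. This sketch assumes a single, self-nesting parenthesis branch and a hypothetical function grow; it only mimics the idea of a branch-only pass and is not the implementation of SyntaxTree.

```python
import re

# Hypothetical branch definition: label, node pattern, stop pattern.
LABEL, NODE, STOP = "paren", re.compile(r"\("), re.compile(r"\)")


def grow(row: str, branches: list[str]) -> list[str]:
    """Extend or truncate the list of active branches for one row."""
    pos = 0
    while pos < len(row):
        rest = row[pos:]  # the row is sliced as parsing advances
        node = NODE.search(rest)
        # A stop pattern is only relevant while a branch is active.
        stop = STOP.search(rest) if branches else None
        if node is None and stop is None:
            break
        if stop is None or (node is not None and node.start() < stop.start()):
            branches.append(LABEL)  # a branch begins: expand the list
            pos += node.end()
        else:
            branches.pop()          # a branch ends: truncate the list
            pos += stop.end()
    return branches


active: list[str] = []
depth_per_row = []
for row in ["a (b (c", "d) e", ")"]:
    grow(row, active)
    depth_per_row.append(len(active))
```

The same list object is handed to every call, which is the role the _branches_ parameter plays in the methods described below.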
@property globals() -> SyntaxGlobals
The SyntaxGlobals.
@property root() -> SyntaxBranch
The root SyntaxBranch.
branch_growing(string, has_end, _branches_) -> list[SyntaxBranch]
Apply only the SyntaxBranch configurations to a string (skipping the parsing of leaves) and expand or shorten the list of _branches_.
has_end specifies whether the string has a terminating end; it is processed in connection with the multiline parameterization of the SyntaxBranch.
The list of _branches_ represents the current sequence of active SyntaxBranch objects; if it is empty, the root is the current SyntaxBranch.
map_globals(string, _out_) -> list[SyntaxLeaf]
Apply the leaves defined in the globals to a string and append the parsed leaves to the _out_ list as SyntaxLeaf objects.
map_leafs(string, has_end, _branches_, _leaf_out_) -> tuple[list[SyntaxBranch], list[SyntaxLeaf]]
Apply the entire configuration of the SyntaxBranch objects and their SyntaxLeaf definitions to a string. Append the parsed leaves to the list _leaf_out_ and expand or shorten the list of active _branches_.
has_end specifies whether the string has a terminating end; it is processed in connection with the multiline parameterization of the SyntaxBranch.
The list of _branches_ represents the current sequence of active SyntaxBranch objects; if it is empty, the root is the current SyntaxBranch.
map_tree(string, has_end, _branches_, _leaf_out_) -> tuple[list[SyntaxBranch], list[SyntaxLeaf]]
Apply the entire configuration of the SyntaxBranch objects and their SyntaxLeaf definitions to a string. Append the parsed leaves to the list _leaf_out_ and expand or shorten the list of active _branches_.
Then apply the leaves defined in globals to the string and append the parsed leaves as SyntaxLeaf objects to the _leaf_out_ list.
has_end specifies whether the string has a terminating end; it is processed in connection with the multiline parameterization of the SyntaxBranch.
The list of _branches_ represents the current sequence of active SyntaxBranch objects; if it is empty, the root is the current SyntaxBranch.
purge_globals() -> SyntaxGlobals
Reinitialize the current SyntaxGlobals.
purge_root() -> SyntaxBranch
Reinitialize the current root SyntaxBranch.
set_globals(__new_globals) -> None
Set the SyntaxGlobals.
set_root(__new_root) -> None
Set the root SyntaxBranch.
class syntaxtree.SyntaxBranch
Syntax branch object used by the SyntaxTree.
The beginning of a branch is defined by the node_pattern, and its leaf is created by the node_leaf factory, which must return a SyntaxLeaf object with the node attribute set to the beginning branch.
If the beginning of a branch is recognized by the parser methods in the AST, a definable activate function is executed, which must return a branch object.
By default, the same object is returned and appended to the sequence of active branches if the stop_pattern is a pattern; if the stop_pattern is defined as an executable object, it receives the SyntaxBranch object and the node SyntaxLeaf on activation and must return a pattern that defines the end of the branch. In that case, if activate was None at creation, a "deep copy" (snap) of the branch object is created upon activation and appended to the sequence of active branches.
The terminating leaf object is then created by the stop_leaf factory when the pattern occurs.
The leaf factories receive the parent SyntaxBranch, the applicable pattern, the re.Match, and the relative start of the substring when the node_pattern or stop_pattern occurs; in addition, the beginning SyntaxBranch object is passed to the node_leaf factory.
The parameters multirow and multiline are evaluated in the parser methods of the AST. If multirow is set to True, the branch is NOT removed from the sequence of active branches after a single string has been processed, provided the string does not end the line. If multiline is set to True, the branch is kept in the sequence even across line endings.
Any object can be passed via the label parameter to identify the branch.
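One possible reading of the multirow and multiline rules can be written down as a small filter. This is a sketch under stated assumptions, not the library's logic: Branch and keep_after_string are hypothetical names, multiline is assumed to imply survival of a string without a line end as well, and dropping a branch is assumed to drop everything nested inside it.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """Hypothetical stand-in carrying only the two flags."""
    label: str
    multirow: bool
    multiline: bool


def keep_after_string(branches, has_end):
    """Return the active branches that remain after one string."""
    kept = []
    for branch in branches:
        if has_end:
            survives = branch.multiline  # the string ends the line
        else:
            survives = branch.multirow or branch.multiline
        if not survives:
            break  # assumption: nested branches are dropped as well
        kept.append(branch)
    return kept


numbers = Branch("numbers in tuple", multirow=True, multiline=True)
comment = Branch("comment", multirow=False, multiline=False)

mid_line = keep_after_string([numbers, comment], has_end=False)
end_of_line = keep_after_string([numbers, comment], has_end=True)
row_only = keep_after_string([Branch("row", True, False)], has_end=True)
```

Under these assumptions the comment branch never outlives the string it started in, while the tuple branch continues over the line break.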
The attributes __node_leaf__ and __parent_branch__ are only set by the AST when the activate METHOD is executed; the stop_pattern is then determined within the method, and finally the PARAMETERIZED activate FUNCTION is executed.
The leaves of the branch are created as pattern -- SyntaxLeaf-factory pairs, and further forks to branches within the branch are likewise defined as SyntaxBranch objects.

    from re import compile

    ast = SyntaxTree()

    B1 = SyntaxBranch(node_pattern=compile("\\("), stop_pattern=compile("\\)"), multiline=True, label="numbers in tuple")
    B1.add_leaf(compile("\\d+"))
    B1.add_leaf(compile(","), label="comma")

    B2 = SyntaxBranch(node_pattern=compile("#"), stop_pattern="$", multirow=False, label="comment")
    B2.add_leaf(compile(".+"), lambda parent, pattern, match, relstart: SyntaxLeaf(parent, match, relstart))

    B1.add_branch(B2)
    ast.root.add_branch(B1)

Via the methods with the adopt_* prefix, definitions of branches and/or leaves can be adopted from other branches.

    from re import compile

    #        _ _ _ _ _ _ _ _
    #       |               |
    #       B - l - l - l - l ... E
    #      /
    # R - l - l - l - l ... i

    ast = SyntaxTree()
    B = SyntaxBranch(...
    B.adopt_self()
    ...
    ast.root.add_branch(B)

    from re import compile

    #      _ _ _ _ _ _ _[adopt_branches(root) **] _ _ _ _ _ _ _ _ _ _ _
    #     |      _ _ [adopt_branches(B1) + adopt_leafs(B1)] _ _ _ _ _    |
    #     |    |                                                      |  |
    #     |   B1 - l - l - l - E      B3 - l - l - l - l - l - E      |  |
    #      \ /  \                    /                                |  |
    #       /     B2 - l - l - l ... E                                |  |
    #      /                                                          |  |
    #     R - l - l - l - l - l - l ... i  B5 - l - l - l - l - l - E |  |
    #     /\                              /                           |  |
    #    /  B4 - l - l - l ... E                                      |  |
    #   |_[**] _ _ /  |  \          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ |  |
    #                 |          _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ |

    ast = SyntaxTree()
    B1 = SyntaxBranch(...
    B2 = SyntaxBranch(...
    B3 = SyntaxBranch(...
    B4 = SyntaxBranch(...
    B5 = SyntaxBranch(...
    ast.root.add_branch(B1)
    B1.add_branch(B2)
    B2.add_branch(B3)
    ast.root.add_branch(B4)
    B4.add_branch(B5)
    B4.adopt_branches(ast.root)
    B4.adopt_branches(B1)
    B*.add_leaf(...
    B4.adopt_leafs(B1)
    ...

__node_leaf__: SyntaxLeaf
__parent_branch__: SyntaxBranch
branches: tuple[SyntaxBranch]
label: Any
leafs: tuple[tuple[Pattern | str, Callable[ [SyntaxBranch, Pattern | str, Match, int], SyntaxLeaf], Any], ...]
multiline: bool
multirow: bool
node_leaf: Callable[ [SyntaxBranch, Pattern | str, Match, int, SyntaxBranch], SyntaxLeaf]
node_pattern: Pattern | str
stop_leaf: Callable[ [SyntaxBranch, Pattern | str, Match, int], SyntaxLeaf]
stop_pattern: Pattern | str | None
activate(node_leaf, parent) -> SyntaxBranch
Set the __node_leaf__ and __parent_branch__ attributes, poll the stop_pattern, and return the activated version of the SyntaxBranch object.
Executed inside the parse methods of the SyntaxTree; receives the node SyntaxLeaf and the parent SyntaxBranch.
add_branch(branch) -> None
Add a fork to the branch.
add_leaf(pattern, leaf=lambda parent, pattern, match, relstart: SyntaxLeaf(parent, match, relstart), label=None) -> None
Add a leaf to the branch as a pattern -- SyntaxLeaf-factory pair.

    from re import compile

    branch.add_leaf("regex")
    branch.add_leaf(compile(","), label="comma")
    branch.add_leaf(compile(".+"), lambda parent, pattern, match, relstart: SyntaxLeaf(parent, match, relstart))

Any object can be used as a label for later identification.
adopt_branches(branch) -> None
Add the forks of another branch to the branch.
adopt_leafs(branch_or_globals) -> None
Add the leaves of another branch (or, as the parameter name indicates, of a globals object) to the branch.
adopt_self() -> None
Add the branch to itself as a fork, for recursion.
branch_mapping(string, relstart, _out_) -> list[SyntaxLeaf]
Apply the node definition of each forked branch to the string. Append matches as SyntaxLeaf objects to the list _out_.
relstart specifies the start position of a substring.
This method is executed inside the parsing methods of the SyntaxTree.
leaf_mapping(string, relstart, _out_) -> list[SyntaxLeaf]
Apply each leaf definition of the branch to the string. Append matches as SyntaxLeaf objects to the list _out_.
relstart specifies the start position of a substring.
This method is executed inside the parsing methods of the SyntaxTree.
poll_stop_pattern(node_leaf=None) -> None
Poll the stop_pattern.
Executed within the activate method; it only has an effect if the stop_pattern is defined as an executable object.
raises:
- AttributeError: node_leaf is not passed and __node_leaf__ is not yet set in the object.
remove_branches_by_attributes(deep=False, _or_=False, **attributes) -> None
Remove branch ramifications that have the given attributes [, descending into all branches and ramifications]. A ramification is removed when all attribute conditions are satisfied or, with _or_, when at least one attribute applies.
remove_branches_by_label(label, deep=False) -> None
Remove branch ramifications with label [, descending into all branches and ramifications].
remove_leafs_by_label(label, deep=False) -> None
Remove SyntaxLeaf definitions with label [, descending into all branches and ramifications].
remove_leafs_by_pattern(pattern, deep=False) -> None
Remove SyntaxLeaf definitions with pattern [, descending into all branches and ramifications].
snap() -> SyntaxBranch
Create a "deep copy" (snap) from the current attributes of the SyntaxBranch. (This preserves the state should there be, for example, dependencies on the stop_pattern.)
starts(string, relstart, parent) -> SyntaxLeaf | None
Return a SyntaxLeaf when the branch starts in the string.
Executed inside the parse methods of the SyntaxTree; receives the relative starting point of a substring and the parent SyntaxBranch.
stops(string, relstart) -> SyntaxLeaf | None
Return a SyntaxLeaf when the branch stops in the string.
Executed inside the parse methods of the SyntaxTree; receives the relative starting point of a substring.
class syntaxtree.SyntaxGlobals
A container for globally defined SyntaxLeaf rules.
The global leaves are created as RegularExpression -- SyntaxLeaf-factory pairs; additionally, any object can be used as a label to remove definitions afterwards.

    from re import compile
    from typing import Any

    ast = SyntaxTree()
    ast.globals.add(compile("bar"), label=Any)
    ast.globals.add(compile("foo"), label=object())
    ...
    ast.globals.remove_by_label(Any)

leafs: tuple[tuple[Pattern | str, Callable[ [SyntaxGlobals, Pattern | str, Match, int], SyntaxLeaf], Any], ...]
add(pattern, leaf=lambda parent, pattern, match, relstart: SyntaxLeaf(parent, match, relstart), label=None) -> None
Add a SyntaxLeaf rule. When the pattern matches, the leaf factory receives the SyntaxGlobals object, the pattern, the re.Match, and the relative start; it should return a SyntaxLeaf. Additionally, any object can be used as a label to remove definitions afterwards.
mapping(string, relstart, _out_) -> list[SyntaxLeaf]
Apply each definition of global leaves to the string. Append matches as SyntaxLeaf objects to the list _out_.
relstart specifies the start position of a substring.
This method is executed inside the parsing methods of the SyntaxTree.
remove_by_label(label) -> None
Remove all definitions with label.
remove_by_pattern(pattern) -> None
Remove all definitions with pattern.
class syntaxtree.SyntaxLeaf
The syntax leaf object is generated as the result of a parse by the SyntaxTree and is used for further processing.
The object contains the parent SyntaxBranch or SyntaxGlobals, the re.Match, the relative start (relstart) of a substring and, if the leaf represents the beginning of a SyntaxBranch, the beginning branch under node.
The priority of a leaf over others found in a string is realized with __lt__. The priority parameter can instead be given as an executable object; it receives this SyntaxLeaf and the other SyntaxLeaf and must return a boolean value indicating whether this SyntaxLeaf has a higher priority than the other. If True is passed to priority, the earliest leaf has the highest priority, and if there is a tie the match with the largest span has priority. If the priority parameter is False, the earliest leaf with the smallest span has priority.
Since the string is sliced during the parsing process, the start/end/span methods of the re.Match object (also available as properties of the SyntaxLeaf) may not return the actual values with reference to the passed string; the actual values, taking the relative starting point into account, can be obtained via the properties with the total_* prefix.
match: re.Match
node: SyntaxBranch | None
parent: SyntaxGlobals | SyntaxBranch
relstart: int
@property end() -> int
@property span() -> tuple[int, int]
@property start() -> int
@property total_end() -> int
@property total_span() -> tuple[int, int]
@property total_start() -> int
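The relationship between relstart and the total_* positions, and the ordering described for priority=True, can be sketched with a small stand-in class. Leaf is a hypothetical substitute for SyntaxLeaf that keeps only the match and relstart; the arithmetic (relative start plus match position) is an assumption derived from the description above, not taken from the library source.

```python
import re
from dataclasses import dataclass


@dataclass
class Leaf:
    """Hypothetical stand-in for SyntaxLeaf."""
    match: re.Match
    relstart: int

    @property
    def total_start(self) -> int:
        return self.relstart + self.match.start()

    @property
    def total_end(self) -> int:
        return self.relstart + self.match.end()

    @property
    def total_span(self) -> tuple:
        return (self.total_start, self.total_end)

    def __lt__(self, other) -> bool:
        # Ordering as described for priority=True: the earliest leaf
        # wins; on a tie the match with the largest span wins.
        if self.total_start != other.total_start:
            return self.total_start < other.total_start
        own = self.total_end - self.total_start
        theirs = other.total_end - other.total_start
        return own > theirs


row = "x = foobar"
relstart = 4            # assumed offset of the slice within the row
rest = row[relstart:]   # "foobar"

short = Leaf(re.compile("foo").search(rest), relstart)
long_ = Leaf(re.compile("foobar").search(rest), relstart)
later = Leaf(re.compile("bar").search(rest), relstart)

# With "less than" meaning "higher priority", min() picks the winner.
best = min([later, short, long_])
```

Here short.match.start() is 0 because the match refers to the slice, while short.total_span refers to the complete row.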
Date: 13 Dec 2022
Version: 0.1
Author: Adrian Hoefflin [srccircumflex]
Doc-Generator: "pyiStructure-RSTGenerator" <prototype>