Skip to content
This repository was archived by the owner on Aug 17, 2023. It is now read-only.

Proper element and filter parsing #40

Merged
merged 15 commits into from
Dec 8, 2016
Merged

Conversation

rowanseymour
Copy link
Member

@rowanseymour rowanseymour commented Dec 7, 2016

Currently the node parser works on a line by line basis which leads to following issues:

  • To determine if an element node spans multiple lines, it counts the number of { and } characters. You can break things by putting a brace into an attribute value. People don't often use { characters not in pairs, but now that we also support (...) style attribute dictionaries, chances of things breaking are a bit higher.
  • Filters are parsed as a single line node, with each line of nested content being parsed like any other node. This leads to inefficiency as something like :plain\n %abc, causes the %abc line to be parsed as an element even tho it's just a line of plain text

This PR reworks the parsing of nodes so that:

  • Multiline elements are parsed by a proper element parser which reads lines until it has a complete element
  • Filter nodes are parsed with all nested lines as their text content, rather than as child nodes.
  • Filters become simple functions in their own module, and leverage textwrap.dedent for removing indentation.

Performance-wise new code seems to be about ~40% faster.

Fixes #38, #39 and #41


if not root.parent_of(HamlNode(line, self)).inside_filter_node():
if line.count('{') - line.count('}') == 1:
start_multiline = line_number # for exception handling
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking line numbers would be useful but this line doesn't actually do anything

node_lines = line

if not root.parent_of(HamlNode(line, self)).inside_filter_node():
if line.count('{') - line.count('}') == 1:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This where things break if you have {} characters in attribute values

@@ -144,11 +130,17 @@ def read_attribute_dict(stream):
"""
data = OrderedDict()

start, terminator = stream.text[0], stream.text[-1]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this code assumed the stream only contained an attribute dictionary - now it contains the entire haml document

@coveralls
Copy link

coveralls commented Dec 7, 2016

Coverage Status

Coverage decreased (-0.1%) to 99.885% when pulling aabb4b0 on multiline_elements into 39c38ac on master.



def stylus(content, indent, options):
return indent + '<style type=%(attr_wrapper)stext/stylus%(attr_wrapper)s>\n' \
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some inconsistencies with now indentation is handled by different filters. I've tried to preserve some of those inconsistencies for now so that the template tests work as is


start.add_node(one)
start.add_node(two)
start.add_node(three)

self.assertEqual(3, len(start.children))

def test_node_parent_function(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parent_of was being used to figure out of a node belonged to filter node that no longer happens and this is no longer needed

@coveralls
Copy link

coveralls commented Dec 7, 2016

Coverage Status

Coverage remained the same at 100.0% when pulling 928dc83 on multiline_elements into 39c38ac on master.

@coveralls
Copy link

coveralls commented Dec 7, 2016

Coverage Status

Coverage remained the same at 100.0% when pulling 52a35fd on multiline_elements into 39c38ac on master.

@@ -24,3 +24,4 @@
3
4
5

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of a few cases where we don't match the original output exactly because of an additional new line. IMO this new line is correct (it's coming from the python print) and shouldn't break anyone's stuff.

…or empty lines in attribute alues which are Haml
@@ -97,6 +93,7 @@ def test_parse(self):
'class':
- if forloop.first
link-first

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This blank line tests #41

@coveralls
Copy link

coveralls commented Dec 7, 2016

Coverage Status

Coverage decreased (-0.1%) to 99.885% when pulling bc7ced0 on multiline_elements into 39c38ac on master.

@coveralls
Copy link

coveralls commented Dec 7, 2016

Coverage Status

Coverage remained the same at 100.0% when pulling 5e00da1 on multiline_elements into 39c38ac on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 5e00da1 on multiline_elements into 39c38ac on master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 5e00da1 on multiline_elements into 39c38ac on master.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants