Parsing Markdown Lists #243

Closed
techpines opened this issue Jun 5, 2017 · 3 comments

techpines commented Jun 5, 2017

I'm trying to parse Markdown lists, where each list item starts with *, like so:

This is my test case:
* List Item 1
* List Item 2
* List Item 3
Match lists and random text

Here's my grammar, where I'd like to have those list lines grouped together unambiguously into one object:


@{%
const appendItem = (a, b) => { return (d) => { return d[a].concat([d[b]]); } };
const emptyStr = (d) => { return ""; };
%}

page     -> block
          | page block         {% appendItem(0, 1) %}

block    -> list
          | line

list     -> listItem      
          | list listItem      {% appendItem(0, 1) %}

listItem -> "\n*" string
      
line     -> "\n" string

string   -> null               {% emptyStr %}
          | string [^\n*]      {% appendItem(0, 1) %}

My problem is that this grammar is ambiguous: the list lines (* List Item) can be grouped in several different ways. I want all consecutive list items grouped into a single list, but I'm not sure how to force this and make my grammar unambiguous. Or maybe I'm just solving this problem completely wrong ;)
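To make the ambiguity concrete, here is a minimal sketch in plain JavaScript (no nearley involved) that enumerates the ways a run of consecutive list items can be split into adjacent lists. The function name `groupings` is made up for illustration, and the claim that each split corresponds to one parse of the grammar above is an assumption based on how `block` and `list` are defined.

```javascript
// Enumerate every way to split a run of consecutive list items into one or
// more adjacent lists. Assumption: each split matches one possible parse of
// the ambiguous grammar, since `page` may contain several `list` blocks in a row.
function groupings(items) {
  if (items.length === 0) return [[]];
  const results = [];
  // Choose the length of the first list, then split the remainder recursively.
  for (let n = 1; n <= items.length; n++) {
    const first = items.slice(0, n);
    for (const rest of groupings(items.slice(n))) {
      results.push([first, ...rest]);
    }
  }
  return results;
}

const parses = groupings(["List Item 1", "List Item 2", "List Item 3"]);
for (const p of parses) console.log(JSON.stringify(p));
console.log(parses.length); // 4
```

Only the last grouping, with all three items in one list, is the one wanted here; the other three are what the grammar needs to rule out.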

Slight side note, I assume the example text starts with a "\n" to make the grammar easier to parse.

Any help would be greatly appreciated, thanks!

@deltaidea
Contributor

deltaidea commented Jun 5, 2017

Duplicate of #89. The problem is the same: prefer [a, a] to [a], [a]. Possible solutions are also the same.

  1. Reject [a], [a] later in the post-processor of the parent (page in this case):
@{%
const hasConsecutive = (arr, predicate) => arr.slice(0, -1).some((el, i) => predicate(el) && predicate(arr[i + 1]))
%}

page     -> block:+       {% ([blocks], _, reject) => hasConsecutive(blocks, b => b.type === "List") ? reject : blocks %}
block    -> list          {% id %}
          | line          {% id %}
list     -> listItem:+    {% ([items]) => ({ type: "List", items }) %}
listItem -> "\n*" string  {% ([, str]) => str %}
line     -> "\n" string   {% ([, str]) => ({ type: "Line", value: str }) %}
string   -> [^\n*]:*      {% ([chars]) => chars.join("") %}
  2. Define the grammar in a way that doesn't allow two consecutive lists:
page     -> (list line | line):* list:? {% ([pairs, last]) => [].concat(...pairs, last || []) %}
list     -> listItem:+                  {% ([items]) => ({ type: "List", items }) %}
listItem -> "\n*" string                {% ([, str]) => str %}
line     -> "\n" string                 {% ([, str]) => ({ type: "Line", value: str }) %}
string   -> [^\n*]:*                    {% ([chars]) => chars.join("") %}

In this example, each list must either be followed by a line or be the last block on the page, so it can never be followed by another list.
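Both postprocessors can be tried outside nearley on hand-built data. The sketch below copies `hasConsecutive` from the first solution and the `page` postprocessor from the second; the helpers `list`, `line`, `isList` and `flattenPage` and the sample arrays are assumptions that mimic the shapes these grammars would likely produce.

```javascript
// hasConsecutive is copied from the first solution.
const hasConsecutive = (arr, predicate) =>
  arr.slice(0, -1).some((el, i) => predicate(el) && predicate(arr[i + 1]));

const isList = (b) => b.type === "List";

// Hypothetical helpers that build sample blocks by hand.
const list = (...items) => ({ type: "List", items });
const line = (value) => ({ type: "Line", value });

// Solution 1: a parse with two adjacent lists would be rejected ...
const splitParse = [line("intro"), list(" a"), list(" b", " c"), line("end")];
// ... while the parse with a single merged list would be kept.
const mergedParse = [line("intro"), list(" a", " b", " c"), line("end")];

console.log(hasConsecutive(splitParse, isList));  // true
console.log(hasConsecutive(mergedParse, isList)); // false

// Solution 2: the page postprocessor flattens the (list line | line) groups
// and appends the optional trailing list (null when absent).
const flattenPage = ([pairs, last]) => [].concat(...pairs, last || []);

const page = flattenPage([
  [[line("intro")], [list(" a", " b"), line("middle")]],
  list(" c"),
]);
console.log(page.map((b) => b.type)); // [ 'Line', 'List', 'Line', 'List' ]
```

In the first approach the ambiguous parses still get built and are then discarded; in the second they are never produced, because the grammar has no way to place two lists next to each other.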

@techpines
Author

@deltaidea Thanks for the reply, your example code is not only correct but also really informative!

Using reject seems like the most robust solution, since my actual problem is more complicated and has potentially many more types of blocks.
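As a rough sketch of how the reject check might scale to several block types, the predicate can compare the types of neighbouring blocks instead of testing for "List" alone. Everything named here is hypothetical: `GROUPED_TYPES`, `hasSplitRun`, `pagePostprocess` and the "Quote" type are illustrative, and the `reject` symbol only stands in for the sentinel that nearley passes as the third postprocessor argument.

```javascript
// Hypothetical set of block types that are built from repeated items.
// Only "List" appears in this thread; "Quote" is an assumed extra type.
const GROUPED_TYPES = new Set(["List", "Quote"]);

// True when two neighbouring blocks share the same grouped type, meaning
// one run of items was split into several blocks.
const hasSplitRun = (blocks) =>
  blocks.slice(0, -1).some(
    (b, i) => GROUPED_TYPES.has(b.type) && b.type === blocks[i + 1].type
  );

// Same argument order as a nearley postprocessor: (data, location, reject).
const pagePostprocess = ([blocks], _location, reject) =>
  hasSplitRun(blocks) ? reject : blocks;

// Stand-in for nearley's reject sentinel, used only for this demonstration.
const reject = Symbol("reject");

const good = [
  { type: "List", items: ["a", "b"] },
  { type: "Quote", items: ["q"] },
];
const bad = [
  { type: "Quote", items: ["q1"] },
  { type: "Quote", items: ["q2"] },
];

console.log(pagePostprocess([good], 0, reject) === good);  // true
console.log(pagePostprocess([bad], 0, reject) === reject); // true
```

Adding a block type then only means adding its name to the set, while the grammar rules for `page` and `block` stay as they are.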

Also, you linked to #89, but @introrse said "I modified nearley.js" to get his example working, so maybe others should look here for how to solve it (unless there is a difference between the two problems that I'm not getting).

@kach kach added the question label Jun 5, 2017
@kach
Owner

kach commented Jun 5, 2017

Thanks for helping out with this, @deltaidea. And glad we could solve your problem, @techpines!

@deltaidea deltaidea mentioned this issue Jul 30, 2017
18 tasks
@tjvr tjvr mentioned this issue Aug 9, 2017
10 tasks