Improve the docs #254
nearley can run in browsers too.
Run `doctoc`. The ToC got outdated.
Use consistent style: `#` headers, no extra newlines, `-` lists. I hope this is the right style; this is how the majority of the file looked.
I apologize in advance, this will require at least one of @Hardmath123 and @tjvr to go through each commit and review the changes. Things to look for:
The whole PR can be squashed before merging, so expect small focused commits and force pushes (I'll say so if there are any sneaky changes).
Notes so far:
@Hardmath123 This would streamline the docs, make them easier to digest all at once and not be as intimidating.
Thank you for working on this! :D This is definitely something that needs doing, and that we keep not getting round to. We'll definitely need to review this carefully, as you said. I suspect @Hardmath123 will have strong opinions :-)
I'm in favour.
These two topics are covered in the section "Using a parser", which doesn't make much sense. The basic usage is already at the top of readme, so let's remove "Using a parser" completely.
16fbd03 adds a snippet for compiling grammars on client-side. Do you think it's a good idea to add this feature directly to nearley instead? In a separate PR, later.
…not sure how I should feel that you said that; especially since you're probably right…
I propose something more radical: currently, the README is a behemoth of a tutorial/documentation/README crossover, while the webpage is what convinces you to use nearley. I think they should be swapped. The README should contain most of the contents of index.html (and some other stuff). /www should contain the docs, split up into chunks that make sense:
There's probably a way to coax github into rendering those markdown documents into a nice self-contained webpage, perhaps with Jekyll or something.
Yes, please. We can add a note saying the
See above. Shouldn't be hard to revert.
What is the use-case? I can't think of anything besides the "try nearley" page.
At some point we should probably rework the Nearley API, but that's certainly a v3 thing.
That's a great thought! Still, this PR is already a beast, and not in a good sense, so let's cut the scope, finish what's originally in the top comment and just merge it already. This PR is not my smartest idea.
README.md
Outdated
Alternatively, to use a generated grammar in a browser runtime, include the
`nearley.js` file in a `<script>` tag.
Add a script to `scripts` in `package.json` that runs the command above if you only have a locally installed copy of nearley.
+1
This is a really substantial piece of work! :D Thanks so much @deltaidea for taking this on. I've had a read through your changes to the README and made some (mostly minor) comments. Thanks again!
README.md
Outdated
random strings that match your grammar.
- `nearley-railroad` generates pretty railroad diagrams from your parser. This
is mainly helpful for creating documentation, as (for example) on json.org.
See [Using in frontend](docs/using-in-frontend.md) for instructions on how to use nearley in your browser code.

You can uninstall the nearley compiler using `npm uninstall -g nearley`.
Let's drop this line.
README.md
Outdated
random strings that match your grammar.
- `nearley-railroad` generates pretty railroad diagrams from your parser. This
is mainly helpful for creating documentation, as (for example) on json.org.
See [Using in frontend](docs/using-in-frontend.md) for instructions on how to use nearley in your browser code.
Good link! But let's move this sentence to the bottom of usage, and make clear that this is an unusual thing to do.
Well, to clarify, using a generated parser and nearley.js is a totally reasonable and common thing to do in the browser. Instructions on doing this should be prominent.
Using the nearley compiler in the browser is whacky and shouldn't be mentioned here.
README.md
Outdated
classes at universities, as well as [file format parsers](https://github.com/raymond-h/node-dmi),
[markup languages](https://github.com/bobbybee/uPresent) and
[complete programming languages](https://github.com/bobbybee/carbon).
It's an npm [staff pick](https://www.npmjs.com/package/npm-collection-staff-picks).
@Hardmath123 I think this should be a list. Looks better that way. :-)
It also should mention a better set of projects. :-)
In particular, I think the markup language should be Idyll and the complete language should be https://github.com/sizigi/lp5562 or something.
README.md
Outdated
- A *terminal* is a string or a token. E.g. keyword `"if"` is a terminal.
- A *nonterminal* is a combination of terminals and other nonterminals. E.g. an if statement defined as `"if" condition statement` is a nonteminal.
- A *rule* (or production rule) is a definition of a nonterminal. E.g. `"if" condition statement` is the rule according to which the if statement nonterminal is parsed.
@Hardmath123 Explaining what a CFG rule looks like is hard, but I think it's worth making sure we get this right.
Embarrassing story: I started a glossary a while back, and accidentally committed it at some point.
https://github.com/Hardmath123/nearley/blob/master/glossary.md
Some things in it might be wrong, so we should probably remove it at some point. But, it's a good starting-point if we want to, as Tim says, "make sure we get this right".
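To make the three definitions under review concrete, here is a minimal JavaScript sketch. The `rules` array and the `classify` helper are hypothetical names invented for illustration, and the object shape is a simplified model, not necessarily what `nearleyc` emits.

```javascript
// Simplified, hypothetical model of a grammar (an assumption for illustration,
// not nearley's exact compiled output).
// Each rule defines one nonterminal as a sequence of symbols.
const rules = [
  { name: "ifStatement", symbols: [{ literal: "if" }, "condition", "statement"] },
  { name: "condition", symbols: [{ literal: "true" }] },
  { name: "statement", symbols: [{ literal: "x" }] },
];

// In this model a string names another rule (a nonterminal);
// anything else (here, a literal) is a terminal.
function classify(symbol) {
  return typeof symbol === "string" ? "nonterminal" : "terminal";
}

// Classify the symbols on the right-hand side of the if-statement rule.
const kinds = rules[0].symbols.map(classify);
console.log(kinds);
```

In this model, `"if"` is the only terminal in the `ifStatement` rule, while `condition` and `statement` are nonterminals with rules of their own.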
README.md
Outdated
want to give fancy error messages for runtime errors.
- `data: Array` - array with the parsed parts of the rule. It will always be an array, even if there's only one part. If a rule contains nonterminals with their own postprocessors, the respective parts will already be transformed.
- `location: number` - the index (zero-based) at which the rule match starts. It's useful to retain this information in the syntax tree if you're writing an interpreter.
- `reject: Object` - return this object to signal that this rule doesn't actually match. This allows you to restrict or conditionally support language features.
Should probably mention `this` here. (cf. #258 (comment))
Also, we should probably warn people to avoid `reject` if possible. @Hardmath123
Also, "restrict or conditionally support language features" isn't the use-case here. The use-case is edge conditions like "I want [a-z]+ to match variables, EXCEPT for the keyword `if`…"
Many (but not all) of these get fixed by using a lexer.
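The keyword edge case described above can be sketched as a postprocessor. This is only an illustration: the `identifier` function, the `KEYWORDS` set and the `REJECT` stand-in are hypothetical names, and the function is called by hand here instead of by the parser. It assumes the `(data, location, reject)` signature quoted in the README excerpt and a rule shaped like `[a-z]:+`, where `data[0]` is the array of matched characters.

```javascript
// Hypothetical keyword list for the sketch.
const KEYWORDS = new Set(["if", "else", "while"]);

// Postprocessor for an assumed rule such as: identifier -> [a-z]:+
function identifier(data, location, reject) {
  // Join the matched characters into the candidate name.
  const name = data[0].join("");
  // Returning the reject object signals that this rule does not match after all.
  if (KEYWORDS.has(name)) return reject;
  return { type: "identifier", name: name, location: location };
}

// Stand-in for the sentinel object the parser would pass as the third argument.
const REJECT = {};

const accepted = identifier([["f", "o", "o"]], 0, REJECT);
const rejected = identifier([["i", "f"]], 4, REJECT);
console.log(accepted, rejected === REJECT);
```

As noted in the comments above, a lexer that emits keywords as separate tokens often removes the need for `reject` here.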
README.md
Outdated
```js
var grammar = require("generated-code.js");
var nearley = require("nearley");
```
You can still use raw strings, but they will only match full tokens parsed by Moo:
Aside: this is (or was) actually a misfeature, but, hey, this isn't the place to discuss that. (Using raw strings is really convenient, but it interacted badly with moo's support for capture groups. Fortunately, we've removed those from Moo! :-) )
...except we added back value transforms. I should really fix that...
The `nearley.Parser` constructor takes an optional third parameter, `options`,
which is an object with the following possible keys:
- [Best practices for writing grammars](docs/how-to-grammar-good.md)
- [Custom tokens and lexers, parsing arbitrary arrays instead of strings](docs/custom-tokens-and-lexers.md)
Are we missing the `feed(array)` bit in the linked file? @deltaidea
Whoops, no, we just never documented that properly in the first place.
README.md
Outdated
very large), so we recommend leaving this as `false` unless you are familiar
with the Earley parsing algorithm and are planning to do something exciting
with the parse table.
## Recipies
Typo :-)
README.md
Outdated
@@ -410,27 +407,22 @@ parser.

This was previously called `bin/nearleythere.js` and written by Robin.
@Hardmath123 Can we drop this sentence now? :-)
Yeah, I guess if you want.
README.md
Outdated
@@ -496,19 +484,16 @@ Webpack users can use
[nearley-loader](https://github.com/kozily/nearley-loader) by Andrés Arana to
load grammars directly.

## Still confused?
This section should probably move up nearer to Recipes. @deltaidea @Hardmath123
written in *itself* (this is called bootstrapping).
## Introduction

nearley compiles grammar definitions from a simple syntax resembling [BNF](https://en.wikipedia.org/wiki/Backus–Naur_form) to a JS representation.
Introduction should also describe why people like nearley: it's user-friendly, easy to get started, works on both the browser and node, etc.
I can take care of this change later though, don't worry about it.
nearley compiles grammar definitions from a simple syntax resembling [BNF](https://en.wikipedia.org/wiki/Backus–Naur_form) to a JS representation.
You pass that representation to the nearley's tiny runtime, feed it data, and get the results.

nearley uses the Earley parsing algorithm with Joop Leo's optimizations to parse complex data structures easily.
I think "parse complex data structures" is not quite what we wanted to say, even in the original README. This is a good opportunity to find better wording. Complex languages, perhaps?
README.md
Outdated
To use nearley, you need both a *global* and a *local* installation. The two
types of installations are described separately below.
nearley is published as an [NPM](https://docs.npmjs.com/getting-started/what-is-npm) package compatible with [Node.js](https://nodejs.org/en/) and browsers.
*Most browsers! Even some old ones. :-)
README.md
Outdated
> var nearley = require("nearley");
> var grammar = require("./my-generated-grammar.js");

Check out the wonderful [nearley playground](https://omrelli.ug/nearley-playground/) to explore nearley interactively in your browser.
Can this be moved to a more prominent location? "NOTE: You can follow along by using the nearley playground, an online interface for exploring nearley grammars in your browser."
README.md
Outdated
See below for detailed API and grammar syntax specification.
Let's replace "below" with internal links to sections.
README.md
Outdated
will match 0 or more `cow`s in a row.
If you would like to support a different language, feel free to file a PR!

### Charsets

You can use valid RegExp charsets in a rule:
*in scannerless mode
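A rough illustration of why the scannerless caveat matters: without a lexer, the input is consumed one character at a time, so a charset behaves like a single-character RegExp test. The sketch below is plain JavaScript and makes no claim about nearley's internals; `charset` and `input` are made-up names.

```javascript
// A charset as it might be written in a rule, expressed as a plain RegExp.
const charset = /[a-z0-9]/;

// Hypothetical input, examined one character at a time as in scannerless parsing.
const input = "a1_";
const matches = Array.from(input, (ch) => charset.test(ch));

// Each entry says whether the corresponding character is in the charset.
console.log(matches);
```

With a lexer, the parser receives whole tokens instead of single characters, which is why a per-character test like this stops being the right model.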
- `keepHistory` (boolean, default `false`) - whether to preserve and expose the internal state
- `lexer` (object) - custom lexer, overrides `@lexer` in the grammar

If you are familiar with the Earley parsing algorithm and are planning to do something exciting with the parse table, set `keepHistory`:
Wasn't the initial motivation for `keepHistory` rewinding? @tjvr
Yes, but we now have `save()` for that purpose.
TODO: document save/restore.
@@ -0,0 +1,37 @@
# Using nearley in browsers

Use a tool like [Webpack](https://webpack.js.org/) or [Rollup](https://rollupjs.org/) to include the `nearley` NPM package in your browser code.
Wait, doesn't nearley work fine if you just, y'know, include nearley.js?
Let's be very careful making the distinction between parsing with nearley and compiling with nearleyc. This file should be called `compiling nearley grammars in browsers` or something!
Yeah, I don’t see a need to mention bundlers here.
Wait, I thought we were still in README! This is fine as-is I think. (Ignoring the fact that I personally prefer browserify :-) )
I made a rough pass too.
Thanks for the insight, guys! This is really helpful for my understanding. I'm very busy at the moment. Maybe I'll have some time on the weekend. Feel free to fix stuff on your own if you like.
I'm confused what you mean by this. Should I merge your PR to
I mean both of you can just check out this PR and push to it directly, since the work is slow and there are no conflicts:

    git fetch origin pull/254/head:improve-docs
    git checkout improve-docs
this is a test commit. Can I commit to a PR? That seems odd.
@deltaidea Oh, weird! I’m confused about why I have push access to your fork, but there we go. :-) I fixed a bunch of my own review comments. @Hardmath123 wanted to rewrite some bits too, I think. I agree we shouldn’t keep this open too much longer; this is already much improved from what we have! @deltaidea, at some point please could you:
For now, leaving Tokenizers as part of the main README is fine. Leaving the documentation as markdown files in
Thanks for the feedback and help! Isn't Regarding everything else, I'll have a proper look in an hour.
GitHub has a checkbox called "Allow edits from maintainers" under every PR. It opens write access to the branch.
Ah, thanks! Never seen that before. Looks like you have already replaced
@deltaidea's mega-PR. A squash of these commits:
- Add some lexer stuff to README
- readme: remove trailing spaces
- readme: change the description to "Simple parsing in JavaScript"
  nearley can run in browsers too.
- readme: update the table of contents
  Run `doctoc`. The ToC got outdated.
- readme: make Markdown style consistent
  Use consistent style: `#` headers, no extra newlines, `-` lists. I hope this is the right style; this is how the majority of the file looked.
- readme: add an intro and usage guide for beginners
- readme: reorder and reword parser specification
- readme: make all example code more readable, update to ES6
- docs: move `glossary.md` and `how-to-grammar-good.md` to `docs/`
- docs: move making REPL and accessing parse table into their own files
  These two topics are covered in the section "Using a parser", which doesn't make much sense. The basic usage is already at the top of readme, so let's remove "Using a parser" completely.
- docs: add a clear Moo example to readme, separate out custom lexers
- docs: add a link in readme to "How to grammar good"
- docs: add a complete example of usage in browsers
- docs: use `require()` instead of `import`
- readme: link to the indentation-aware lexer by nathan
- docs: add an example of generating CST and AST
- readme: move recipies to section "Recipies"
- Make clear using `nearleyc` in a browser is unusual.
- this is a test commit. Can I commit to a PR? That seems odd.
- Fix a bunch of my review comments :-)
- adds link to nearley-gulp in README
- Add doctoc script to package.json
This PR was getting scary, so I went ahead and merged it! :D Thanks so much for your help @deltaidea. I'm gonna open a couple new issues for the rest--we can continue work in smaller PRs.
Where is the context aware example?
Preview of the progress so far
This is a joint effort to make the docs more straight-forward and complete.
I hijacked the `tjvr:ihatewriting` branch and started this PR based on it. @tjvr can still push to this PR since he's a collaborator, and I didn't want to deal with plain-text conflicts.
Relevant:
Things to do (in a vaguely sensible order):
- `Grammar.fromCompiled()` properly. (How to parse with the API: docs outdated? #247, Document lexers and fromCompiled #248, json.ne example throwing errors - Issue with moo? #257)
- `{% id %}` is documented high enough that you can't miss it. (What does {% id %} do? #244)
- `parser.results`. (parser.results is an empty list #253)
- `nearley-test`. (parser.results is an empty list #253)
- `doctoc --notitle README.md` before merging.