Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New parser based on new version of the Coq backend of Menhir #276

Merged
merged 3 commits into from
Jul 5, 2019

Conversation

jhjourdan
Copy link
Contributor

@jhjourdan jhjourdan commented Feb 26, 2019

This is the new version of the parser, based on recent updates of Menhir.

The corresponding changes in Menhir are not yet released: to test these changes, you should install the current master branch of Menhir. This is the reason why this PR is marked WIP. But apart from that (and updating configure), I think this PR is ready.

Changes include:

  1. A rewrite of the Coq interpreter of Menhir automaton, with dependent types removing the need for runtime checks for the well-formedness of the LR stack. This seem to cause some speedup on the parsing time (~10% for lexing + parsing).

  2. Thanks to 1., I was able to no longer use int31 for comparing symbols: Since this is only used for validation, positives are enough.

  3. Speedup of Validation: on my machine, the time needed for compiling Parser.v goes from about 2 minutes to about 1 minute. I am not sure of the actual reason for this, but this seem to be related to a performance bug I detected in the completeness validator and to the use of positive instead of int31 (I used to use a weird hack for representing int31 constants for making sure they were evaluated efficiently after extraction... but this could actually be rather slow when executed with vm_compute).

  4. Menhir now generates a dedicated inductive type for (semantic-value-carrying) tokens (in addition to the already existing inductive type for (non-semantic-value-carrying) terminals. The end result is that the OCaml support code for the parser no longer contain calls to Obj.magic. The bad side of this change is that the formal specification of the parser is perhaps harder to read.

  5. The parser and its library are now free of axioms (I used to use axiom K and proof irrelevance for easing proofs involving dependent types).

@jhjourdan
Copy link
Contributor Author

Note that I also updated the LICENSE file, reflecting that coq-menhirlib is under LGPLv3 (like Flocq), and that I moved it to the root of the repository.

@xavierleroy
Copy link
Contributor

What about the use of streams? Cf. #231 . Will these changes to Menhir and its supporting library remove dependencies on the Stream library from Coq?

@jhjourdan
Copy link
Contributor Author

You are right, I forgot about this. I need a bit more work to get rid of the Stream dependency.

@jhjourdan
Copy link
Contributor Author

What about the use of streams? Cf. #231 . Will these changes to Menhir and its supporting library remove dependencies on the Stream library from Coq?

The dependency to the Stream library of Coq is now removed, Menhir now uses its own type for input buffer. This type is a negative coinductive type, so that it should not be affected by coq/coq#7536.

@jhjourdan
Copy link
Contributor Author

I added another change: the fuel parameter is now specified using the logarithm (in base 2) of the maximum number of steps needed. In practice, we use the value 50, so that we are sure that we will not run out of fuel in reasonable computation time. From the perspective of CompCert, the advantage is that this prevents us from using the let rec inf = S inf hack.

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review everything line-by-line, but played with the branch, reviewed some of the diff, and had a look at some of the generated / extracted files.

Globally this looks fine to me. I'm very happy to see int31 gone. The cleaner and better typed interface with the handwritten OCaml code is welcome. I didn't see much difference in parsing time, but that's OK. I confirm that Parser.v now takes half as much time to check through Coq.

Below, a few cosmetic comments about the new Parser.vy, but nothing important.

I guess the next step is for you @jhjourdan and @fpottier to declare the Menhir side of things stable and make a Menhir release.

configure Outdated
missingtools=true;;
esac
# TODO
menhir_includes="-I `menhir --suggest-menhirLib`"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remind me to fill in something there before I merge :-)

@jhjourdan jhjourdan force-pushed the newparser branch 2 times, most recently from ab6338d to 4086284 Compare June 18, 2019 07:59
The corresponding changes in Menhir have been released as part of
version 20190613. The `MenhirLib` directory is identical to the
content of the `src` directory of the corresponding `coq-menhirlib`
opam package except that:
   - In order to try to make CompCert compatible with several Menhir
     versions without updates, we do not check the version of menhir
     is compatible with the version of coq-menhirlib. Hence the
     `Version.v` file is not present in CompCert's copy.
   - Build-system related files have been removed.

More precisely, changes include:

1. A rewrite of the Coq interpreter of Menhir automaton, with
   dependent types removing the need for runtime checks for the
   well-formedness of the LR stack. This seem to cause some speedup on
   the parsing time (~10% for lexing + parsing).

2. Thanks to 1., it is now possible to avoid the use of int31 for
   comparing symbols: Since this is only used for validation,
   positives are enough.

3. Speedup of Validation: on my machine, the time needed for compiling
   Parser.v goes from about 2 minutes to about 1 minute. This seem to
   be related to a performance bug in the completeness validator and
   to the use of positive instead of int31.

3. Menhir now generates a dedicated inductive type for
   (semantic-value-carrying) tokens (in addition to the already
   existing inductive type for (non-semantic-value-carrying)
   terminals. The end result is that the OCaml support code for the
   parser no longer contain calls to Obj.magic. The bad side of this
   change is that the formal specification of the parser is perhaps
   harder to read.

4. The parser and its library are now free of axioms (I used to use
   axiom K and proof irrelevance for easing proofs involving dependent
   types).

5. Use of a dedicated custom negative coinductive type for the input
   stream of tokens, instead of Coq stdlib's `Stream`. `Stream` is a
   positive coinductive type, which are now deprecated by Coq.

6. The fuel of the parser is now specified using its logarithm instead
   of its actual value. This makes it possible to give large fuel
   values instead of using the `let rec fuel = S fuel` hack.

7. Some refactoring in the lexer, the parser and the Cabs syntax tree.
@jhjourdan jhjourdan changed the title WIP : New parser based on new version of the Coq backend of Menhir New parser based on new version of the Coq backend of Menhir Jun 18, 2019
@jhjourdan
Copy link
Contributor Author

The corresponding version of Menhir (20190613) have been released, I rebased/squashed the commits, and tested the whole thing.

As far as I can tell, this is now ready to merge.

@xavierleroy
Copy link
Contributor

Thank you. I'm preparing the CI machines here at Inria so that they can test this new version.

@xavierleroy
Copy link
Contributor

With Coq 8.7.2 under Cygwin, I get

16:54:35 File "./MenhirLib/Validator_complete.v", line 232, characters 6-62:
16:54:35 Error:
16:54:35 Anomaly
16:54:35 "File "plugins/ltac/tacinterp.ml", line 1157, characters 35-41: Assertion failed."

Will try other Coq versions.

@jhjourdan
Copy link
Contributor Author

Thanks for the test. Let me look into this more in detail.

@xavierleroy
Copy link
Contributor

After upgrading to Coq 8.9.1, still under Cygwin, I now get

COQC cparser/Parser.v
File "./cparser/Parser.v", line 217, characters 20-2939:
Error:
In environment
x : terminal
The term
 "match x with
  | ADD_ASSIGN't => 1
  | ALIGNAS't => 2
[...]
end" has type "nat" while it is expected to have type
"positive".

Coq 8.8.2 under Linux works fine, though.

I'll investigate on my side too.

@jhjourdan
Copy link
Contributor Author

The compatibility issue with 8.7.2 is now fixed in this branch (and in upstream Menhir).

The compatibility issue with 8.9.1 required a change in Menhir itself (in the generator, not in the support library). The fix will be shipped with the next Menhir release:

https://gitlab.inria.fr/fpottier/menhir/commit/5f195649fda98fda58c83e4ce956b41a0db4fec6

I hope @fpottier will not mind too much doing yet another Menhir release :D Thanks, @fpottier!

@jhjourdan
Copy link
Contributor Author

And sorry for still being a young Padawan too naive about compatibility between Cod versions! I should have checked that.

@fpottier
Copy link
Contributor

I'll be happy to make a new release (now that the release scripts work again, it is reasonably easy).

@jhjourdan, could you modify the opam descriptions of the two versions that we have released already (20190613 and 20190620) so as to document their compatibility constraints? This will avoid problems if someone installs one of these versions. Thanks!

@jhjourdan
Copy link
Contributor Author

This PR should now be fixed for both Coq 8.7 and Coq 8.9.

@xavierleroy
Copy link
Contributor

The tests are OK, and we agreed to merge, so let's merge!

Thanks for all the hard work.

@xavierleroy xavierleroy merged commit 998f3c5 into AbsInt:master Jul 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants