Avoid OCaml naked pointer in accumulators #12733

chambart · 2020-07-22T15:23:15Z

Using tag 0 on closure is going do be broken quite soon, so at some point this codes needs to change.

The only safe way to distinguish accumulators from other function is to record the set of all accumulator produced. To avoid leaking this is recorded in a weak table. Sadly the Weak module does not allow to use the physical equality function to retrieve values since it is applied on a copy. Ephemeron does not have this problem.

I don't know Coq enough to understand whether changing the tag to something else than 0 is a problem. And this is certainly slower than the original version.

To be honnest I have another solution but it is seriously evil and would prefer to avoid it as much as possible.

Kind: bug fix

Fixes / closes #10602

The only safe way to distinguish accumulators from other function is to record the set of all accumulator produced. To avoid leaking this is recorded in a weak table. Sadly the Weak module does not allow to use the physical equality function to retrieve values since it is applied on a copy. Ephemeron does not have this problem.

coqbot · 2020-07-22T15:23:16Z

Hello, thanks for your pull request!
In the future, we strongly recommend that you do not use master as the name of your branch when submitting a pull request.
By the way, you may be interested in reading our contributing guide.

SkySkimmer · 2020-07-22T15:41:30Z

What's the evil solution?

ppedrot · 2020-07-22T15:47:59Z

@chambart This will break all the compilation scheme, since pattern-matching compilation relies on accumulators having tag 0. The only sane solution I know to fix the naked pointer issue is to have a dedicated caml_apply function inserted in the translation phase.

ppedrot · 2020-07-22T16:15:36Z

Another alternative would be to have OCaml support for pattern-matching of the form

match foo with
| C1 ... -> ...
| Cn ... -> ...
| Closure_tag _ -> ...

with C1, ..., Cn are regular constructors.

AFAIU it's possible to generate intermediate flambda code for this kind of pattern-matching but (fortunately?) there is nothing user-facing for this kind of hack.

coqbot · 2020-07-22T16:15:43Z

For your complete information, the job test-suite:base+async in allow failure mode has failed for commit 6637530: Avoid producing nake pointers.

SkySkimmer · 2020-07-22T16:25:09Z

Can't we do something like

if Obj.tag foo = Closure_tag then ...
else match foo with ...

?

ppedrot · 2020-07-22T16:27:44Z

@SkySkimmer Yes, but then it's horribly inefficient. Obj.tag calls the page table check.

SkySkimmer · 2020-07-22T16:40:18Z

In future ocaml it won't though if I understand ocaml/ocaml@6e9d39b correctly.

ppedrot · 2020-07-22T16:44:41Z

It would still be less efficient than a jump through the pattern-matching table. We can probably roll up our own variant of is_closure in C that does not check the page table (since it knows that it's always a block or a direct integer) and see what happens.

silene · 2020-07-23T06:13:07Z

I am lost. How is the proposed patch related to naked pointers? In other words, which of the memory words of an accumulator is a naked pointed? Is the issue due to the first word being allocated on the static heap rather than the collected heap? If so, why not just change that?

silene · 2020-07-23T06:19:04Z

Replying to myself, I guess the operating system will complain in an abrupt way. The caml_apply solution would indeed work around this issue, but I expect it to be costly.

silene · 2020-07-23T07:06:43Z

So, there are three kinds of accumulators, those of inductive types that can be matched against, those of arrow types that can be applied, those of abstract types that can be nothing (?). Only accumulators that can be matched against need to have tag 0. For others, we do not care. But, if I am not mistaken, even in an inconsistent context, these three kinds of accumulators are exclusive. So, why do we bother changing the tag in the first place? Would it not work just fine if we leave the tags unchanged? Obviously, this means doing a bit more work at reconstruction time to recover all the accumulators, but this part is not performance sensitive.

ppedrot · 2020-07-23T08:09:27Z

But, if I am not mistaken, even in an inconsistent context, these three kinds of accumulators are exclusive.

Unfortunately I think you don't have enough runtime information to know that. How would you handle an accumulator of type forall b : bool, if b then unit else unit -> unit? More generally you'd need to simulate the whole CIC type system (in particular arbitrary conversion) in the compilation scheme of accumulators at runtime, which looks like a very inefficient thing.

silene · 2020-07-23T14:43:29Z

How would you handle an accumulator of type forall b : bool, if b then unit else unit -> unit?

That is not an inductive type, so a value of this type will never be matched against. But I see your point. Once applied, the resulting accumulator can be both matched and applied. So, even a simpler example such as an accumulator of type unit -> unit is already problematic, since it means that the accumulating function needs some typing information.

ppedrot · 2020-08-13T14:41:23Z

This PR is broken, and the correct fix is much more involved. I believe that the Coq dev team is acutely aware of the problem, in particular @maximedenes, but solving this requires a non-trivial amount of work. Given that, I am closing this PR and will be personally butt-kicking the knowledgeable people around (including myself) so that there is progress on the issue.

ppedrot · 2020-08-13T14:48:30Z

BTW @chambart if you're missing pieces of the picture, I'd be happy to discuss with you the internals of native compilation so that it's clearer why we use this horrible hack.

jhjourdan · 2020-08-15T19:59:44Z

I discussed with @gasche of this problem, and he suggested an elegant solution: in ocaml/RFCs#14, @yallop is proposing to add the possibility of unboxing some constructor of algebraic types.

In particular, (a small extension of) this RFC would allow unboxing algebraic type constructor whose content is a function (with a closure tag or an infix tag). From the perspective of this PR, if one adds an unboxed constructor of type Obj.t -> Obj.t to every translated Coq inductive type, this is essentially the same as allowing to discriminate over the closure tag in a pattern matching.

Now, the problem is that ocaml/RFCs#14 is still an RFC, and there still exists no implementation. So, @ppedrot, @maximedenes, could you please confirm that this would help for solving this problem? If so, could you please participate to the discussion in ocaml/RFCs#14 to make things move faster?

xavierleroy · 2020-08-17T08:32:21Z

For the record let me summarize the two plausible fixes that have been mentioned already:

Represent accumulators by functions (properly tagged Closure_tag) and compile pattern-matchings to

  if is_closure x then (* accumulator case *) else match x with ... (* normal cases *)

Represent accumulators by zero-tagged blocks and compile function applications f arg to

  if is_closure f then f arg (* normal case *) else ... (* accumulator case *)

Solution 2 is horribly expensive if all function applications are transformed, but could be OK if applications of known functions (i.e. variables let-bound to function abstractions) are not transformed, since in this case the function cannot be an accumulator.

Both solution need a fast is_closure test. Obj.tag is a little slow, although it will get faster in no-naked-pointers mode (no page table check). A C function declared "noalloc" should do for initial experiments. It should be possible to generate entirely inlined code for this test.

gasche · 2020-08-21T09:05:43Z

That's an excellent summary. If we used the "constructor unboxing" proposal as proposed by @jhjourdan, we would in effect be using your approach (1). The pattern-matching compiler would be in charge of deciding when to emit the is-closure check, it could be later in the matching process (after constant constructors have been checked, or even checked last, right before failures) so the cost would be smaller.

Note that there is a minor difference from @ppedrot's pseudo-code in #12733 (comment). Functions may have tag Closure_tag but also Infix_tag, so an implementation relying on unboxing a constructor at a function type would check for both.

xavierleroy · 2020-08-21T16:57:28Z

Functions may have tag Closure_tag but also Infix_tag, so an implementation relying on unboxing a constructor at a function type would check for both.

As far as I can remember, accumulators represented as function closures would always have Closure_tag tag, never Infix_tag. But you're right that my "solution 2" should be:

Represent accumulators by zero-tagged blocks and compile function applications f arg to

  if has_tag_0 f then (* accumulator case *) else f arg

ppedrot · 2020-08-21T17:19:45Z

I think that @xavierleroy 's first solution is the most reasonable one, but I am fairly sure that relying on the OCaml compiler to unbox things for us as in ocaml/RFCs#14 is going to be extremely fragile. Native compilation plays fast and loose with the type-checker, so in practice this is likely just to explode in flight.

chambart requested a review from a team as a code owner July 22, 2020 15:23

ejgallego added kind: compatibility Changes allowing for compatibility between versions. priority: high The priority for inclusion in the next release is high. labels Jul 22, 2020

SkySkimmer added the needs: benchmarking Performance testing is required. label Jul 22, 2020

ppedrot added the needs: fixing The proposed code change is broken. label Jul 22, 2020

ppedrot closed this Aug 13, 2020

jhjourdan mentioned this pull request Aug 15, 2020

Proposal: constructor unboxing ocaml/RFCs#14

Open

silene mentioned this pull request Aug 25, 2020

Modify bytecode representation of closures to please OCamls GC (fix #12636). #12894

Merged

ejgallego mentioned this pull request Mar 13, 2021

[vm/native] -no-naked-pointers support #13940

Open

ejgallego added the part: ocaml label Sep 27, 2021

gasche mentioned this pull request Apr 14, 2022

Bootstrap problem when moving Cont_tag to be after No_scan_tag ocaml/ocaml#11182

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid OCaml naked pointer in accumulators #12733

Avoid OCaml naked pointer in accumulators #12733

chambart commented Jul 22, 2020

coqbot commented Jul 22, 2020

SkySkimmer commented Jul 22, 2020

ppedrot commented Jul 22, 2020

ppedrot commented Jul 22, 2020

coqbot commented Jul 22, 2020

SkySkimmer commented Jul 22, 2020

ppedrot commented Jul 22, 2020

SkySkimmer commented Jul 22, 2020

ppedrot commented Jul 22, 2020

silene commented Jul 23, 2020

silene commented Jul 23, 2020 •

edited

Loading

silene commented Jul 23, 2020

ppedrot commented Jul 23, 2020

silene commented Jul 23, 2020

ppedrot commented Aug 13, 2020

ppedrot commented Aug 13, 2020

jhjourdan commented Aug 15, 2020

xavierleroy commented Aug 17, 2020

gasche commented Aug 21, 2020

xavierleroy commented Aug 21, 2020

ppedrot commented Aug 21, 2020

Avoid OCaml naked pointer in accumulators #12733

Avoid OCaml naked pointer in accumulators #12733

Conversation

chambart commented Jul 22, 2020

coqbot commented Jul 22, 2020

SkySkimmer commented Jul 22, 2020

ppedrot commented Jul 22, 2020

ppedrot commented Jul 22, 2020

coqbot commented Jul 22, 2020

SkySkimmer commented Jul 22, 2020

ppedrot commented Jul 22, 2020

SkySkimmer commented Jul 22, 2020

ppedrot commented Jul 22, 2020

silene commented Jul 23, 2020

silene commented Jul 23, 2020 • edited Loading

silene commented Jul 23, 2020

ppedrot commented Jul 23, 2020

silene commented Jul 23, 2020

ppedrot commented Aug 13, 2020

ppedrot commented Aug 13, 2020

jhjourdan commented Aug 15, 2020

xavierleroy commented Aug 17, 2020

gasche commented Aug 21, 2020

xavierleroy commented Aug 21, 2020

ppedrot commented Aug 21, 2020

silene commented Jul 23, 2020 •

edited

Loading