-
Notifications
You must be signed in to change notification settings - Fork 669
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid OCaml naked pointer in accumulators #12733
Conversation
The only safe way to distinguish accumulators from other function is to record the set of all accumulator produced. To avoid leaking this is recorded in a weak table. Sadly the Weak module does not allow to use the physical equality function to retrieve values since it is applied on a copy. Ephemeron does not have this problem.
Hello, thanks for your pull request! |
What's the evil solution? |
@chambart This will break all the compilation scheme, since pattern-matching compilation relies on accumulators having tag 0. The only sane solution I know to fix the naked pointer issue is to have a dedicated caml_apply function inserted in the translation phase. |
Another alternative would be to have OCaml support for pattern-matching of the form
with AFAIU it's possible to generate intermediate flambda code for this kind of pattern-matching but (fortunately?) there is nothing user-facing for this kind of hack. |
For your complete information, the job test-suite:base+async in allow failure mode has failed for commit 6637530: Avoid producing nake pointers. |
Can't we do something like if Obj.tag foo = Closure_tag then ...
else match foo with ... ? |
@SkySkimmer Yes, but then it's horribly inefficient. |
In future ocaml it won't though if I understand ocaml/ocaml@6e9d39b correctly. |
It would still be less efficient than a jump through the pattern-matching table. We can probably roll up our own variant of |
I am lost. How is the proposed patch related to naked pointers? In other words, which of the memory words of an accumulator is a naked pointed? Is the issue due to the first word being allocated on the static heap rather than the collected heap? If so, why not just change that? |
Replying to myself, I guess the operating system will complain in an abrupt way. The |
So, there are three kinds of accumulators, those of inductive types that can be matched against, those of arrow types that can be applied, those of abstract types that can be nothing (?). Only accumulators that can be matched against need to have tag 0. For others, we do not care. But, if I am not mistaken, even in an inconsistent context, these three kinds of accumulators are exclusive. So, why do we bother changing the tag in the first place? Would it not work just fine if we leave the tags unchanged? Obviously, this means doing a bit more work at reconstruction time to recover all the accumulators, but this part is not performance sensitive. |
Unfortunately I think you don't have enough runtime information to know that. How would you handle an accumulator of type |
That is not an inductive type, so a value of this type will never be matched against. But I see your point. Once applied, the resulting accumulator can be both matched and applied. So, even a simpler example such as an accumulator of type |
This PR is broken, and the correct fix is much more involved. I believe that the Coq dev team is acutely aware of the problem, in particular @maximedenes, but solving this requires a non-trivial amount of work. Given that, I am closing this PR and will be personally butt-kicking the knowledgeable people around (including myself) so that there is progress on the issue. |
BTW @chambart if you're missing pieces of the picture, I'd be happy to discuss with you the internals of native compilation so that it's clearer why we use this horrible hack. |
I discussed with @gasche of this problem, and he suggested an elegant solution: in ocaml/RFCs#14, @yallop is proposing to add the possibility of unboxing some constructor of algebraic types. In particular, (a small extension of) this RFC would allow unboxing algebraic type constructor whose content is a function (with a closure tag or an infix tag). From the perspective of this PR, if one adds an unboxed constructor of type Now, the problem is that ocaml/RFCs#14 is still an RFC, and there still exists no implementation. So, @ppedrot, @maximedenes, could you please confirm that this would help for solving this problem? If so, could you please participate to the discussion in ocaml/RFCs#14 to make things move faster? |
For the record let me summarize the two plausible fixes that have been mentioned already:
Solution 2 is horribly expensive if all function applications are transformed, but could be OK if applications of known functions (i.e. variables let-bound to function abstractions) are not transformed, since in this case the function cannot be an accumulator. Both solution need a fast |
That's an excellent summary. If we used the "constructor unboxing" proposal as proposed by @jhjourdan, we would in effect be using your approach (1). The pattern-matching compiler would be in charge of deciding when to emit the is-closure check, it could be later in the matching process (after constant constructors have been checked, or even checked last, right before failures) so the cost would be smaller. Note that there is a minor difference from @ppedrot's pseudo-code in #12733 (comment). Functions may have tag Closure_tag but also Infix_tag, so an implementation relying on unboxing a constructor at a function type would check for both. |
As far as I can remember, accumulators represented as function closures would always have Closure_tag tag, never Infix_tag. But you're right that my "solution 2" should be:
|
I think that @xavierleroy 's first solution is the most reasonable one, but I am fairly sure that relying on the OCaml compiler to unbox things for us as in ocaml/RFCs#14 is going to be extremely fragile. Native compilation plays fast and loose with the type-checker, so in practice this is likely just to explode in flight. |
Using tag 0 on closure is going do be broken quite soon, so at some point this codes needs to change.
The only safe way to distinguish accumulators from other function is to record the set of all accumulator produced. To avoid leaking this is recorded in a weak table. Sadly the Weak module does not allow to use the physical equality function to retrieve values since it is applied on a copy. Ephemeron does not have this problem.
I don't know Coq enough to understand whether changing the tag to something else than 0 is a problem. And this is certainly slower than the original version.
To be honnest I have another solution but it is seriously evil and would prefer to avoid it as much as possible.
Kind: bug fix
Fixes / closes #10602