Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify start function in CM thread.spawn with an index #89

Open
abrown opened this issue Nov 25, 2024 · 25 comments
Open

Identify start function in CM thread.spawn with an index #89

abrown opened this issue Nov 25, 2024 · 25 comments

Comments

@abrown
Copy link
Collaborator

abrown commented Nov 25, 2024

When designing this proposal's integration with the component model (CM), we decided to pass the start function as a reference to the CM's thread.spawn intrinsic. IIRC, this avoided the issue of forcing the creation of a shared table to contain the start function. But, while doing pathfinding implementation to vet these design choices, I discovered that there are missing pieces in wasm-tools that make this "pass start as a reference" very difficult.

@alexcrichton and @fitzgen can correct me here if I'm not explaining this correctly, but my take is that the implementations in wasm-tools and Wasmtime don't have the necessary plumbing to refer to a CM core type from within a separate CM core module. No other CM intrinsics besides thread.spawn refer to a type declared elsewhere (see $ft here); if we are to pass a reference, we need a type for that reference and that type has an index, but not necessarily the same index, at different stages in the toolchain. @alexcrichton and @fitzgen tell me that adding the necessary index conversions for this new intrinsic in wasm-tools and Wasmtime is too much work; their recommendation is to switch the thread.spawn intrinsic to receive an i32 instead.

At this point, this issue is a roadblock before implementing enough to experiment with different TLS design options, which is what I believe needs more focused attention. The necessary bits are implemented in wasm-tools but this problem of "how to pass the start function" must to be resolved to have a working implementation in Wasmtime. Here are some options to resolve this:

  1. figure out some way to implement the missing plumbing between wasm-tools and Wasmtime; @alexcrichton and @fitzgen tell me this is the least desirable choice
  2. replace the signature of the CM thread.spawn: [f: (ref null $f) n:i32 c:i32] would become [f: i32 n: i32, c: i32]
  3. add an additional CM intrinsic, thread.spawn_indirect, with the new signature [f: i32 n: i32, c: i32]

Note that I am attempting to get this all working in wasi-libc, where one would expect that C function pointers would be placed in a table anyways. It's not clear to me that if we adopted 3 how we might implement the current version of thread.spawn and who would be using it, given how most projects tend to build on top wasi-libc.

@tlively
Copy link
Member

tlively commented Nov 26, 2024

that type has an index, but not necessarily the same index, at different stages in the toolchain.

Can you elaborate on this part? What are the different relevant stages and why does the type index change? There is a relocation R_WASM_TYPE_INDEX_LEB that would seemingly solve this problem if the toolchain could emit it in the correct location.

I suspect that part of the problem is that core Wasm doesn't support type parameters at all, and the component model probably doesn't want to be in the business of introducing them.

Another option would be to make the type of the parameter funcref and spec that the intrinsic performs a cast to the intended function type.

In general the idea of creating thread.spawn_indirect for now and adding a thread.spawn_ref later when GC langages are more prevalent on WASI makes sense to me.

@rossberg
Copy link
Member

Re 2: Wouldn't this require a table index immediate? Why is that easier to handle than a type index?

Re 3: Wouldn't spawn_indirect need a type index immediate as well? Similarly any solution using funcref?

AFAICS, somehow the intended type has to be (statically) derivable, so some form of immediate appears to be unavoidable.

@alexcrichton
Copy link
Contributor

To clarify a bit on why things are hard right now for wasmtime/wasm-tools, the precise reasons are pretty specific to the design of the two libraries themselves. The exact details are a bit much to put in this thread, but the general gist is that the full matrix of interactions of the GC proposal of WebAssembly and the component model have not all been fleshed out previously in Wasmtime. That's being stressed here with today's thread.spawn intrinsic because it generates a function where the parameter of that function is a typed function. This of course should work just fine in Wasmtime, and one day we'll of course have to do the work to get all that fixed. For the purposes of testing shared-everything-threads today, however, it's quite a lot for Andrew to take on as the reason things aren't working are basically unrelated to threads.

The other rationale we have for not using a function-reference-based-approach is that LLVM-based tooling basically isn't ready for that. AFAIK there's no way, today, to import the thread.spawn function with the right signature from C/Rust/etc. That means that actually binding it would involve wrappers/shims/etc. All of this is of course possible, but again for the purpose of testing shared-everything-threads this is quite a lot to take on.

I believe Nick and I were advocating for (3) here where we're not saying that thread.spawn should go away, only that in the meantime there's also thread.spawn_indirect. Wasmtime won't implement thread.spawn in this evaluation phase but we'd of course come back as this proposal goes through the design phases.

And yes, spawn_indirect would have an immediate as to which table it's referencing to load the index from.

@rossberg
Copy link
Member

Well, call_indirect has both a type and a table index. It's not clear to me how spawn_indirect could get around that?

@tlively
Copy link
Member

tlively commented Nov 26, 2024

spawn_indirect only needs to take a type index if it needs to support different function types. If there's just one hard-coded function type, then all it would need is a table index immediate. I'm not sure that makes things significantly easier, since LLVM also supports type index immediates, but it might.

@rossberg
Copy link
Member

@tlively, well yes, but isn't that orthogonal, and you could do the same with a ref? Not sure I follow how call_indirect helps.

@tlively
Copy link
Member

tlively commented Nov 26, 2024

Right, that's why I suggested that another short-term solution would be to use funcref rather than a typed function reference or a spawn_indirect variant.

@rossberg
Copy link
Member

Even if the parameter itself was funcref, you'd need to annotate its expected type somehow, otherwise the expected operand count and types wouldn't be known to the validator. Unless they're hardcoded in the semantics, but then there's no point in using funcref either. So again, I can't see how any of these options solves anything.

@tlively
Copy link
Member

tlively commented Nov 27, 2024

but then there's no point in using funcref either.

Why not? It still lets you dynamically choose which function to spawn the thread with, as long as it has the expected signature.

@rossberg
Copy link
Member

Yes, but what's the advantage over the status quo using a typed reference?

@tlively
Copy link
Member

tlively commented Nov 27, 2024

LLVM-based tooling would have a much easier time dealing with a funcref than dealing with a typed function reference.

@rossberg
Copy link
Member

@alexcrichton, I can't quite tell from your comment whether you are referring to function references in the signature generally being problematic for LLVM, or typed function references in particular. Can you clarify?

@alexcrichton
Copy link
Contributor

Personally I find both problematic. While LLVM tooling handles funcref there's at least no way in Rust for example to use funcref. IIRC the solution for C/C++ was to have a special compiler-defined type which has special rules about it (e.g. you can't take the address of it) but that doesn't work for Rust. This means that if the intrinsic here takes a funcref as an argument there's no way to actually call it from Rust. Rust will be forced to do the indirect table thing no matter what basically.

I'll reiterate I understand why the spec-level definition of this function takes a typed-function or a generic funcref, and I think that makes sense. Getting things working for Rust (and C/C++? unsure) can be considered "just a tooling problem go figure it out". I think that's reasonable too. @abrown's goal here is evaluation though as opposed to finishing a design, and in my opinion it would be far easier to evaluate spawn_indirect which takes a table index as an i32 and the intrinsic definition has a static immediate of which table to reference.

@tlively
Copy link
Member

tlively commented Dec 2, 2024

That sounds perfectly fine to me 👍

@conrad-watt
Copy link
Collaborator

This means that if the intrinsic here takes a funcref as an argument there's no way to actually call it from Rust. Rust will be forced to do the indirect table thing no matter what basically.

If the compilation scheme from rust with an "indirect table-style" operation would succeed, could the funcref version also work by emitting (table.get ...) (spawn_ref ...) to get the same effect?

the intrinsic definition has a static immediate of which table to reference

Is there a standard way of doing this in the CM (e.g. encoding the index in the import name)?

@alexcrichton
Copy link
Contributor

could the funcref version also work by emitting (table.get ...) (spawn_ref ...) to get the same effect?

Indeed! That would be more toolchain things to update to get to the point where things could be evaluated, so I could also understand if @abrown doesn't want to implement that quite just yet, but to me that's a possible argument for never standardizing the spawn_indirect intrinsic from prototyping.

Is there a standard way of doing this in the CM (e.g. encoding the index in the import name)?

Not currently unfortunately. Each intrinsic is a bit different from all the others in what it needs and what it can reference, so it's mostly a loose set of conventions right now for how components are created from core wasm modules. For example if you import a function that takes a string then the string has to be in some linear memory, and right now it's implicitly the export "memory" (although with WebAssembly/component-model#378 it'll be called cm32p2_memory). For this we'll have to invent a convention as well and assume that the table exported as __indirect_function_table (IIRC something like that is what wasm-ld does today) is intended for use with spawn_indirect.

@rossberg
Copy link
Member

rossberg commented Dec 2, 2024

@alexcrichton, do you mean spawn_indirect without a type index, and instead hardcoding the function type? If so, that sounds fine for experimentation, but it would obviously not be a "proper" design.

@alexcrichton
Copy link
Contributor

Yes that's what I've been thinking. I don't speak for Andrew though, so he might be thinking something else.

it would obviously not be a "proper" design

Personally I think this is debatable, not "obvious", but I'm happy to defer discussion of such to after experimentation/evaluation has been done.

@abrown
Copy link
Collaborator Author

abrown commented Dec 2, 2024

Here's what I'm thinking: I'm trying to carefully balance "moving this forward" with "doing things right." I remember how wasi-threads was "good enough" and that turned out to be a problem; it was a great experiment, but because it was good enough for some, it stuck around longer than others would have liked. If I create yet another prototype and it is "good enough," the low-level nature of this stuff and the prevalence of wasi-libc as a starting point could make it stick. So this time around it would be great if this spawn function signature was "right" from the beginning, since I expect more thrash on the thread local side of this anyways.

We've previously discussed a completely static start function: [c: i32] -> [] where the c points to a structure in memory that contains all the other data the thread may need (@rossberg do you remember where?). IIRC, @rossberg was against this on the grounds of giving languages freedom to choose how they want their start functions to look; this argument made sense to me, so we shifted to allow start functions with different signatures, which then meant we wanted to statically verify those signatures. I rehash all that to say: if we return to a completely static start function, this verification problem goes away and how we pass around the start function is easier. I actually would prefer not to do this as it bakes in a "C"-centric approach, but if we're already talking about passing around indexes to tables, maybe we're doing that anyways.

I am struggling to find the right balance here given the limitations of the toolchains, CM, state of implementation, etc. E.g.: is there really a way here to both move this forward and not permanently bake in "C"-centrism?

@tlively
Copy link
Member

tlively commented Dec 2, 2024

Is there another mechanism you can use to prevent people from using experimental features in production? Only enable it if an --i-promise-this-is-not-for-production flag is passed, perhaps? Name the source level function experimental_this_will_change_thread_spawn()?

@abrown
Copy link
Collaborator Author

abrown commented Dec 4, 2024

There probably is, but I guess I'm hoping this "start function" problem isn't the one to do that for. Maybe we just have to go with the C-like table + index approach... perhaps we should discuss in the next meeting?

@rossberg
Copy link
Member

rossberg commented Dec 5, 2024

(@rossberg do you remember where?)

The overview references issues #3 and #10 on this, which contains some of the discussion that I remember.

An index is fine for experiments, but baking it in long-term would be an awful hack, and highly biased one. If we go with that then I agree that we should find some way to make sure that it won't stuck.

@abrown
Copy link
Collaborator Author

abrown commented Dec 10, 2024

@rossberg, we discussed this in the threads subgroup today (a shame you couldn't attend!). Notes from the meeting are available here, but let me attempt to summarize where I ended up on this after the discussion:

  • there is no end-to-end change in expressivity by adding "spawn indirect:" I did not see this earlier (which is why I liked and agreed to go down the "spawn ref" route), but up at the component model level, the "spawn indirect" intrinsic will still have a type parameter describing the signature of the start function. That signature is lost down at the core level, but "preserved" somehow in the imported intrinsic name.
  • though the "spawn ref" form is preferable, for all users of wasi-libc (many) we still need what amounts to "spawn indirect;" @fitzgen's argument here is "why not just spec it then?" The extra indirection of "spawn indirect" — the table lookup and signature check — is unfortunate but, for languages based on wasi-libc, this would have to happen (hackily) anyways.
  • on whether we should spec (a) both "spawn ref" and "spawn indirect" or (b) just "spawn indirect" now and "spawn ref" later, I lean towards (b) to avoid confusion (the "spawn ref" thing will not be possible for some amount of time). But others advocated for (a), which is fine. There will be some time period were "spawn ref" is spec-ed but unimplemented, but that is no different than that we currently spec "any start function signature" but really "only allow [i32] -> []" for the time being.

@rossberg
Copy link
Member

Ah, thanks @brown for the summary. Looks like I missed the meeting again.

@tlively, can we please make sure that all meetings and their agendas are in the meetings repo? We shouldn't expect people to crawl issue lists across N repos to figure out what meetings are on.

As for the subject of discussion: having spawn_indirect with a restricted signature is okay with me, as long as it remains forward-compatible to general type signatures.

But I definitely think we need to have spawn_ref and a generalised spawnee type in the future — perhaps along with the component model properly supporting GC types. Because tables and memory do not make much sense for a reference- and GC-based language runtime, nor does a i32-based (i.e., pointer-based) spawn signature.

@fitzgen
Copy link
Contributor

fitzgen commented Dec 11, 2024

To clarify my position: I am in favor of spec'ing both spawn_ref and spawn_indirect. I am also in favor of spec'ing a fully general signature for spawn_indirect.

And as an intermediate, stepping-stone implementation state for purposes of expediently getting prototypes of a subset of everything we intend to spec into the hands of toolchains and early experimenters to play with, I am encouraging Andrew to start with implementing just the spawn_indirect instruction (potentially with a fixed signature if that is expedient) in Wasmtime and wasm-tools. Then, after that prototype exists, we can address the gnarly refactoring in wasm-tools that is required to support spawn_ref. I recognize that this prototyping path follows from accidental implementation details, nothing inherent. I am explicitly not arguing that we should only spec spawn_indirect and not spawn_ref because of tech debt that happens to exist in wasm-tools.

abrown added a commit to abrown/wasm-tools that referenced this issue Jan 30, 2025
As discussed in [bytecodealliance#89], this adds support for a new intrinsic,
`thread.spawn_indirect`. This new operation would allow spawning a
shared function stored in a table via a table index.

This leaves some future work undone:
- `thread.spawn` could/should be renamed to `thread.spawn_ref`
- `thread.spawn_indirect` could/should take the encoding byte from
  `thread.hw_concurrency`--swap `0x07` for `0x06`
- importantly, `thread.spawn_indirect` should gain a field indicating
  which type to expect in the indirect table; due to current limitations
  in `wasm-tools`, the locations to check once this is possible are
  marked with `TODO: spawn indirect types`.

[bytecodealliance#89]: WebAssembly/shared-everything-threads#89
abrown added a commit to abrown/wasm-tools that referenced this issue Feb 6, 2025
As discussed in [bytecodealliance#89], this adds support for a new intrinsic,
`thread.spawn_indirect`. This new operation would allow spawning a
shared function stored in a table via a table index.

This leaves some future work undone:
- `thread.spawn` could/should be renamed to `thread.spawn_ref`
- `thread.spawn_indirect` could/should take the encoding byte from
  `thread.hw_concurrency`--swap `0x07` for `0x06`
- importantly, `thread.spawn_indirect` should gain a field indicating
  which type to expect in the indirect table; due to current limitations
  in `wasm-tools`, the locations to check once this is possible are
  marked with `TODO: spawn indirect types`.

[bytecodealliance#89]: WebAssembly/shared-everything-threads#89
abrown added a commit to abrown/component-model that referenced this issue Feb 7, 2025
This change codifies the conclusions we arrived to in [WebAssembly#89]. It adds a
new way to spawn threads, `thread.spawn_indirect`, which retrieves the
thread start function from a table. This prompted me to rename
`thread.spawn` to `thread.spawn_ref`.

[WebAssembly#89]: WebAssembly/shared-everything-threads#89
abrown added a commit to abrown/component-model that referenced this issue Feb 7, 2025
This change codifies the conclusions we arrived to in [WebAssembly#89]. It adds a
new way to spawn threads, `thread.spawn_indirect`, which retrieves the
thread start function from a table. This prompted me to rename
`thread.spawn` to `thread.spawn_ref`.

[WebAssembly#89]: WebAssembly/shared-everything-threads#89
abrown added a commit to abrown/wasm-tools that referenced this issue Feb 7, 2025
As discussed in [bytecodealliance#89], this adds support for a new intrinsic,
`thread.spawn_indirect`. This new operation would allow spawning a
shared function stored in a table via a table index.

This leaves some future work undone:
- `thread.spawn` could/should be renamed to `thread.spawn_ref`
- `thread.spawn_indirect` could/should take the encoding byte from
  `thread.hw_concurrency`--swap `0x07` for `0x06`
- importantly, `thread.spawn_indirect` should gain a field indicating
  which type to expect in the indirect table; due to current limitations
  in `wasm-tools`, the locations to check once this is possible are
  marked with `TODO: spawn indirect types`.

[bytecodealliance#89]: WebAssembly/shared-everything-threads#89
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants