Native execution of very simple WASM jobs #816

Merged: 5 commits merged into main on Oct 7, 2022

Conversation

@simonwo (Contributor) commented on Oct 6, 2022:

This commit adds an Executor that will download a blob of WASM and run it. The WASM to run and function to execute within it can be specified within the JobSpec and on the command-line.

The supported WASM is required to be very simple:

  • It cannot require any imports; so no libraries or syscalls.
  • The function to call must not take any parameters.
  • The function to call must return a single int32.

This means it can't really handle any input or output data yet, so this commit just sets up the infrastructure and demonstrates the concept. We can work out how to do that in the next iteration.

The attached WASM used as test input calculates the day of Easter in April 2022, which is a complicated but purely numerical algorithm.
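
Not part of this PR's description, but purely as an illustration of the constraints above: a minimal sketch of invoking such a module from Go using the wazero runtime. The file name, the `easter` export name, and the choice of wazero are assumptions for the example, not necessarily what the executor in this commit does.

```go
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/tetratelabs/wazero"
)

func main() {
	ctx := context.Background()

	// Read the WASM blob (e.g. the Easter-date test fixture); the path is illustrative.
	wasmBytes, err := os.ReadFile("easter.wasm")
	if err != nil {
		panic(err)
	}

	// A plain runtime with no host functions registered: a module that needs
	// imports will fail to instantiate, matching the "no libraries or syscalls"
	// constraint above.
	r := wazero.NewRuntime(ctx)
	defer r.Close(ctx)

	mod, err := r.Instantiate(ctx, wasmBytes)
	if err != nil {
		panic(err)
	}

	// Call a zero-parameter export that returns a single i32
	// ("easter" is a hypothetical export name).
	results, err := mod.ExportedFunction("easter").Call(ctx)
	if err != nil {
		panic(err)
	}
	fmt.Println("result:", int32(results[0]))
}
```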

@lukemarsden (Contributor) left a comment:

Awesome. Land it 🎉

@lukemarsden (Contributor):

(please make CI pass and resolve the conflicts)

@binocarlos (Contributor) left a comment:

This is AMAZING - really good work :-)

There is one shift I want to make and it's mainly to do with hoisting the WASM codepath up a level and away from the language job codepath.

Top level CLI command

Let's have a specific run_wasm.go command file in cmd and have wasm as a top level command like docker.
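
For illustration only (not code from this PR), a sketch of how a top-level wasm command with a run subcommand could be wired up, assuming the existing cobra-based CLI; the submitWasmJob helper is hypothetical.

```go
// Illustrative sketch only - not the actual run_wasm.go from this PR.
package main

import (
	"context"
	"fmt"

	"github.com/spf13/cobra"
)

// submitWasmJob is a hypothetical helper that would build a WASM job spec
// from the given reference and submit it to the network.
func submitWasmJob(ctx context.Context, wasmRef string) error {
	fmt.Println("would submit a WASM job for", wasmRef)
	return nil
}

func main() {
	rootCmd := &cobra.Command{Use: "bacalhau"}

	// Top-level `wasm` command, sitting alongside `docker`.
	wasmCmd := &cobra.Command{
		Use:   "wasm",
		Short: "Run and manage WASM jobs on the network",
	}

	// `bacalhau wasm run ...`, mirroring `bacalhau docker run ...`.
	runWasmCmd := &cobra.Command{
		Use:   "run [wasm CID or path]",
		Short: "Run a WASM job on the network",
		Args:  cobra.ExactArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			return submitWasmJob(cmd.Context(), args[0])
		},
	}

	wasmCmd.AddCommand(runWasmCmd)
	rootCmd.AddCommand(wasmCmd)
	cobra.CheckErr(rootCmd.Execute())
}
```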

What has confused this is the language jobs where we say:

bacalhau run python ...

And for Docker we have:

bacalhau docker run ...

I think WASM is an equal to Docker so it's:

bacalhau wasm run ...

The confusion is that the "language job" aspect of things (i.e. bacalhau run python) is actually just a convenience wrapper around the bacalhau docker run command.

WASM is very much a new - top level alternative to Docker, not an equivalent to the Python language job.

Sorry, the existing language job stuff is confusing UX, and in my opinion we might rethink how it works - i.e. the fact that bacalhau run is a top-level CLI command but ends up as a wrapper around the Docker executor is confusing.

The most important thing is to treat WASM the same as Docker in terms of layering, so let's give it the same treatment (i.e. a top-level CLI command) - this leads on to:

Top level job spec

Same idea - currently we are using the language field in the job spec to represent the WASM job.

I think we should have a new JobSpecWASM struct and a new field in Spec called WASM

Then - when we are running bacalhau wasm run we are populating the Spec.WASM field and not the Spec.Language field.

How to reference WASM binary

I think using the language job feature of "upload some context" as a way to reference the program is a useful wrapper - but I think we should try to reference the WASM binary as a storage spec struct INSIDE the JobSpecWASM field of the job. Ack that the language job stuff uses the "upload context" feature, but let's move the WASM stuff away from language jobs and into a pure implementation of its own.

Especially when we start putting WASM jobs as a step in a DAG - we won't have the concept of a context folder that was uploaded.

So - let's think "there is a WASM binary out on IPFS somewhere" - it could be:

bacalhau wasm run ipfs://Qm........

Which would turn into a JobSpecWASM struct with a StorageSpec field inside it that points to the WASM binary. We can then have a convenience wrapper around this that says "upload a context folder" to IPFS first and then include the CID in the job spec.
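
A rough sketch of that shape in Go, purely to illustrate the idea - all names and fields here are assumptions, not the final API:

```go
// Illustrative sketch of the proposed spec shapes; names and fields are not final.
package model

// StorageSpec describes where some data lives (e.g. an IPFS CID); Bacalhau
// already has a struct like this for job inputs and outputs.
type StorageSpec struct {
	Engine string // e.g. "ipfs"
	CID    string
	Path   string // where the data should appear for the job
}

// JobSpecWasm carries everything the WASM executor needs to run the program,
// parallel to how JobSpecDocker carries Image, Entrypoint, Env, etc.
type JobSpecWasm struct {
	// The WASM binary itself, referenced as a storage spec rather than via an
	// uploaded context folder (e.g. populated from `bacalhau wasm run ipfs://Qm...`).
	EntryModule StorageSpec

	// The exported function to call; for now it must take no parameters and
	// return a single int32.
	EntryPoint string
}

// Spec would gain a Wasm field, populated by `bacalhau wasm run` instead of
// the Language field used by the language jobs.
type Spec struct {
	Wasm JobSpecWasm
	// ... existing fields: Docker, Language, inputs, outputs, etc.
}
```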

@simonwo (Contributor, Author) commented on Oct 7, 2022:

That is great feedback, thanks @binocarlos!

The reuse of the language command was definitely a lazy step to avoid having to code all that stuff from scratch. Now that I've seen the thing hooked up and working, thinking about how all the commands are factored and having a proper spec was definitely high up on the todo list – so I'll scratch that now.

I figured that we would at some point need to unify bacalhau docker run and bacalhau run python behind some common interface that will scale as we add more executors (and could e.g. allow users to define their own "templates" which are first class – e.g. bacalhau run my template).

My take was that the choice would be bacalhau run <…> because at the moment the Docker cmd doesn't seem to have any other subcommands and I can't think what would go underneath docker other than run. But also, clearly use of bacalhau docker run is already prevalent and mirrors the docker run command, so changing that would be hard. Happy to go with bacalhau wasm run though, but what do you think of eventually moving to e.g. bacalhau run docker?

Agreed that having a context folder was just a stop gap. One thing I haven't figured out yet is how to make it as easy to get a WASM binary onto IPFS as uploading a folder is (which I think is why this feature was added to the Python executor?) If running a dev stack, I can obviously put something there, but atm I don't need to have an IPFS client installed to use Bacalhau – is there a need for some command like bacalhau devstack put that will upload something to a dev stack and return the CID using the builtin IPFS client?

Finally, the comment for Spec.Contexts says "Input volumes that will not be sharded, for example to upload code into a base image. Every shard will get the full range of context volumes". Is that the right place to put a WASM binary StorageSpec, rather than in its own job spec extension?

@binocarlos (Contributor):

Now that I've seen the thing hooked up and working, thinking about how all the commands are factored and having a proper spec was definitely high up on the todo list – so I'll scratch that now.

Excellent good stuff - sounds like you have the right approach here then - I think my only worry was that the language job stuff confused things but it sounds like you've got a good sense of things (which is awesome in the first week ⭐ )

My take was that the choice would be bacalhau run <…> because at the moment the Docker cmd doesn't seem to have any other subcommands and I can't think what would go underneath docker other than run

I think the main point of having bacalhau docker run rather than bacalhau run XXX is:

  • we can now have sub-commands specific to the executor - for example bacalhau wasm upload
  • context-specific flags are cleanly separated
  • it reflects the split between the job specs - i.e.
    • bacalhau docker run -> JobSpecDocker
    • bacalhau wasm run -> JobSpecWASM

what do you think of eventually moving to e.g. bacalhau run docker

I think for now let's keep the clean separation of top-level job types (currently docker vs wasm - let's ignore language as a UX improvement)

Agreed that having a context folder was just a stop gap. One thing I haven't figured out yet is how to make it as easy to get a WASM binary onto IPFS as uploading a folder is (which I think is why this feature was added to the Python executor?) If running a dev stack, I can obviously put something there, but atm I don't need to have an IPFS client installed to use Bacalhau – is there a need for some command like bacalhau devstack put that will upload something to a dev stack and return the CID using the builtin IPFS client?

Yup, this is a good point - I think I'm coming at this from a pure "assume there is already a CID on IPFS with the WASM binary" angle, and leaving the "yes, but how the hell do I get that?" part to the user.

So clearly - using the language job context feature is a useful bridge to get that done - ultimately I think we should:

  • run WASM jobs that live on IPFS (because this unlocks various workflows)
  • give the user easy ways to get their WASM onto IPFS (so the UX is nice)
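
For the second point, a rough sketch of what such an upload helper could do under the hood, assuming a reachable IPFS API endpoint and the go-ipfs-api client - nothing here is from this PR:

```go
package main

import (
	"fmt"
	"os"

	shell "github.com/ipfs/go-ipfs-api"
)

func main() {
	// Connect to an IPFS API endpoint - for a dev stack this would be the
	// stack's built-in IPFS node; the address here is illustrative.
	sh := shell.NewShell("localhost:5001")

	// The compiled WASM binary to publish; the filename is illustrative.
	f, err := os.Open("main.wasm")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Add the file and print the CID, which can then be referenced from the
	// job spec (e.g. `bacalhau wasm run ipfs://<cid>`).
	cid, err := sh.Add(f)
	if err != nil {
		panic(err)
	}
	fmt.Println("ipfs://" + cid)
}
```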

Finally, the comment for Spec.Contexts says "Input volumes that will not be sharded, for example to upload code into a base image. Every shard will get the full range of context volumes". Is that the right place to put a WASM binary StorageSpec, rather than in its own job spec extension?

My thinking here is that JobSpecDocker has all the info required for the executor to run a program, just minus the storage inputs/outputs (so Image, Entrypoint, Env etc), so the equivalent would be that JobSpecWASM has all the info required for the executor to run the program (so the WASM binary location).

@simonwo I think let's land this PR and then we can iterate on all the points above :-)

@binocarlos (Contributor) left a comment:

Let's merge this and we can iterate on the design later - awesome work!

Commits

The `--context-path` flag bundles up stuff and puts it into the job
spec for transport to the node. At the moment, the code assumes that
what's passed is a directory, but it's perfectly acceptable to pass a
file.

This commit removes this assumption so single files can be passed
using `--context-path`.
The "language" executor currently assumes that Python 3.10 is the only
language and this is hard-coded in a number of places. This commit
just removes this hard-coding and drives the executor based on the
data in the supportedVersions field instead.

This commit adds an Executor that will download a blob of WASM and run
it. The WASM to run and function to execute within it can be
specified within the JobSpec and on the command-line.

The supported WASM is required to be very simple:

- It cannot require any imports; so no libraries or syscalls.
- The function to call must not take any parameters.
- The function to call must return a single int32.

So this means it can't really deal in any input or output data yet, so
this commit just sets up the infrastructure and demonstrates the
concept. We can work out how to do this in the next iteration.

The attached WASM used as test input calculates the day of Easter in
April 2022, which is a complicated but purely numerical algorithm.

As discussed in #816, WASM is very much a new, top-level alternative
to Docker, not an equivalent to the Python language job.

The most important thing is treat WASM the same as Docker in terms of
layering so let's give it the same treatment (i.e. top level CLI
command).

@simonwo merged commit f014037 into main on Oct 7, 2022