-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Native execution of very simple WASM jobs #816
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. Land it 🎉
(please make CI pass and resolve the conflicts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is AMAZING - really good work :-)
There is one shift I want to make and it's mainly to do with hoisting the WASM codepath up a level and away from the language job codepath.
Top level CLI command
Let's have a specific run_wasm.go
command file in cmd
and have wasm as a top level command like docker.
What has confused this is the language jobs where we say:
bacalhau run python ...
And for Docker we have:
bacalhau docker run ...
I think WASM is an equal to Docker so it's:
bacalhau wasm run ...
The confusion is the "language job" aspect of things (i.e. bacalhau run python
) is actually just a convenience wrapper around the bacalhau docker run
command.
WASM is very much a new - top level alternative to Docker, not an equivalent to the Python language job.
Sorry the existing language job stuff is confusing UX and in my opinion we might think about how that stuff works - i.e. that bacal hau run
is a really top level CLI command but it ends up as a wrapper around the Docker executor is confusing.
The most important thing is treat WASM the same as Docker in terms of layering so let's give it the same treatment (i.e. top level CLI command) - this leads onto:
Top level job spec
Same idea - currently we are using the language
field in the job spec to represent the WASM job.
I think we should have a new JobSpecWASM
struct and a new field in Spec
called WASM
Then - when we are running bacalhau wasm run
we are populating the Spec.WASM
field and not the Spec.Language
field.
How to reference WASM binary
I think using the language job feature of "upload some context" as a way to reference the program is a useful wrapper - but I think we should try to reference the WASM binary as a storage spec struct INSIDE the JobSpecWASM
field of the job. Ack that the language job stuff uses the "upload context" feature but let's move the WASM stuff away from language job and into a pure implementation of it's own.
Especially when we start putting WASM jobs as a step in a DAG - we won't have the concept of a context folder that was uploaded.
So - let's think "there is a WASM binary out on IPFS somewhere" - it could be:
bacalhau wasm run ipfs://Qm........
Which would turn into a JobSpecWASM
struct with a StorageSpec
field inside it that points to the WASM binary. We can then have a convenience wrapper around this that says "upload a context folder" to IPFS first and then include the CID in the job spec.
That is great feedback, thanks @binocarlos! The reuse of the language command was definitely a lazy step to avoid having to code all that stuff from scratch. Now that I've seen the thing hooked up and working, thinking about how all the commands are factored and having a proper spec was definitely high up on the todo list – so I'll scratch that now. I figured that we would at some point need to unify My take was that the choice would be Agreed that having a context folder was just a stop gap. One thing I haven't figured out yet is how to make it as easy to get a WASM binary onto IPFS as uploading a folder is (which I think is why this feature was added to the Python executor?) If running a dev stack, I can obviously put something there, but atm I don't need to have an IPFS client installed to use Bacalhau – is there a need for some command like Finally, the comment for |
Excellent good stuff - sounds like you have the right approach here then - I think my only worry was that the language job stuff confused things but it sounds like you've got a good sense of things (which is awesome in the first week ⭐ )
I think the main point of having
I think for now let's keep the clean seperation of top level job types (currently
Yup this is a good point - I think I'm coming at this from a pure - So clearly - using the language job context feature is a useful bridge to get that done - ultimately I think we should:
My thinking here is the @simonwo I think let's land this PR and then we can iterate on all the points above :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge this and we can iterate on the design later - awesome work!
The `--context-path` flag bundles up stuff and puts it into the job spec for transport to the node. At the moment, the code assumes that what's passed is a directory, but it's perfectly acceptable to pass a file. This commit removes this assumption so single files can be passed using `--context-path`.
The "language" executor currently assumes that Python 3.10 is the only language and this is hard-coded in a number of places. This commit just removes this hard-coding and drives the executor based on the data in the supportedVersions field instead.
This commit adds an Executor that will download a blob of WASM and run it. The WASM to run and function to execute within it can be specified within the JobSpec and on the command-line. The supported WASM is required to be very simple: - It cannot require any imports; so no libraries or syscalls. - The function to call must not take any parameters. - The function to call must return a single int32. So this means it can't really deal in any input or output data yet, so this commit just sets up the infrastructure and demonstrates the concept. We can work out how to do this in the next iteration. The attached WASM used as test input calculates the day of Easter in April 2022, which is a complicated but purely numerical algorithm.
As discussed in #816, WASM is very much a new, top-level alternative to Docker, not an equivalent to the Python language job. The most important thing is treat WASM the same as Docker in terms of layering so let's give it the same treatment (i.e. top level CLI command).
This commit adds an Executor that will download a blob of WASM and run it. The WASM to run and function to execute within it can be specified within the JobSpec and on the command-line.
The supported WASM is required to be very simple:
So this means it can't really deal in any input or output data yet, so this commit just sets up the infrastructure and demonstrates the concept. We can work out how to do this in the next iteration.
The attached WASM used as test input calculates the day of Easter in April 2022, which is a complicated but purely numerical algorithm.