Proposal: Change the semantics of @frame to allow for inlining async calls #5277
Comments
Here's an example of the sort of optimization that is impossible:

    fn a() u64 {
        return b(4);
    }
    fn b(val: u32) u32 {
        suspend { global_frame = @frame(); }
        return val;
    }

In the optimal case, this would be optimized to

    fn a() u64 {
        suspend { global_frame = @frame(); }
        return 4;
    }

But this code is not equivalent with the current semantics.
I played around with a couple of test cases. To document my own conclusions (5&6):
Also, if the primary use case of |
That all makes sense. I think that because of the colorless function abstraction, direct async await pairs will be very common and should be easy to optimize.
I think we can accomplish getting the frame explicitly into the function with a wrapper like this if it's really necessary:

    fn asyncCallWithFrame(func: var, frame: *@Frame(func), other_args: var) void {
        _ = @asyncCall(asBytes(frame), func, .{frame} ++ other_args);
    }

If there were a way to access the return location in a function, it could be made more ergonomic in the case of stack-allocated frames:

    fn asyncCallWithFrame(func: var, other_args: var) @Frame(func) {
        const frame = @resultLocation();
        _ = @asyncCall(asBytes(frame), func, .{frame} ++ other_args);
    }

That said, there is only one thing I can think of that you can do that would require an awaitable reference to your own frame, which is to spawn an async function that will await the function that spawned it.

    fn frameAwareAsyncFn(frame: *@Frame(frameAwareAsyncFn), allocator: *Allocator) !void {
        try doStuffPart1();
        // kick off a continuation function that will await this one
        // Footgun: the frame must be heap allocated because the continuation function will
        // outlive this frame.
        const continuingFrame = try allocator.create(@Frame(doMoreStuff));
        // Footgun: If any `try` fails above this point, the caller must await this frame to
        // recover the error. If the try succeeds, the async call will await this frame and
        // the caller must not. This can potentially be resolved by the programmer with a
        // lot of work, but it turns into a huge mess of stuff that is essentially
        // reimplementing continuations at user level.
        _ = @asyncCall(asBytes(continuingFrame), doMoreStuff, .{allocator});
        // Footgun: somehow, `continuingFrame` must be awaited. We cannot await it here
        // because it is waiting on us, so that would cause a circular await chain that would
        // never resume. So now we have to store it in some external state that knows it
        // must await it from a different execution path.
        global_awaiters.enqueue(continuingFrame);
        try doStuffPart2();
    }
    fn doMoreStuff(prev_frame: anyframe->error{...}!void) !void {
        try doStuffPart3();
        // Footgun: If this fails, the caller must do the await, same as above.
        // If it succeeds, this function will await and the caller must not await.
        // Doing `defer await prev_frame` would be ideal, but prev_frame returns an error
        // and you can't put a `try` in a `defer`.
        try await prev_frame;
        try doStuffPart4();
    }

The following code calls all of the

    fn doStuffTheEasyWay() !void {
        try doStuffPart1();
        var part3 = async doStuffPart3();
        var part2 = async doStuffPart2();
        (await part3) catch |err| { (await part2) catch {}; return err; };
        try await part2;
        try doStuffPart4();
    }
Note that casting the In fact, since |
Sorry, I misunderstood. I interpreted the suggestion as "make downcasting |
I'm becoming more and more convinced that being able to await
Alternatively and equivalently, the caller could
More generally, if awaiting the frame is invalid, it gives the nice property that any call to
If there are any examples where |
I'm not sure the typing change would be all that helpful for this particular optimization, since inlining already makes a local copy to modify it in aggressive optimizations. I think a separate issue/proposal would be a better place to discuss that change. In short I would support returning I think Andrew knows the implementation best, so he can provide input on whether or not the typing change to |
I think it's helpful to have an explicit statement of intent, so here's my interpretation, I will edit this if incorrect. Key idea:
Additionally, in safe modes, we include a four-state field in a frame to indicate whether it is:
In unsafe modes, the first three are collapsed, and it becomes a flag (for |
I went spelunking in some generated async assembly last weekend, and came out with the realization that the current semantics of async, and @frame in particular, are preventing potential optimization opportunities. I think we can change them without breaking any actual use cases.

The problem comes from three properties:

1. @frame() returns @Frame(@thisFn()), which means each async function must have its own frame type
2. anyframe
3. resume must work on anyframe, which means each frame must have self-identifying information

These three things taken together mean that an async function can never be inlined into another async function, because doing so would make @frame() return the wrong type. So for this code:

- b must always have its own frame type, so it cannot be inlined into a
- a must have its own frame type (the return type from the async call), so it cannot be inlined into main.

This means that wrapping suspend in a library will always have a cost that can not be optimized away, which is a problem.

Obviously properties (2) and (3) are core parts of Zig's async, and shouldn't be changed. But I think we can get away with relaxing (1). If we change @frame() to return anyframe, it means that async functions are now eligible for inlining by expanding the frame of the parent function. We would also have to slightly change the definition of @Frame(foo) to be "the type returned by async foo()".

This would also allow a potential further optimization where frames are only generated by the keyword async, and all of the data needed to resume a frame is stored in one place at the beginning of that "call tree frame". Which would have the nice side-effect of making var x = async a(); resume x; valid if a() calls b() and b suspends.

Since copying, moving, or examining the contents of existing frames are not supported use cases, and awaiting your own frame is not allowed, I don't think this change breaks any existing use cases.
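
To make the effect of relaxing (1) concrete, here is a minimal sketch written in the async syntax of the Zig releases that were current when this issue was filed. It is an illustration under stated assumptions, not code taken from the proposal: the names global_frame, a, b, and main simply mirror the ones used in this thread, and the program has not been checked against a specific compiler version.

```zig
// Assumption: a single global slot is enough to hand the suspended frame
// back to main. Real code would use a queue or an event loop instead.
var global_frame: anyframe = undefined;

// Hypothetical library wrapper around `suspend`.
fn b() void {
    suspend {
        // Current semantics: @frame() is a pointer to b's own frame type
        // and is coerced to anyframe by this assignment.
        global_frame = @frame();
    }
}

fn a() void {
    // Plain call: a's frame has to embed a distinct frame for b.
    b();
}

pub fn main() void {
    // Runs a, and therefore b, up to the suspend point.
    var x = async a();

    // Current semantics: the only valid handle is the one b stored.
    resume global_frame;

    // Under the proposal, b could be inlined into a, the call tree would
    // share the frame created by `async`, and `resume x;` would be an
    // equally valid way to continue from the suspend point.
}
```

The point of the sketch is that b is only a thin wrapper, yet today it forces a separate frame type because of what @frame() returns inside it. If @frame() returned anyframe, nothing in this program would depend on b having a frame of its own.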