Remote caching should be a strategy #18245
Comments
For point #2, would changing AbstractSpawnStrategy.java:134 do it?
I have a similar use case where I would like to configure different endpoints for the remote cache and the remote executor. For example, I have a remote cache service A but a remote executor B, and remote executor B has its own remote cache implementation. What I want is for Bazel to check remote cache A for a hit; if there is none, Bazel should fall back to remote execution, upload all inputs, and let the remote executor perform the job. Is this possible in the current version?
This is possible today. The caveat is that your remote executor needs to upload its execution results to your remote cache service; otherwise, Bazel will fail to fetch them after remote execution.
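For reference, a minimal .bazelrc sketch of such a setup, assuming gRPC endpoints; the hostnames below are placeholders:

  # Placeholder endpoints: cache service A and remote executor B.
  build --remote_cache=grpcs://cache-a.example.com
  build --remote_executor=grpcs://executor-b.example.com

With both flags set, Bazel performs cache lookups against the --remote_cache endpoint and falls back to remote execution on a miss, which is the behavior the comments above describe.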
This new flag is similar in spirit to --remote_accept_cached but allows being more selective about what's accepted and what's not.

The specific problem I face is the following: we have a setup where we want to use dynamic execution for performance reasons. However, we know some actions in our build (those run by rules_foreign_cc) are not deterministic. To mitigate this, we force the actions that we know are not deterministic to run remotely, without dynamic execution, as this will prevent exposing the non-determinism for as long as they are cached and until we can fix their problems.

However, we still observe non-deterministic actions in the build and we need to diagnose what those are. To do this, I need to run two builds and compare their execlogs. And I need these builds to continue to reuse the non-deterministic artifacts we _already_ know about from the cache, but to rerun other local actions from scratch. Unfortunately, the fact that "remote-cache" is not a strategy (see bazelbuild#18245) makes this very difficult to do because, even if I configure certain actions to run locally unconditionally, the spawn strategy insists on checking the remote cache for them. With this new flag, I can run a build where the remote actions remain remote but where I disable the dynamic scheduler and force the remaining actions to re-run locally.

I'm marking the flag as experimental because this feels like a huge kludge to paper over the fact that the remote cache should really be a strategy, but isn't. In other words: this flag should go away with a better rearchitecting of the remote caching interface.

Upstream PR: bazelbuild#18944 (Julio Merino <julio.merino+oss@snowflake.com>, Fri Jul 14 10:32:41 2023 -0700)
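For illustration only, a .bazelrc sketch of the setup described above, under these assumptions: CMakeConfigure is a placeholder for the real rules_foreign_cc mnemonics, dynamic execution is enabled via --internal_spawn_scheduler, and the experimental flag from the PR itself is omitted since its name is not stated here.

  # Force the known non-deterministic actions (placeholder mnemonic) to
  # run remotely only, outside the dynamic scheduler.
  build --strategy=CMakeConfigure=remote

  # Normal builds: dynamic execution for everything else.
  build:normal --internal_spawn_scheduler
  build:normal --spawn_strategy=dynamic

  # Diagnostic builds: no dynamic scheduler; everything else runs locally.
  # Without the new flag, Bazel still consults the remote cache for these
  # local actions, which is the limitation described above.
  build:diagnose --spawn_strategy=local

The diagnostic build would then be invoked with --config=diagnose, and two such builds compared via their execution logs.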
Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity for more than a year. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post a comment here and the issue will no longer be marked as stale.
Just wanted to add that I think this is an important feature. For projects with smaller codebases but large dependencies, remote caching is only a win when it is limited to the large dependencies; the small projects can be built much faster locally, and incremental runtimes become dominated by artifact uploading. Today, the only real solution is to manually tag everything with "no-remote-cache-upload", which is not practical.
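As an aside, a sketch of a less manual way to apply that execution requirement broadly, assuming your Bazel version honors no-remote-cache-upload as an execution-info key and supports --modify_execution_info; the mnemonics in the regex are placeholders:

  # Keep reading from the remote cache, but skip uploading outputs for the
  # selected (placeholder) mnemonics instead of tagging each target by hand.
  build --modify_execution_info=^(Javac|CppCompile)$=+no-remote-cache-upload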
Description of the feature request:
When configuring Bazel to use a remote cache without remote execution, the remote cache is not exposed as a strategy as far as I can tell. (Bazel does print remote-cache in the list of strategies when scheduling an action, but this value cannot be explicitly selected via the strategy flags, so it is misleading.) The use of a remote cache should be a strategy that works well with any other strategy, such as dynamic, and that can be selectively enabled for individual actions.
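To make the request concrete, here is a sketch contrasting the strategy selection that works today with the kind of composition this issue asks for; the second form is hypothetical and not accepted by current Bazel:

  # Works today: pick among existing strategies, globally or per mnemonic.
  build --spawn_strategy=remote,worker,sandboxed,local
  build --strategy=Genrule=local

  # Requested (hypothetical, rejected by current Bazel): compose the
  # remote cache as a strategy that can be enabled per action.
  # build --strategy=Genrule=remote-cache,local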
What underlying problem are you trying to solve with this feature?
We are facing two issues when remote caching is enabled: the remote-cache magic identifier cannot be passed to --strategy (more context in Expose workspace provenance for strategy selection #18244).
Which operating system are you running Bazel on?
N/A
What is the output of bazel info release?
bazel-6.1.1
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response