Reusable run-build workflow #44
Conversation
Add `write-envdir` script for managing envvars that the Nextstrain CLI does not automatically forward to builds. This will be used in our reusable workflow for running Nextstrain builds. Originally written by @tsibley, copied unmodified from the ncov-ingest repo.¹

¹ https://github.com/nextstrain/ncov-ingest/blob/2a8f5bb4419998d5e3e13c97ebd03a1780ea47c2/bin/write-envdir
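The actual script is the one linked above; as a rough illustration of the envdir convention it targets, a minimal sketch (function and directory names hypothetical, not the vendored code) might look like:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical sketch: write each NAME=VALUE argument as a file NAME
# containing VALUE under a directory -- the daemontools envdir layout
# that an `envdir` program later reads back into the environment.
write_envdir_demo() {
    local dir="$1"; shift
    mkdir -p "$dir"
    local pair name value
    for pair in "$@"; do
        name="${pair%%=*}"     # text before the first "="
        value="${pair#*=}"     # text after the first "="
        printf '%s' "$value" > "$dir/$name"
    done
}

write_envdir_demo /tmp/demo-envdir FOO=bar AWS_DEFAULT_REGION=us-east-1
```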
Modifies the "Set environment variables" step from the pathogen-repo-ci workflow¹ into two separate scripts, since we will use the same method to set environment variables in other reusable workflows. I decided to create `json-to-envvars` as a separate script that `yaml-to-envvars` calls because I will be using it to set secrets as environment variables as well. The `--varnames` option will allow us to be explicit about which secrets we want to set as environment variables.

¹ https://github.com/nextstrain/.github/blob/cc6f4385a45bd6ed114ab4840416fd90cc46cd1b/.github/workflows/pathogen-repo-ci.yaml#L183-L198
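To illustrate the underlying mechanism (the script names and `--varnames` option above are real; the data below is made up): GitHub Actions picks up environment variables from NAME=VALUE lines appended to the file named by `$GITHUB_ENV`, and an allowlist of variable names controls which pairs get written.

```shell
#!/bin/bash
set -euo pipefail

# Demo only: in a real workflow step $GITHUB_ENV is provided by the runner.
GITHUB_ENV=$(mktemp)

# An allowlist in the spirit of the --varnames option.
varnames="NEXTSTRAIN_DOCKER_IMAGE SLACK_CHANNEL"

# Stand-in for parsed YAML/JSON input (values are made up).
declare -A vars=(
  [NEXTSTRAIN_DOCKER_IMAGE]="nextstrain/base"
  [SLACK_CHANNEL]="#scratch"
  [SECRET_TOKEN]="not-forwarded"
)

# Only allowlisted names become environment variables for later steps.
for name in $varnames; do
    echo "$name=${vars[$name]}" >> "$GITHUB_ENV"
done

cat "$GITHUB_ENV"
```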
Modifies the "Generate summary" step from the conda-base repo CI workflow¹ into a separate script that can be used to interpolate any text template with environment variables. I will be using this in the reusable run-build workflow to create the GitHub Action summary for AWS Batch builds. I also see this being useful for creating Slack messages for automated builds such as the onstart and onerror notifications that we use in ncov² if we decide to move the notifications out of the pathogen workflows and into the GitHub Action workflows.

¹ https://github.com/nextstrain/conda-base/blob/5655133be6864ede0acbf40b6b06222f9f2dc031/.github/workflows/ci.yaml#L107-L112
² https://github.com/nextstrain/ncov/blob/fcac1d16410b1b119a9af70afe7b9325a6e25083/workflow/snakemake_rules/export_for_nextstrain.smk#L499-L519
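As a sketch of the interpolation idea (not the actual script), a naive pure-bash version that expands `${VAR}` placeholders for variable names listed by the caller could look like:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical sketch: read a text template on stdin and replace each
# ${NAME} placeholder (for the NAMEs given as arguments) with the value
# of that environment variable.
interpolate() {
    local text name
    text=$(cat)                        # template from stdin
    for name in "$@"; do
        text="${text//"\${$name}"/${!name}}"
    done
    printf '%s\n' "$text"
}

export AWS_BATCH_JOB_ID="abc-123"
echo 'AWS Batch job ${AWS_BATCH_JOB_ID} submitted.' | interpolate AWS_BATCH_JOB_ID
```

A real implementation might instead lean on envsubst(1); the pure-bash version avoids that dependency.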
Add a reusable workflow to run Nextstrain builds so that we can centralize and standardize how our pathogen repos set up these builds. Motivated by the latest hiccup with the Snakemake upgrade that required fixes across multiple pathogen repos.¹ There are some "hacky" steps to get around the limitations of reusable workflows:

1. The 'ref' input is required to ensure that we check out the matching ref within the reusable workflow, since it is currently not possible to detect the called workflow's ref.²
2. Since reusable workflows cannot access environment variables set in the calling workflow, this uses the yaml-to-envvars workaround with an `env` input already used in the pathogen-repo-ci workflow.³
3. We use a json-to-envvars workaround to allow for setting secrets as environment variables for the build runtime dynamically.

¹ https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1681328249848399
² actions/toolkit#1264
³ https://github.com/nextstrain/.github/blob/cc6f4385a45bd6ed114ab4840416fd90cc46cd1b/.github/workflows/pathogen-repo-ci.yaml#L25-L45
Notice this is calling the local run-build workflow with the input ref pointing at the commit SHA that triggered the workflow, so both should be using the same ref.
.github/workflows/run-build.yaml
Outdated
There is a lot of overlap with the pathogen-repo-ci workflow, but I think it's okay to keep them separate for better separation of concerns? We could go down the nested reusable workflow route, but I haven't really tested that to know if there are any pitfalls. A more straightforward path is to refactor the pathogen-repo-ci workflow to use the same `ref` input method and the same local scripts that I had pulled out.
I think fine to keep separate, especially for now. If they continue to be highly parallel over time, we can converge them then.
.github/workflows/run-build.yaml
Outdated
Agreed with this, and I think worth resolving/addressing all the comments over in #39 as part of the review/iteration on this PR.
Force-pushed from c0f10a0 to a57a1b6
Noting that this is just fine. We really only need one local runtime (e.g. Docker) and one remote runtime (e.g. AWS Batch) for the goals of this workflow. The Conda runtime is probably easy enough to support to allow it, but we're best off preferring containerized runtimes (e.g. Docker, Singularity, AWS Batch) for their better isolation/reproducibility/robustness.
I think this version as a workflow is nicer and a better direction! Thanks for doing it. Lots of little comments, mostly minor and about polish/consistency.
This provides an `envdir` program to our Conda runtime since our other runtimes also provide this useful program and we've felt its absence.¹ Since there's no existing Conda packaging of the Python envdir distribution² we use elsewhere, or any other `envdir` program AFAICT³, the particular program provided is a port I wrote long ago⁴ and still use daily on my own machines. It's vendored into this repository and built into the Conda packages we're already producing here. This is a quick and low-overhead solution, and we may want to switch to a packaging of the Python envdir distribution later (particularly if we want to use the Python envdir module that distribution also provides). Resolves <#35>.

¹ <nextstrain/.github#44 (comment)>
² <https://pypi.org/project/envdir/>
³ I checked for both daemontools, which provides the original, and s6, which provides another port, as well as a general search for "envdir".
⁴ <https://github.com/tsibley/envdir>
Force-pushed from 34102a6 to 29e0a58
Convert null values to empty strings so that we do not set the string "null" as an environment variable value. I had considered just skipping key/value pairs where the value is null, but I think I'd want to see that the environment variable is empty in the GitHub Action logs. This change was motivated by a failed test seasonal-flu build on AWS Batch. The workflow passed `NEXTSTRAIN_DOCKER_IMAGE: null` to the Nextstrain CLI and the AWS Batch job tried to pull a `null` image. With this change, the Nextstrain CLI sees the environment variable as a falsy value and the AWS Batch job pulled the default latest image.
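The coercion described above boils down to something like this (a minimal sketch; the helper name is made up):

```shell
#!/bin/bash
set -euo pipefail

# If a YAML/JSON null was stringified to the literal "null", substitute
# an empty string so consumers (e.g. the Nextstrain CLI) see a falsy
# value rather than the string "null".
coerce_null() {
    local value="$1"
    [ "$value" = "null" ] && value=""
    printf '%s' "$value"
}

NEXTSTRAIN_DOCKER_IMAGE=$(coerce_null "null")
echo "NEXTSTRAIN_DOCKER_IMAGE='${NEXTSTRAIN_DOCKER_IMAGE}'"   # prints NEXTSTRAIN_DOCKER_IMAGE=''
```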
Include a direct URL to the AWS Batch console in our AWS Batch summary, as requested by @jameshadfield.¹ Note: in order for the URL in the summary to _not_ be masked, the repo cannot have the `us-east-1` value as a secret. This means we would need to audit repos for the `AWS_DEFAULT_REGION` secret and remove it where possible.

¹ #39 (comment)
Update input descriptions based on feedback from @jameshadfield in review: - #39 (comment) - #39 (comment)
Adds a composite action that can be used to get a reusable workflow's own context within itself. Modified from @tsibley's prototype: https://github.com/tsibley/github-actions-testing/blob/7fbf8f0e895c6e80f2722e3a08925a4e27536cc9/.github/workflows/oidc.yaml#L6-L25 This will allow us to use actions and scripts within the `run-build` workflow at the same calling ref without asking the calling workflow to provide the `ref` input.
Replace the `ref` input with our new `workflow-context` composite action to get the repository and sha of the called reusable workflow of the current run. This removes the burden of having to provide the ref as an input by the caller workflow. However, the caller workflow does need to provide the permissions `id-token: write` in order to use this workflow. This has been reflected in the workflow-template.
Rename to `bin/interpolate-env` to be clearer on what the script is doing and require caller to provide text via stdin for better interoperability.
Because "build" is such an overloaded term, rename the workflow to be more specific and to parallel the existing `pathogen-repo-ci` workflow.
Includes a default set of paths that are the common output paths for pathogen builds based on the artifact paths used in pathogen-repo-ci.
Force-pushed from 29e0a58 to f82eba9
Set the default shell to the same command as the GitHub Action `bash` keyword.¹ Completely spelling out the bash command here so that GitHub can't change it out from under us and we don't have to refer to the docs to know the expected behavior.

¹ https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsshell
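Per GitHub's documentation, `shell: bash` expands to the invocation below on non-Windows runners, so a workflow fragment pinning that behavior would look roughly like:

```yaml
defaults:
  run:
    # Same command GitHub currently uses for `shell: bash`; spelled out
    # so the behavior (-e, -o pipefail, no profile/rc files) is pinned.
    shell: bash --noprofile --norc -eo pipefail {0}
```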
Force-pushed from 9dda436 to ce76a84
Switch the directory organization within the workflow so that the nextstrain/.github repo is in a subdir and the pathogen repo is now in the top-level working dir. This ensures that we don't have to worry about prefixing the artifact paths. We are cloning the nextstrain/.github repo into a hardcoded excluded path¹ for AWS Batch jobs so that it does not get uploaded with the job. I had explored other ways to try to exclude the extra support files from the AWS Batch jobs, but they required some extra workarounds that we don't want to deal with due to limitations of GitHub Actions.²

¹ https://github.com/nextstrain/cli/blob/6e36f6d7bdafcb984f62e2a9cc2a9cf9adf0be57/nextstrain/cli/runner/aws_batch/s3.py#L51-L66
² #44 (comment)
Force-pushed from 7155b1c to 9c96172
Force-pushed from 9c96172 to b9983c0
Force-pushed from b9983c0 to aa7ce26
Include the default artifact paths in the upload-artifact step so that callers do not have to copy/paste defaults to add an additional path. Callers will have to add exclusion patterns for any default paths that they do not want to include in the uploaded artifact.
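For illustration only (the input name and paths below are hypothetical, patterned after typical pathogen build outputs), a caller opting out of one default path could pass an `!` exclusion pattern, which the `upload-artifact` action supports:

```yaml
with:
  # Hypothetical input name and paths; exclusions use "!" patterns.
  artifact-paths: |
    build.log
    auspice/
    results/
    !results/intermediate/
```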
Force-pushed from c32bdca to 1097fc9
This is looking really good! A couple small fixups I noticed…
```bash
if [[ "$NEXTSTRAIN_BUILD_COMMAND" == "nextstrain build "* ]]; then
    eval "$NEXTSTRAIN_BUILD_COMMAND" | tee build.log
else
    echo "::error::The run input must be a \`nextstrain build\` command"
    exit 1
fi
```
non-blocking

@joverlee521 wrote elsewhere about this bit of code:

> Updated in latest push. I would love to know if there's a better way to run the provided command besides using `eval`.

Instead of `eval` you could directly interpolate at the workflow level (i.e. pre-execution):
```diff
-if [[ "$NEXTSTRAIN_BUILD_COMMAND" == "nextstrain build "* ]]; then
-    eval "$NEXTSTRAIN_BUILD_COMMAND" | tee build.log
-else
-    echo "::error::The run input must be a \`nextstrain build\` command"
-    exit 1
-fi
+${{ inputs.run }} |& tee build.log
```
The indirection of defining `NEXTSTRAIN_BUILD_COMMAND` isn't needed for input robustness since we're running it as a bit of shell anyway. I'd also not worry about whether the input starts with `nextstrain build` or not. Maybe emitting a warning is useful? But I wouldn't enforce a restriction.

Would also direct stderr to the build log (`|&`), not just stdout (`|`).
Ah, that makes sense. Updated in force-push.
Err, slight hiccup with direct interpolation of `${{ inputs.run }}`. This retains newlines at the end of the command, resulting in a syntax error in a test seasonal-flu run. I thought changing the run block style to `>` in 357a79b would replace the newline with a space, but that did not work.

Going back to `eval` again in 8440ef6, and this worked as expected for the test seasonal-flu run.
Ah, yeah. Using the `>` block style won't work because GitHub's `${{ … }}` interpolation is evaluated after parsing the YAML (an ordering which is quite important). An alternative that I think would work is to set up the redirection beforehand rather than as a pipeline, e.g.:

```bash
exec &> >(tee build.log)
${{ inputs.run }}
```

But I think also fine to keep `eval`.
Alternatively alternatively, you could also encase the thing in a subshell:

```bash
( ${{ inputs.run }} ) | tee build.log
```

or group command:

```bash
{ ${{ inputs.run }} ; } | tee build.log
```
(But really fine to keep `eval`.)
The aws-batch runtime requires the secrets:

- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY

They must be defined in the repo's Actions secrets and passed to this workflow with `secrets: inherit`.
non-blocking
We can loosen this restriction in the future by having this workflow automatically arrange for short-lived session keys using the OIDC token exchange.
Ah, we would still have to require an input from the caller workflow for the role-to-assume, but that can just be a plain string input rather than a secret or environment variable.
I was thinking we'd have a single role used by all our AWS Batch builds, so the workflow here could assume it without further input.
Instead of recapitulating the `nextstrain build` options as inputs, use a single run input for callers to provide the full command. This makes for a simpler transition: the full `nextstrain build` command already used locally can be copied/pasted into the automated GH Action. The expected `nextstrain build` commands are also simplified and will no longer need to use the `--exec` option, since the Nextstrain CLI now has the `--env` option to pass environment variables to the build runtime. This allows us to remove the write-envdir script and any references to envdir.
Allow callers to just pass secrets via `secrets: inherit` so that we don't have to maintain a list of allowed secrets. The only required/predefined secrets for setting up the aws-batch runtime are added to the input description. All other secrets for builds can have arbitrary names and can be passed to the build runtime via the `--env` option.
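Putting these pieces together, a caller might look roughly like this sketch (the workflow path, runtime, and command are illustrative, not copied from the PR; the `id-token: write` permission is the one noted earlier for the workflow-context lookup):

```yaml
jobs:
  build:
    permissions:
      id-token: write        # needed for the workflow-context action
    # Illustrative path; use the actual reusable workflow path and ref.
    uses: nextstrain/.github/.github/workflows/pathogen-repo-build.yaml@master
    with:
      runtime: aws-batch
      run: nextstrain build . --env SLACK_TOKEN
    secrets: inherit         # forwards repo secrets wholesale
```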
Force-pushed from 357a79b to 8440ef6
🎉
(Remember to drop that testing-only commit before merge.)
Using `eval` to run the command to avoid potential errors, as we don't want to limit the block scalar styles for the run input. If the run input includes a newline at the end, the direct interpolation of `${{ inputs.run }}` results in a syntax error¹:

```
line 11: syntax error near unexpected token `|&'
```

¹ https://github.com/nextstrain/seasonal-flu/actions/runs/5604334827/jobs/10252159727#step:7:327
Force-pushed from 8440ef6 to 6aed9e2
The entire working directory gets zipped and uploaded to AWS Batch when using the `aws-batch` runtime, which includes an early `build.log` file. When the build completes and downloads the build outputs, the early `build.log` overwrites the local `build.log` of the GH Action. This results in a truncated file that does not include the AWS_BATCH_JOB_ID and prevents the generation of the AWS Batch summary and the re-attachment to the AWS Batch job in subsequent `wait` jobs. This commit moves the hard-coded `build.log` file to an ignored directory (the same directory that we are using for the nextstrain/.github repo¹). This prevents the `build.log` from being uploaded with the build. Previously tried other paths, but they all caused errors:

- "../build.log" - ".." pattern is not allowed for the `upload-artifact` action²
- "~/build.log" - `hashfiles` requires files to be in the `GITHUB_WORKSPACE`³

¹ #44 (comment)
² actions/upload-artifact#176
³ https://docs.github.com/en/actions/learn-github-actions/expressions#hashfiles
Description of proposed changes
Adds a reusable workflow to run Nextstrain builds.
Currently only supports the docker, conda, and aws-batch runtimes because I didn't want to spend the extra time / add complexity to support ambient & singularity runtimes.
I took a lot of pieces from other workflows and modified them into scripts that can be called from within the workflow. If these look good, we can refactor other workflows to use these shared scripts.
Related issue(s)
Alternative of #39
Testing