Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement download and upload operations in TES scheduler via the TES Task runner #232

Merged
merged 23 commits into from
Jun 6, 2023

Conversation

BMurri
Copy link
Collaborator

@BMurri BMurri commented May 26, 2023

fixes #144

I ran both the blobuploadertests and blobdownloadertests and they both passed.

The new process argument skipMissingSources is temporary: it will be retired when we only call the node runner once per task, instead of 3 times.

The three new properties of NodeTask I expect are also temporary: they should be folded into the metadata requirements as the content of the current metrics.txt file is part of that larger effort.

To minimize the effects of this PR, and for simplicity's sake, we are downloading the node task runner for each task. In future work, that will be moved to a permanent start-task operation.

To obtain the self-contained/single-file published build of the node task runner, we are accessing it via a project reference in TesApi.Web. We should consider moving that out of the project and into the build pipeline.

Copy link
Contributor

@giventocode giventocode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the draft PR! Most of my comments circle around two high level issues:

  • Not fully understanding why we need to skip missing files and if that is required, make the implementation upstream in the stack wherever the node task definition is created instead of down the weeds of the runner library.
  • Refactoring and functionality that seems out of scope for the issue.

@BMurri BMurri marked this pull request as ready for review June 1, 2023 00:58
sb.AppendLinuxLine($"write_ts DownloadEnd && \\");
sb.AppendLinuxLine($"chmod -R o+rwx $AZ_BATCH_TASK_WORKING_DIR/wd && \\");
sb.AppendLinuxLine($"export TES_TASK_WD=$AZ_BATCH_TASK_WORKING_DIR/wd && \\");
sb.AppendLinuxLine($"write_ts ExecutorStart && \\");
sb.AppendLinuxLine($"docker run --rm {volumeMountsOption} --entrypoint= --workdir / {executor.Image} {executor.Command[0]} \"{string.Join(" && ", executor.Command.Skip(1))}\" && \\");
sb.AppendLinuxLine($"write_ts ExecutorEnd && \\");
sb.AppendLinuxLine($"write_ts UploadStart && \\");
sb.AppendLinuxLine($"docker run --rm {volumeMountsOption} {MountBlobxferScriptAndMetrics(UploadFilesScriptFileName)} --entrypoint=/bin/sh {blobxferImageName} /{UploadFilesScriptFileName} && \\");
sb.AppendLinuxLine($"./{NodeTaskRunnerFilename} upload --file {UploadFilesScriptFileName} && \\");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A possible simpler implementation would be to create a single node task definition file with the inputs and outputs defined. The download/upload commands will only consider the relevant parts of the definition. Also, if you name the file TesTask.json you don't even need to set the --file option.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are calling this three times instead of two, and because we are calling this as a replacement for blobxfer/wget, not for its ultimate purpose, it makes sense to have the files replace the upload/download scripts for now, to continue to preserve as much of the current structure as possible. When the node task runner agent takes over the primary responsibility for everything this method currently does, that will be the moment to create one file and name it TesTask.json

@ashanhol
Copy link
Contributor

ashanhol commented Jun 5, 2023

[pre-PR reading note]
Based on your description it sounds like there's a couple of follow up items that will come out of this PR around removing or stabilizing functions. Have you created issues for those?

@BMurri
Copy link
Collaborator Author

BMurri commented Jun 5, 2023

[pre-PR reading note] Based on your description it sounds like there's a couple of follow up items that will come out of this PR around removing or stabilizing functions. Have you created issues for those?

Each of those have either already been handled or are #238 (#239). The only other item (that I don't believe is captured above) is a better way to implement directory uploading which we should do when we start adding support for replacing the bash script we create to run in the node with code in this new node task runner and that is captured in #245

@BMurri BMurri requested a review from giventocode June 5, 2023 23:07
@@ -47,7 +48,9 @@ private static async Task ExecuteNodeContainerTaskAsync(FileInfo file, Uri docke
{
try
{
var executor = new Executor(file.FullName, options);
var nodeTask = DeserializeNodeTask(file.FullName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to wrap this with 'Environment.ExpandEnvironmentVariables'? I am not sure if the CLI will pass an expanded path here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the command line option creates the FileInfo object, it is parsed based on the current directory if it was not already and absolute path. That is automatically expanded by the FileInfo object when accessing the FullName property. Since we call the executor in the same directory as the file was dropped by Azure Batch, it expands correctly, There's no environment variable added to this argument in BatchScheduler since everything at this stage is all in that one directory.

@BMurri BMurri merged commit 7a3e9b7 into main Jun 6, 2023
@BMurri BMurri deleted the bmurri/tes-runner-up-down-load branch June 6, 2023 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants