Implement download and upload operations in TES scheduler via the TES Task runner #232

BMurri · 2023-05-26T22:29:50Z

fixes #144

I ran both the blobuploadertests and blobdownloadertests and they both passed.

The new process argument skipMissingSources is temporary: it will be retired when we only call the node runner once per task, instead of 3 times.
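As a rough illustration of what a skip-missing-sources switch could do, here is a minimal sketch. `UploadItem` and `MissingSourceFilter` are hypothetical names invented for this example; they are not types from Tes.Runner, and the actual behavior in the runner may differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for one upload entry; the real NodeTask types are not shown here.
public sealed record UploadItem(string SourcePath, string TargetUrl);

public static class MissingSourceFilter
{
    // When skipMissingSources is true, entries whose local file does not exist are dropped.
    // When it is false, the first missing file raises FileNotFoundException.
    public static IReadOnlyList<UploadItem> Apply(IEnumerable<UploadItem> items, bool skipMissingSources)
    {
        var result = new List<UploadItem>();
        foreach (var item in items)
        {
            if (File.Exists(item.SourcePath))
            {
                result.Add(item);
            }
            else if (!skipMissingSources)
            {
                throw new FileNotFoundException("Upload source is missing.", item.SourcePath);
            }
        }
        return result;
    }
}
```

Because the switch only matters while the runner is invoked separately for download and upload, a filter like this is easy to delete once the runner is called once per task.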

I expect the three new properties of NodeTask to be temporary as well: they should be folded into the metadata requirements, since the content of the current metrics.txt file is part of that larger effort.

To minimize the effects of this PR, and for simplicity's sake, we are downloading the node task runner for each task. In future work, that will be moved to a permanent start-task operation.

To obtain the self-contained/single-file published build of the node task runner, we are accessing it via a project reference in TesApi.Web. We should consider moving that out of the project and into the build pipeline.
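One of the commits in this PR is titled "Upload node task runner to storage if its MD5 does not match". A minimal sketch of that kind of check follows, assuming the storage service reports an MD5 for the existing blob. `RunnerBinaryCheck` and its members are hypothetical names for illustration, not the code in this PR.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class RunnerBinaryCheck
{
    // Computes the MD5 hash of a local file.
    public static byte[] ComputeMd5(string path)
    {
        using var md5 = MD5.Create();
        using var stream = File.OpenRead(path);
        return md5.ComputeHash(stream);
    }

    // storedMd5 is whatever hash storage reports for the existing blob; pass null if there is no blob yet.
    // Returns true when the local binary should be (re)uploaded.
    public static bool NeedsUpload(string localPath, byte[] storedMd5) =>
        storedMd5 is null || !ComputeMd5(localPath).SequenceEqual(storedMd5);
}
```

A check like this lets the API avoid re-uploading an unchanged binary on every start.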


giventocode

Thanks for the draft PR! Most of my comments center on two high-level issues:

I don't fully understand why we need to skip missing files. If that is required, implement it upstream in the stack, wherever the node task definition is created, rather than down in the weeds of the runner library.
Some of the refactoring and functionality seems out of scope for the issue.

src/Tes.Runner/Transfer/BlobDownloader.cs

src/Tes.Runner/Transfer/BlobUploader.cs

src/Tes.Runner/Transfer/PartsProducer.cs

src/Tes.RunnerCLI/Commands/CommandFactory.cs

src/Tes.RunnerCLI/Commands/CommandHandlers.cs

src/Tes.RunnerCLI/Commands/HandlerEx.cs

src/Tes.Runner/Storage/ResolutionPolicyHandler.cs

src/Tes.RunnerCLI/Commands/CommandFactory.cs

src/CommonUtilities/Models/NodeTask.cs

src/Tes.Runner/Storage/ResolutionPolicyHandler.cs

giventocode · 2023-06-01T16:27:53Z

src/TesApi.Web/BatchScheduler.cs

            sb.AppendLinuxLine($"write_ts DownloadEnd && \\");
            sb.AppendLinuxLine($"chmod -R o+rwx $AZ_BATCH_TASK_WORKING_DIR/wd && \\");
            sb.AppendLinuxLine($"export TES_TASK_WD=$AZ_BATCH_TASK_WORKING_DIR/wd && \\");
            sb.AppendLinuxLine($"write_ts ExecutorStart && \\");
            sb.AppendLinuxLine($"docker run --rm {volumeMountsOption} --entrypoint= --workdir / {executor.Image} {executor.Command[0]}  \"{string.Join(" && ", executor.Command.Skip(1))}\" && \\");
            sb.AppendLinuxLine($"write_ts ExecutorEnd && \\");
            sb.AppendLinuxLine($"write_ts UploadStart && \\");
-            sb.AppendLinuxLine($"docker run --rm {volumeMountsOption} {MountBlobxferScriptAndMetrics(UploadFilesScriptFileName)} --entrypoint=/bin/sh {blobxferImageName} /{UploadFilesScriptFileName} && \\");
+            sb.AppendLinuxLine($"./{NodeTaskRunnerFilename} upload --file {UploadFilesScriptFileName} && \\");


A possible simpler implementation would be to create a single node task definition file with the inputs and outputs defined. The download/upload commands will only consider the relevant parts of the definition. Also, if you name the file TesTask.json you don't even need to set the --file option.

BMurri · 2023-06-01

Since we are calling this three times instead of two, and because we are calling this as a replacement for blobxfer/wget rather than for its ultimate purpose, it makes sense to have the files replace the upload/download scripts for now, preserving as much of the current structure as possible. When the node task runner agent takes over the primary responsibility for everything this method currently does, that will be the moment to create one file and name it TesTask.json.
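For reference, the single-definition-file idea discussed in this thread could look roughly like the sketch below: one file lists both inputs and outputs, and each command reads only its own section. The types, property names, and JSON keys are assumptions made for illustration; the real NodeTask model in src/CommonUtilities/Models/NodeTask.cs may be shaped differently.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical shapes for illustration only.
public sealed class FileEntry
{
    public string LocalPath { get; set; } = "";
    public string Url { get; set; } = "";
}

public sealed class NodeTaskSketch
{
    public List<FileEntry> Inputs { get; set; } = new();
    public List<FileEntry> Outputs { get; set; } = new();
}

public static class NodeTaskFile
{
    // Parses one definition that carries both inputs and outputs.
    public static NodeTaskSketch Parse(string json) =>
        JsonSerializer.Deserialize<NodeTaskSketch>(
            json, new JsonSerializerOptions { PropertyNameCaseInsensitive = true })
        ?? throw new InvalidOperationException("Empty node task definition.");

    // "download" looks only at Inputs; "upload" looks only at Outputs.
    public static IReadOnlyList<FileEntry> EntriesFor(NodeTaskSketch task, string command) => command switch
    {
        "download" => task.Inputs,
        "upload" => task.Outputs,
        _ => throw new ArgumentException($"Unknown command: {command}", nameof(command)),
    };
}
```

With a layout like this, the same file can be passed to both commands, which is the property that would make a single default file name workable.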

ashanhol · 2023-06-05T19:37:57Z

[pre-PR reading note]
Based on your description it sounds like there's a couple of follow up items that will come out of this PR around removing or stabilizing functions. Have you created issues for those?

BMurri · 2023-06-05T23:07:11Z

[pre-PR reading note] Based on your description it sounds like there's a couple of follow up items that will come out of this PR around removing or stabilizing functions. Have you created issues for those?

Each of those has either already been handled or is covered by #238 (#239). The only other item, which I don't believe is captured above, is a better way to implement directory uploading. We should do that when we start replacing the bash script we create to run on the node with code in this new node task runner; that is captured in #245.

giventocode · 2023-06-06T20:43:27Z

src/Tes.RunnerCLI/Commands/CommandHandlers.cs

@@ -47,7 +48,9 @@ private static async Task ExecuteNodeContainerTaskAsync(FileInfo file, Uri docke
        {
            try
            {
-                var executor = new Executor(file.FullName, options);
+                var nodeTask = DeserializeNodeTask(file.FullName);


Do we need to wrap this with 'Environment.ExpandEnvironmentVariables'? I am not sure if the CLI will pass an expanded path here?

BMurri · 2023-06-06

When the command-line option creates the FileInfo object, a path that is not already absolute is resolved against the current directory. FileInfo does this automatically when the FullName property is accessed. Since we call the executor in the same directory where Azure Batch dropped the file, it resolves correctly. No environment variable is added to this argument in BatchScheduler, since everything at this stage is in that one directory.
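A small sketch of the FileInfo behavior described here. `PathResolutionDemo` is a hypothetical helper written for this example; only `FileInfo`, `Path`, and `Directory` are real .NET APIs.

```csharp
using System;
using System.IO;

public static class PathResolutionDemo
{
    // FileInfo.FullName resolves a relative name against the process's current directory.
    // It does not expand environment variable references embedded in the string.
    public static string Resolve(string fileArgument) => new FileInfo(fileArgument).FullName;

    // True when a bare file name resolves to <current directory>/<file name>.
    public static bool ResolvesAgainstCurrentDirectory(string fileName) =>
        Resolve(fileName) == Path.Combine(Directory.GetCurrentDirectory(), fileName);
}
```

An extra expansion step would only be needed if the scheduler put a variable reference into the argument itself, which, per the reply, it does not do.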

BMurri added 7 commits May 23, 2023 22:03

Place published Tes.Runner CLI into TesApi.Web's scripts folder during build

c63a976

Fix dropping published binary in non-published builds

78903e3

Upload node task runner to storage if its MD5 does not match

b105e04

Added upload of node task runner and ability to skip missing source files

8d11657

Added ability to record upload/download metrics

79c9326

Move non-temporary arguments to NodeTask

8b7634c

Batch node script changed to call node task runner to perform downloads and uploads

7faf14c

BMurri requested review from giventocode, ashanhol and MattMcL4475 May 26, 2023 22:29

cleanup

24a81ed

BMurri mentioned this pull request May 26, 2023

Remove blobxfer microsoft/CromwellOnAzure#669

Merged

BMurri added 3 commits May 26, 2023 16:09

Mitigate CodeQL build failure

aedaa73

fix path

93bd8a8

Request write SAS in order to write/possibly update runner binary in storage

8ca3ad2

giventocode requested changes May 27, 2023


BMurri added 7 commits May 30, 2023 14:54

Move skipping files to NodeTask's Outputs and fix node runner blob path

e9ca47f

Fix node script and improve ContainerRegistryProvider.IsImagePublic()

b04ad53

Cleanup TesApi.Web project file

1eb7500

Remove memory hints (for now)

c94410c

Prevent delay of starting API while uploading node runner to storage

d8fe560

Attempt to make checking/storing the node task runner binary more robust

05ed99e

Fix tRunner not dropping into final publish position

6ab91ea

giventocode reviewed Jun 1, 2023


BMurri marked this pull request as ready for review June 1, 2023 00:58

Incorporate PR feedback

507a9ad

giventocode reviewed Jun 1, 2023


BMurri added 3 commits June 1, 2023 12:29

Merge branch 'main' into bmurri/tes-runner-up-down-load

7446e89

Move changes to separate PR

ce5a113

Process upload directories

4a894aa

BMurri mentioned this pull request Jun 5, 2023

[#148 Implement TES Task Runner] Implement upload directory handling via an OperationsResolver in the library #245

Closed

BMurri requested a review from giventocode June 5, 2023 23:07

Merge branch 'main' into bmurri/tes-runner-up-down-load

304a967

giventocode approved these changes Jun 6, 2023


BMurri merged commit 7a3e9b7 into main Jun 6, 2023

BMurri deleted the bmurri/tes-runner-up-down-load branch June 6, 2023 20:49
