Parallelize BatchScheduler #142

Closed
wants to merge 87 commits into from
87 commits
6c6345a
Parallelize BatchScheduler
BMurri Mar 21, 2023
7858736
Missed filter
BMurri Mar 21, 2023
1310f1c
Log to debug. These two entries should be removed before merging
BMurri Mar 21, 2023
a0d29fa
Completed making BatchScheduler multi-threaded and consolidated retry
BMurri Mar 23, 2023
bf28454
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Mar 23, 2023
f2ee8ef
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Mar 23, 2023
abf22b2
Added CancellationToken to all places touched, directly or indirectly,
BMurri Mar 24, 2023
5ee850f
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Mar 24, 2023
2a5c7dc
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Mar 31, 2023
27542d9
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 3, 2023
8698ca6
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 3, 2023
e5ab188
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 4, 2023
3382714
Prevent CloudException from failing task during pool creation
BMurri Apr 5, 2023
c07ab9a
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 5, 2023
b3a091a
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 6, 2023
4726da0
Fix incorrect resource cleanup and make code layout more consistent
BMurri Apr 7, 2023
67e4b93
Address issues discovered when investigating failed unit tests
BMurri Apr 7, 2023
81c5671
Fix remaining tests and include exception type in reports
BMurri Apr 7, 2023
f16b05b
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 7, 2023
4cf7cda
Batch is suddenly unable to remove unusable nodes when using ComputeN…
BMurri Apr 10, 2023
989acf2
Don't retain any on-compute-node task artifacts
BMurri Apr 10, 2023
04d7036
Make calling batch api more robust, reduce batch calls for pool
BMurri Apr 11, 2023
9758cbd
Lowercase the hashes in pool/job names
BMurri Apr 11, 2023
995a4f7
minor edits
BMurri Apr 11, 2023
63294c7
Make calling batch api more robust
BMurri Apr 11, 2023
dd999ae
Seeking more clarification on remaining errors
BMurri Apr 11, 2023
c974178
Don't delete unavailable pool until all tasks have been fully processed
BMurri Apr 12, 2023
59c3862
Only check core quotas when they are zero, because auto-scaled pools
BMurri Apr 12, 2023
2995053
Fix tests breakage
BMurri Apr 12, 2023
e0aa687
Remove unused code
BMurri Apr 12, 2023
52e5162
Retry logic on the tes task repository
BMurri Apr 12, 2023
5eda69c
Extend cancelability through IRepository<T>
BMurri Apr 12, 2023
a3eecda
missed cases
BMurri Apr 12, 2023
d7ec484
Increase Repository max retry count
BMurri Apr 12, 2023
d81a0a5
Prevent transient CloudException from failing task during pool/job cr…
BMurri Apr 13, 2023
9d6b64a
whitespace in log
BMurri Apr 13, 2023
95b068f
Batch is suddenly unable to remove unusable nodes when using ComputeN…
BMurri Apr 13, 2023
d12fdd6
Lowercase the hashes in pool/job names
BMurri Apr 13, 2023
4dd90f7
Delay deletion of unavailable pool until all tasks are fully processed
BMurri Apr 13, 2023
0e5f851
When using auto-scaled pools, full core quota checks aren't needed
BMurri Apr 13, 2023
d383804
Make intent clearer
BMurri Apr 13, 2023
f5c53c7
whitespace
BMurri Apr 13, 2023
c5c8e8d
Retry logic for the tes task repository
BMurri Apr 13, 2023
7013d48
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 13, 2023
0a0c71c
Merge branch 'main' into bmurri/prevent-pool-create-cloudexc-failure
BMurri Apr 13, 2023
8790a2a
Merge branch 'main' into more-effective-node-removal
BMurri Apr 13, 2023
ff188c0
Merge branch 'main' into bmurri/pool-job-name-hashes-lowercase
BMurri Apr 13, 2023
4c70c73
Merge branch 'main' into bmurri/delay-pool-deletion-completed-task-pr…
BMurri Apr 13, 2023
30f73b6
Merge branch 'main' into bmurri/minimal-core-quota-checks
BMurri Apr 13, 2023
91dff47
Merge branch 'main' into bmurri/add-retries-to-irepository
BMurri Apr 13, 2023
fb13ffd
Merge branch 'bmurri/prevent-pool-create-cloudexc-failure' into bmurr…
BMurri Apr 13, 2023
993be5a
Merge branch 'more-effective-node-removal' into bmurri/parallelize-sc…
BMurri Apr 13, 2023
88b72eb
Merge branch 'bmurri/pool-job-name-hashes-lowercase' into bmurri/para…
BMurri Apr 13, 2023
eabd04a
Merge branch 'bmurri/delay-pool-deletion-completed-task-processing' i…
BMurri Apr 13, 2023
73da1cf
Merge branch 'bmurri/minimal-core-quota-checks' into bmurri/paralleli…
BMurri Apr 13, 2023
4407e4f
Merge branch 'bmurri/add-retries-to-irepository' into bmurri/parallel…
BMurri Apr 13, 2023
e19ff96
Numerous small bug fixes, dead code removal, formatting consistency, …
BMurri Apr 13, 2023
9055a53
Merge branch 'bmurri/misc-fixes' into bmurri/parallelize-scheduler
BMurri Apr 13, 2023
aa2d150
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 13, 2023
f4a5942
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 13, 2023
75a85cd
Update a couple of values
BMurri Apr 14, 2023
929986a
Clarity
BMurri Apr 14, 2023
3236987
Batch TesTask DB updates
BMurri Apr 14, 2023
7cdbbd0
Change repository cache to cache db items
BMurri Apr 24, 2023
cd3f3e2
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 24, 2023
e4938b3
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 24, 2023
02fd651
Formatting/whitespace and usings
BMurri Apr 25, 2023
90691f5
Finish making continuation token task retrieval work
BMurri Apr 25, 2023
28ae82e
Streamline scheduler task handling
BMurri Apr 25, 2023
1649916
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Apr 26, 2023
a8bd2a8
Various minor issues
BMurri Apr 27, 2023
b31f0f6
Handle ArgumentExceptions thrown in BatchScheduler.WhenEach()
BMurri Apr 28, 2023
015c1d5
Update TesRepositoryLazyCache.cs
BMurri Apr 28, 2023
e2db1c6
Change cache provider
BMurri Apr 28, 2023
7d7b482
minor, yet necessary, fix
BMurri Apr 28, 2023
fcbffd9
Discovered that the caching code wasn't caching
BMurri Apr 29, 2023
5eea94f
Change terminal item expiration
BMurri Apr 29, 2023
03c832f
fix test
BMurri May 1, 2023
fbcbbe4
Merge branch 'bmurri/parallelize-scheduler' of https://github.com/mic…
BMurri May 1, 2023
665d494
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri May 2, 2023
4c9a9c9
Missed edit in merge
BMurri May 2, 2023
ef537ca
Cache optimization
BMurri May 2, 2023
c7970d2
Clean database
BMurri May 2, 2023
7697b5f
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Jun 29, 2023
433d84a
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Aug 29, 2023
8be736e
Merge branch 'main' into bmurri/parallelize-scheduler
BMurri Sep 23, 2023
649d44c
fix whitespace
BMurri Sep 23, 2023
6 changes: 3 additions & 3 deletions .github/workflows/dot_net_format.yml
@@ -4,15 +4,15 @@ jobs:
dotnet-format:
runs-on: ubuntu-latest

steps:
steps:
- name: Azure login
uses: azure/login@v1
with:
creds: '${{ secrets.AZURE_CREDENTIALS }}'

- name: Check out code
uses: actions/checkout@v3.3.0

- name: Remove nuget.config
uses: JesseTG/rm@v1.0.3
with:
6 changes: 3 additions & 3 deletions format/Setup Pre-commit.md
@@ -1,12 +1,12 @@
# Set up Pre-commit
To set up dotnet format to run locally on 'git commit', you need to perform the following actions. You should only need to do this once unless you remove and clone the repo again.
To set up dotnet format to run locally on 'git commit', you need to perform the following actions. You should only need to do this once unless you remove and clone the repo again.

## Git pre-commit hook to reformat
Navigate to GIT-REPO-LOCATION/format. Run the "dotnet-format-pre-commit.cmd" to copy the pre-commit file into the GIT-REPO-LOCATION/.git/hooks directory. Verify that the file has been copied as expected into GIT-REPO-LOCATION/.git/hooks. The cmd script assumes it is being run from the **GIT-REPO-LOCATION/format** directory.

```
cd GIT-REPO-LOCATION/format
.\dotnet-format-pre-commit.cmd
```

After running this, when you go to commit, dotnet format should run automatically and correct any formatting that needs attention.
After running this, when you go to commit, dotnet format should run automatically and correct any formatting that needs attention.
44 changes: 19 additions & 25 deletions src/CommonUtilities/Base32.cs
@@ -7,11 +7,13 @@ namespace CommonUtilities
  {
  public static class Base32
  {
- private static readonly char[] Rfc4648Base32 = new[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '2', '3', '4', '5', '6', '7' };
- private const int GroupBitlength = 5;
  private const int BitsPerByte = 8;
+ private const int GroupBitlength = 5;
  private const int LargestBitPosition = GroupBitlength - 1;

+ private static readonly char[] Rfc4648Base32 = new[] { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '2', '3', '4', '5', '6', '7' };
+ private static readonly string[] Rfc4648Base32Suffix = new[] { string.Empty, @"======", @"====", @"===", @"=" };
+
  /// <summary>
  /// Converts binary to Base32
  /// </summary>
@@ -23,48 +25,40 @@ public static string ConvertToBase32(byte[] bytes) // https://datatracker.ietf.o
  => new string(new BitArray(bytes).Cast<bool>()

  // Reverse each byte's bits to convert the stream from LSB to MSB
- .ConvertGroup(BitsPerByte,
+ .ConvertByBatch(BitsPerByte,
  (bit, _) => bit,
  (bits) => bits.Reverse())
  .SelectMany(b => b)

  // Convert each 5-bit group in the stream into its final character
- .ConvertGroup(GroupBitlength,
+ .ConvertByBatch(GroupBitlength,
  (bit, index) => bit ? 1 << LargestBitPosition - index : 0,
  (values) => Rfc4648Base32[values.Sum()])
  .ToArray())

  // Append suffix
- + (bytes.Length % GroupBitlength) switch
- {
- 0 => string.Empty,
- 1 => @"======",
- 2 => @"====",
- 3 => @"===",
- 4 => @"=",
- _ => throw new InvalidOperationException(), // Keeps the compiler happy.
- };
+ + Rfc4648Base32Suffix[bytes.Length % GroupBitlength];

  /// <summary>
- /// Converts each group (fixed number) of items into a new item
+ /// Converts each batch (fixed number) of items from an enumeration into a new item
  /// </summary>
  /// <typeparam name="TSource">Type of source items</typeparam>
- /// <typeparam name="TGroupItem">Intermediate type</typeparam>
+ /// <typeparam name="TBatchMemberItem">Intermediate type</typeparam>
  /// <typeparam name="TResult">Type of the resultant items</typeparam>
  /// <param name="ts">The source enumerable of type <typeparamref name="TSource"/>.</param>
- /// <param name="itemsPerGroup">The size of each group to create out of the entire enumeration. The last group may be smaller.</param>
- /// <param name="groupMemberFunc">The function that prepares each <typeparamref name="TSource"/> into the value expected by <paramref name="groupResultFunc"/>. Its parameters are an item of type <typeparamref name="TSource"/> and the index of that item (starting from zero) within each group.</param>
- /// <param name="groupResultFunc">The function that creates the <typeparamref name="TResult"/> from each group of <typeparamref name="TGroupItem"/>.</param>
- /// <returns>An enumeration of <typeparamref name="TResult"/> from all of the groups.</returns>
- private static IEnumerable<TResult> ConvertGroup<TSource, TGroupItem, TResult>(
+ /// <param name="itemsPerBatch">The size of each batch to create out of the entire enumeration. The last batch may be smaller.</param>
+ /// <param name="sourceToBatchMemberConverter">The function that prepares each <typeparamref name="TSource"/> into the type expected by <paramref name="batchToResultConverter"/>. Its parameters are an item of type <typeparamref name="TSource"/> and the <see cref="Int32"/> index of that item (starting from zero) within each group.</param>
+ /// <param name="batchToResultConverter">The function that creates the <typeparamref name="TResult"/> from each batch of <typeparamref name="TBatchMemberItem"/>.</param>
+ /// <returns>An enumeration of <typeparamref name="TResult"/> from all of the batches.</returns>
+ public static IEnumerable<TResult> ConvertByBatch<TSource, TBatchMemberItem, TResult>(
  this IEnumerable<TSource> ts,
- int itemsPerGroup,
- Func<TSource, int, TGroupItem> groupMemberFunc,
- Func<IEnumerable<TGroupItem>, TResult> groupResultFunc)
+ int itemsPerBatch,
+ Func<TSource, int, TBatchMemberItem> sourceToBatchMemberConverter,
+ Func<IEnumerable<TBatchMemberItem>, TResult> batchToResultConverter)
  => ts
  .Select((value, index) => (Index: index, Value: value))
- .GroupBy(tuple => tuple.Index / itemsPerGroup)
+ .GroupBy(tuple => tuple.Index / itemsPerBatch)
  .OrderBy(tuple => tuple.Key)
- .Select(items => groupResultFunc(items.Select(i => groupMemberFunc(i.Value, i.Index % itemsPerGroup))));
+ .Select(items => batchToResultConverter(items.Select(i => sourceToBatchMemberConverter(i.Value, i.Index % itemsPerBatch))));
  }
  }
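Note on the Base32 change: the sketch below puts the batching helper and the suffix lookup from this diff into one self-contained file, to show how the two `ConvertByBatch` passes and the padding table fit together. The class name `Base32Sketch`, the string-typed `Alphabet`, and the field name `Suffix` are assumptions made for illustration and are not part of the PR. The explicit parentheses around the shift amount only make the existing operator precedence visible.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class Base32Sketch
{
    private const int BitsPerByte = 8;
    private const int GroupBitlength = 5;
    private const int LargestBitPosition = GroupBitlength - 1;

    // Hypothetical stand-ins for Rfc4648Base32 and Rfc4648Base32Suffix in the diff.
    private const string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    private static readonly string[] Suffix = { string.Empty, "======", "====", "===", "=" };

    // Splits the source into consecutive batches of itemsPerBatch items
    // (the last batch may be smaller) and converts each batch into one result.
    public static IEnumerable<TResult> ConvertByBatch<TSource, TBatchMemberItem, TResult>(
        this IEnumerable<TSource> ts,
        int itemsPerBatch,
        Func<TSource, int, TBatchMemberItem> sourceToBatchMemberConverter,
        Func<IEnumerable<TBatchMemberItem>, TResult> batchToResultConverter)
        => ts
            .Select((value, index) => (Index: index, Value: value))
            .GroupBy(tuple => tuple.Index / itemsPerBatch)
            .OrderBy(group => group.Key)
            .Select(items => batchToResultConverter(
                items.Select(i => sourceToBatchMemberConverter(i.Value, i.Index % itemsPerBatch))));

    public static string ConvertToBase32(byte[] bytes)
        => new string(new BitArray(bytes).Cast<bool>()
            // BitArray yields each byte least-significant bit first; reverse per byte.
            .ConvertByBatch(BitsPerByte, (bit, _) => bit, bits => bits.Reverse())
            .SelectMany(b => b)
            // Weight each bit by its position in the 5-bit group; a short final
            // group is implicitly zero-padded on the right.
            .ConvertByBatch(GroupBitlength,
                (bit, index) => bit ? 1 << (LargestBitPosition - index) : 0,
                values => Alphabet[values.Sum()])
            .ToArray())
        // Padding depends on the input length modulo 5 bytes (40 bits = 8 characters).
        + Suffix[bytes.Length % GroupBitlength];
}
```

With the RFC 4648 test vectors, `Base32Sketch.ConvertToBase32(Encoding.ASCII.GetBytes("foobar"))` should give `MZXW6YTBOI======`, and an empty array should give an empty string. Indexing the suffix array is safe because `bytes.Length % 5` is always in the range 0 to 4, which is what made the old `switch` default arm unreachable.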
8 changes: 6 additions & 2 deletions src/Tes.ApiClients/HttpApiClient.cs
@@ -78,7 +78,8 @@ protected HttpApiClient() { }
  /// <returns></returns>
  protected async Task<HttpResponseMessage> HttpSendRequestWithRetryPolicyAsync(
  Func<HttpRequestMessage> httpRequestFactory, CancellationToken cancellationToken, bool setAuthorizationHeader = false)
- => await cachingRetryHandler.ExecuteWithRetryAsync(async ct =>
+ {
+ return await cachingRetryHandler.ExecuteWithRetryAsync(async ct =>
  {
  var request = httpRequestFactory();
  if (setAuthorizationHeader)
@@ -88,6 +89,7 @@ protected async Task<HttpResponseMessage> HttpSendRequestWithRetryPolicyAsync(

  return await HttpClient.SendAsync(request, ct);
  }, cancellationToken);
+ }

  /// <summary>
  /// Sends a Http Get request to the URL and returns body response as string
@@ -153,13 +155,15 @@ protected async Task<string> HttpGetRequestWithCachingAndRetryPolicyAsync(Uri re
  /// <returns></returns>
  protected async Task<string> HttpGetRequestWithRetryPolicyAsync(Uri requestUrl,
  CancellationToken cancellationToken, bool setAuthorizationHeader = false)
- => await cachingRetryHandler.ExecuteWithRetryAsync(async ct =>
+ {
+ return await cachingRetryHandler.ExecuteWithRetryAsync(async ct =>
  {
  //request must be recreated in every retry.
  var httpRequest = await CreateGetHttpRequest(requestUrl, setAuthorizationHeader, ct);

  return await ExecuteRequestAndReadResponseBodyAsync(httpRequest, ct);
  }, cancellationToken);
+ }

  /// <summary>
  /// Returns a query string key-value, with the value escaped. If the value is null or empty, returns an empty string
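Note on "request must be recreated in every retry": an `HttpRequestMessage` instance cannot be sent a second time through `HttpClient`, which is presumably why both methods above build the request inside the retried delegate. The following is a minimal sketch of that factory-per-attempt pattern using a plain loop rather than the PR's `cachingRetryHandler`. `FlakyHandler`, `RetrySketch`, `SendWithRetryAsync`, the `maxAttempts` parameter, and the stubbed status codes are all hypothetical and exist only so the example runs without a network.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stub: answers 503 for the first two calls, then 200.
public sealed class FlakyHandler : HttpMessageHandler
{
    public int Calls { get; private set; }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Calls++;
        var status = Calls < 3 ? HttpStatusCode.ServiceUnavailable : HttpStatusCode.OK;
        return Task.FromResult(new HttpResponseMessage(status));
    }
}

public static class RetrySketch
{
    // Calls requestFactory once per attempt so that every send gets a fresh
    // HttpRequestMessage; returns the last response when attempts run out.
    public static async Task<HttpResponseMessage> SendWithRetryAsync(
        HttpClient client,
        Func<HttpRequestMessage> requestFactory,
        int maxAttempts,
        CancellationToken cancellationToken)
    {
        for (var attempt = 1; ; attempt++)
        {
            using var request = requestFactory();
            var response = await client.SendAsync(request, cancellationToken);

            if (response.IsSuccessStatusCode || attempt >= maxAttempts)
            {
                return response;
            }

            // Release the failed response before the next attempt.
            response.Dispose();
        }
    }
}
```

A caller would pass something like `() => new HttpRequestMessage(HttpMethod.Get, url)` as the factory. A real policy would normally also wait between attempts and decide which status codes are worth retrying; the sketch leaves both out to keep the factory pattern in focus.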
6 changes: 5 additions & 1 deletion src/Tes/Repository/TesDbContext.cs
@@ -32,7 +32,11 @@ protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
  {
  // use PostgreSQL
  optionsBuilder
- .UseNpgsql(ConnectionString, options => options.MaxBatchSize(1000))
+ .UseNpgsql(ConnectionString, options =>
+ {
+ options.EnableRetryOnFailure();
+ options.MaxBatchSize(1000);
+ })
  .UseLowerCaseNamingConvention();
  }
  }
6 changes: 6 additions & 0 deletions src/Tes/Tes.csproj
@@ -6,6 +6,7 @@

  <ItemGroup>
  <PackageReference Include="EFCore.NamingConventions" Version="7.0.2" />
+ <PackageReference Include="LinqKit.Microsoft.EntityFrameworkCore" Version="7.1.4" />
  <PackageReference Include="Microsoft.EntityFrameworkCore" Version="7.0.3" />
  <PackageReference Include="Microsoft.EntityFrameworkCore.Design" Version="7.0.3">
  <PrivateAssets>all</PrivateAssets>
@@ -19,6 +20,11 @@
  <PackageReference Include="Npgsql.EntityFrameworkCore.PostgreSQL" Version="7.0.3" />
  <PackageReference Include="Polly" Version="7.2.3" />
  <PackageReference Include="Polly.Extensions.Http" Version="3.0.0" />
+ <PackageReference Include="System.Linq.Async" Version="6.0.1" />
  </ItemGroup>
+
+ <ItemGroup>
+ <ProjectReference Include="..\CommonUtilities\CommonUtilities.csproj" />
+ </ItemGroup>

  </Project>