state: avoid infinite recursion in JobStore #1084

radeksimko · 2022-10-07T15:26:44Z

I couldn't come up with a more lightweight test that reproduces the bug. As mentioned in the comment, it (scheduling over 1 million jobs) takes about 3-4 minutes to finish on an M1 Pro and, at peak, consumes a decent chunk of almost all cores plus ~5-6GB of memory.

The Windows GHA environment runs out of memory before the test can even finish. 🙊

There are a few options to consider:

remove the test entirely (some risk that we re-introduce the bug)
gate the test on ENV variable, such as LONG_TESTS=1
gate the test on build tags, such as -tags=longtests

We could also consider skipping the test when the CI ENV variable is non-empty (most CI systems set it, incl. GitHub Actions), but I'd argue that it's probably not that valuable to run the test locally most of the time either.
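For illustration, the ENV-variable option could look roughly like the Go sketch below. The helper names (`longTestSkipReason`, `skipUnlessLongTests`) are hypothetical and not taken from this PR; only the `LONG_TESTS=1` variable comes from the options listed above.

```go
package main

import (
	"fmt"
	"os"
	"testing"
)

// longTestSkipReason returns a non-empty reason when the long-running test
// should be skipped. The lookup parameter has the same shape as os.LookupEnv,
// so the decision can be exercised without touching the real environment.
// (Hypothetical helper, not part of the PR.)
func longTestSkipReason(lookup func(string) (string, bool)) string {
	if v, _ := lookup("LONG_TESTS"); v != "1" {
		return "skipping long test: set LONG_TESTS=1 to run it"
	}
	return ""
}

// skipUnlessLongTests is what a long-running test would call first.
// testing.TB is satisfied by both *testing.T and *testing.B.
func skipUnlessLongTests(t testing.TB) {
	t.Helper()
	if reason := longTestSkipReason(os.LookupEnv); reason != "" {
		t.Skip(reason)
	}
}

func main() {
	// Simulated environments, so the sketch behaves the same on any machine.
	unset := map[string]string{}
	optedIn := map[string]string{"LONG_TESTS": "1"}

	for _, env := range []map[string]string{unset, optedIn} {
		lookup := func(key string) (string, bool) {
			v, ok := env[key]
			return v, ok
		}
		fmt.Printf("skip reason: %q\n", longTestSkipReason(lookup))
	}
}
```

A long-running test would call `skipUnlessLongTests(t)` as its first statement. The build-tag option differs in that the file is excluded at compile time (a `//go:build` constraint at the top of the test file, enabled via `go test -tags=...`), so the test is not even compiled by default.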

dbanck

We (especially you) put quite some work into reducing CI run time over the last couple of weeks. Therefore, I would rather not run this as part of our usual CI pipeline.

What do you think about running this as a nightly job, similar to the benchmarks? That would still ensure that we catch this bug before a release.

radeksimko · 2022-10-10T16:11:08Z

> What do you think about running this as a nightly job, similar to the benchmarks?

We could, but that would have to be limited to Linux and/or macOS, as it's just impossible to run it in the Windows GHA environment.

radeksimko · 2022-10-10T18:19:39Z

@dbanck I have gated the test on the longtest build tag and added a nightly workflow, as suggested. PTAL

dbanck

LGTM! I would just update the name

.github/workflows/nightly-tests.yml

github-actions · 2022-11-11T03:30:05Z

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

radeksimko added the bug label Oct 7, 2022

radeksimko force-pushed the b-fix-jobstore-recursion branch 6 times, most recently from 81aa877 to f52db4a October 7, 2022 16:47

radeksimko marked this pull request as ready for review October 7, 2022 17:23

radeksimko requested a review from a team as a code owner October 7, 2022 17:23

dbanck reviewed Oct 10, 2022

radeksimko force-pushed the b-fix-jobstore-recursion branch 2 times, most recently from 909a2be to f6d7476 October 10, 2022 18:18

radeksimko requested a review from dbanck October 10, 2022 18:19

radeksimko force-pushed the b-fix-jobstore-recursion branch from f6d7476 to 155ded9 October 10, 2022 18:20

radeksimko added this to the v0.29.3 milestone Oct 10, 2022

radeksimko added 3 commits October 10, 2022 19:25

scheduler: add (build tag gated) test to reproduce infinite recursion bug

263710c

state: fix to avoid infinite recursion in JobStore

c719ed4

ci: run long tests nightly

09be5f9

radeksimko force-pushed the b-fix-jobstore-recursion branch from 155ded9 to 09be5f9 October 10, 2022 18:25

dbanck approved these changes Oct 11, 2022

Update .github/workflows/nightly-tests.yml

c8ad71e

Co-authored-by: Daniel Banck <dbanck@users.noreply.github.com>

radeksimko merged commit e8397dc into main Oct 11, 2022

radeksimko deleted the b-fix-jobstore-recursion branch October 11, 2022 09:42

github-actions bot locked as resolved and limited conversation to collaborators Nov 11, 2022
