-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KubernetesJobWatcher no longer inherits from Process #11017
Merged
dimberman
merged 1 commit into
apache:master
from
astronomer:kubernetes-job-watcher-seperate
Sep 18, 2020
Merged
KubernetesJobWatcher no longer inherits from Process #11017
dimberman
merged 1 commit into
apache:master
from
astronomer:kubernetes-job-watcher-seperate
Sep 18, 2020
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
multiprocessing.Process is set up in a very unfortunate manner that pretty much makes it impossible to test a class that inherits from Process or use any of its internal functions. For this reason we decided to seperate the actual process based functionality into a class member
kaxil
approved these changes
Sep 18, 2020
dimberman
added a commit
that referenced
this pull request
Sep 21, 2020
This reverts commit 1539bd0.
dimberman
added a commit
that referenced
this pull request
Sep 21, 2020
potiuk
added a commit
to PolideaInternal/airflow
that referenced
this pull request
Sep 22, 2020
We introduced deletion of the old artifacts as this was a suspected culprit of Kubernetes Job failures. It turned out eventually that those Kubernetes Job failures were caused by the apache#11017 change, but it's good to do housekeeping of the artifacts anyway. The delete workflow action introduced in a hurry had two problems: * it runs for every fork if they sync master. This is a bit too invasive * it fails continuously after 10 - 30 minutes every time as we have too many old artifacts to delete (GitHub has 90 days retention policy so we have likely tens of thousands of artifacts to delete) * it runs every hour and it causes occasional API rate limit exhaution (because we have too many artifacts to loop trough) This PR introduces filtering with the repo, changes frequency of deletion to be 4 times a day and adds script that we are running manualy to delete those excessive artifacts now. Eventually when the number of artifacts goes down the regular job shoul delete maybe few hundreds of artifacts appearing within the 6 hours window and it should stop failing.
potiuk
added a commit
that referenced
this pull request
Sep 22, 2020
We introduced deletion of the old artifacts as this was the suspected culprit of Kubernetes Job failures. It turned out eventually that those Kubernetes Job failures were caused by the #11017 change, but it's good to do housekeeping of the artifacts anyway. The delete workflow action introduced in a hurry had two problems: * it runs for every fork if they sync master. This is a bit too invasive * it fails continuously after 10 - 30 minutes every time as we have too many old artifacts to delete (GitHub has 90 days retention policy so we have likely tens of thousands of artifacts to delete) * it runs every hour and it causes occasional API rate limit exhaustion (because we have too many artifacts to loop trough) This PR introduces filtering with the repo, changes the frequency of deletion to be 4 times a day. Back of the envelope calculation tops 4/day at 2500 artifacts to delete at every run so we have low risk of reaching 5000 API calls/hr rate limit. and adds script that we are running manually to delete those excessive artifacts now. Eventually when the number of artifacts goes down the regular job should delete maybe a few hundreds of artifacts appearing within the 6 hours window in normal circumstances and it should stop failing then.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
area:Scheduler
including HA (high availability) scheduler
provider:cncf-kubernetes
Kubernetes provider related issues
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
multiprocessing.Process is set up in a very unfortunate manner
that pretty much makes it impossible to test a class that inherits from
Process or use any of its internal functions. For this reason we decided
to seperate the actual process based functionality into a class member
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.