Test K3s, Kubernetes, and MicroK8s workflow periodically failing #322

Closed
kate-goldenring opened this issue Apr 26, 2021 · 2 comments
Labels
bug Something isn't working stale

Comments

@kate-goldenring
Contributor

Describe the bug
Akri's workflow that tests Akri on K3s, Kubernetes, and MicroK8s periodically fails because GitHub runners run out of compute resources and the kubelet evicts Pods. The failure rate has been reduced by using release builds in PRs (#301), shrinking the Debug Echo brokers by basing them on nginx:stable-alpine (#301), and lowering the compute resource eviction limits in the MicroK8s kubelet (#301) and the K3s kubelet (#313).

Also, the tests currently use unshared debug echo devices, which shortens each run by 5 minutes (shared devices get a 5-minute offline grace period) and therefore reduces the chance of failure. Once the failures are resolved, the tests should go back to using shared devices, since that is the more complex scenario.

Potential solution
Akri should host its own workflow runners that have more disk and RAM.

GitHub runners have the following hardware specifications for Linux virtual machines:
2-core CPU
7 GB of RAM
14 GB of SSD disk space

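However, the MicroK8s documentation states that "At least 20G of disk space and 4G of memory are recommended." The runners' 14 GB of disk therefore falls short of the recommendation, although their 7 GB of RAM exceeds it.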
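
To make the gap concrete, here is a minimal Python sketch that compares a machine's resources against the MicroK8s recommendation quoted above. The function names (`shortfalls`, `local_resources`) are made up for illustration and are not part of Akri or MicroK8s, and the measurement helper assumes a Linux host.

```python
import os
import shutil

# Recommended minimums quoted from the MicroK8s documentation.
RECOMMENDED_DISK_GB = 20
RECOMMENDED_MEMORY_GB = 4


def shortfalls(disk_gb, memory_gb):
    """Return a list of resources that fall below the recommended minimums."""
    problems = []
    if disk_gb < RECOMMENDED_DISK_GB:
        problems.append(f"disk: {disk_gb:.1f} GB < {RECOMMENDED_DISK_GB} GB")
    if memory_gb < RECOMMENDED_MEMORY_GB:
        problems.append(f"memory: {memory_gb:.1f} GB < {RECOMMENDED_MEMORY_GB} GB")
    return problems


def local_resources(path="/"):
    """Measure total disk size at `path` and total physical memory (Linux only)."""
    disk_gb = shutil.disk_usage(path).total / 1e9
    memory_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    return disk_gb, memory_gb


if __name__ == "__main__":
    # Hosted-runner figures listed in this issue: 14 GB disk, 7 GB RAM.
    print(shortfalls(disk_gb=14, memory_gb=7))
    # Figures for the machine this script runs on.
    print(shortfalls(*local_resources()))
```

Running a check like this as an early workflow step on a candidate self-hosted runner would show whether it actually clears the recommended minimums before the clusters are installed.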

@kate-goldenring kate-goldenring added the bug Something isn't working label Apr 26, 2021
@github-actions
Contributor

github-actions bot commented Sep 3, 2021

Issue has been automatically marked as stale due to inactivity for 45 days. Update the issue to remove label, otherwise it will be automatically closed.

@github-actions github-actions bot added the stale label Sep 3, 2021
@bfjelds
Collaborator

bfjelds commented Nov 2, 2021

Workaround: rerunning the job typically resolves the problem.
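
For anyone who wants to automate the rerun locally, here is a minimal Python sketch of a retry wrapper. The helper name `run_with_retries` and the placeholder command are assumptions for illustration only; the issue does not describe how the tests are invoked. Retrying only masks the resource shortage, it does not fix it.

```python
import subprocess
import sys
import time


def run_with_retries(command, attempts=3, delay_seconds=0.0):
    """Run `command` until it exits with 0 or `attempts` is exhausted.

    Returns the 1-based attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Give the node time to reclaim resources before trying again.
            time.sleep(delay_seconds)
    return None


if __name__ == "__main__":
    # Placeholder command; substitute the real end-to-end test invocation.
    print(run_with_retries([sys.executable, "-c", "raise SystemExit(0)"]))
```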
