Update drain lib to support unreachable nodes #1870

Closed
michaelgugino opened this issue Dec 10, 2019 · 3 comments · Fixed by #2165

@michaelgugino
Contributor

michaelgugino commented Dec 10, 2019

User Story

As an operator, I would like to successfully drain unreachable nodes when deleting them, to ensure PDBs are respected.

Detailed Description

Currently, if an unreachable Node has pods with local storage and those pods pass PDB checks (eviction succeeds), they get a deletion timestamp but are never actually deleted. We should account for this by ignoring such pods after a timeout: upgrade the drain lib to kubernetes/kubectl@846b394 or newer and set the new option "SkipWaitForDeleteTimeoutSeconds" when the node is unreachable. If the node is ready, either leave this option unset or set it to 0 to disable the behavior.

The timeout could be configurable; I suggest 5 minutes as a starting point for now.
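
The proposal above can be sketched in Go roughly as follows. This is a minimal, self-contained illustration, not the actual drain lib integration: `NodeCondition`, `DrainOptions`, `isUnreachable`, and `drainOptionsFor` are hypothetical stand-ins, and treating a Ready condition of Unknown as "unreachable" is an assumption. Only the option name `SkipWaitForDeleteTimeoutSeconds` and the 5-minute starting value come from the description.

```go
package main

import "fmt"

// ConditionStatus and NodeCondition are simplified, hypothetical stand-ins
// for the corresponding Kubernetes node condition types.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

type NodeCondition struct {
	Type   string
	Status ConditionStatus
}

// DrainOptions is a hypothetical stand-in for the drain helper's settings.
// Only the option named in this issue is modeled.
type DrainOptions struct {
	SkipWaitForDeleteTimeoutSeconds int
}

// defaultSkipWaitSeconds is the 5-minute starting point suggested in the issue.
const defaultSkipWaitSeconds = 5 * 60

// isUnreachable assumes a node counts as unreachable when its Ready
// condition is Unknown. A node without a Ready condition is treated as
// reachable here, which is also an assumption.
func isUnreachable(conds []NodeCondition) bool {
	for _, c := range conds {
		if c.Type == "Ready" {
			return c.Status == ConditionUnknown
		}
	}
	return false
}

// drainOptionsFor sets the skip-wait timeout only for unreachable nodes.
// For every other node the option stays 0, which disables the behavior.
func drainOptionsFor(conds []NodeCondition, timeoutSeconds int) DrainOptions {
	opts := DrainOptions{}
	if isUnreachable(conds) {
		opts.SkipWaitForDeleteTimeoutSeconds = timeoutSeconds
	}
	return opts
}

func main() {
	unreachable := []NodeCondition{{Type: "Ready", Status: ConditionUnknown}}
	notReady := []NodeCondition{{Type: "Ready", Status: ConditionFalse}}

	fmt.Println(drainOptionsFor(unreachable, defaultSkipWaitSeconds).SkipWaitForDeleteTimeoutSeconds)
	fmt.Println(drainOptionsFor(notReady, defaultSkipWaitSeconds).SkipWaitForDeleteTimeoutSeconds)
}
```

In a real integration, the computed value would presumably be copied onto the drain helper's options before the drain starts, with the timeout read from configuration instead of a constant.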

Anything else you would like to add:

Steps to replicate current problem:

1. Create a pod with local storage.
2. Disable the kubelet daemon/process on that particular node.
3. Attempt to evict/delete the pod.
4. Pod never goes away.

/kind feature

edit: s/unready/unreachable/

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 10, 2019
@ncdc ncdc added help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Dec 11, 2019
@ncdc ncdc added this to the v0.3.0 milestone Dec 11, 2019
@hypnoglow
Contributor

/assign

@enxebre
Member

enxebre commented Jan 30, 2020

> and implementing the new option "SkipWaitForDeleteTimeoutSeconds" when the node is unready

shouldn't this rather be when node is unreachable?

@michaelgugino
Contributor Author

> > and implementing the new option "SkipWaitForDeleteTimeoutSeconds" when the node is unready
>
> shouldn't this rather be when node is unreachable?

Yes, I think it probably should be unreachable. Unready might mean the node is able to schedule/remove pods normally, but unreachable implies the kubelet is gone/crashed/etc.

@michaelgugino michaelgugino changed the title Update drain lib to support unready nodes Update drain lib to support unreachable nodes Jan 31, 2020