Race condition for helper-pod when multiple PVCs are provisioned #154
Comments
I got two of them with
To fix the race condition caused by reusing the helper pod name. rancher#154 Signed-off-by: Sheng Yang <sheng.yang@rancher.com>
@stefan-kiss Good catch! Can you check if the image
Hello, yes, I tested it and I get correct results. For my project (since I needed to move forward very fast), I used for now the helper pod name (string from config) plus the actual PV name (from the function's arguments), trimmed the result to 63 characters, and removed trailing '-' to be sure I don't hit name/format problems. But this approach (or any other that uses a unique pod per PV) leaves open the possibility of pods being left hanging. There are only a couple of scenarios where this could happen, like the provisioner pod getting killed while the helper pod is up, but it can still have an impact. I don't yet know how to fix this, but I'm thinking of replacing the deferred delete-pod function with a cleanup function that looks for all helper pods (by some label) and cleans them up if they are in 'Completed' or one of the failure statuses. P.S. Thank you for a very fast response!
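For illustration only, here is a minimal Go sketch of the naming scheme described in the comment above, assuming a configured base name and the PV name passed into the provisioning call. The function and identifiers are hypothetical, not the provisioner's actual code, and the '-' removal is interpreted as trimming trailing dashes after the 63-character cut:

```go
package main

import (
	"fmt"
	"strings"
)

// helperPodName (hypothetical) joins the configured helper pod name with the
// PV name, trims the result to the 63-character Kubernetes name limit, and
// strips trailing '-' so the result remains a valid pod name.
func helperPodName(configuredName, pvName string) string {
	name := fmt.Sprintf("%s-%s", configuredName, pvName)
	if len(name) > 63 {
		name = name[:63]
	}
	return strings.TrimRight(name, "-")
}

func main() {
	// Example with a made-up PV name, as generated for a PVC.
	fmt.Println(helperPodName("create-pvc", "pvc-bf3b9f6c-9f12-11ea-8d71-362b9e155667"))
}
```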
@stefan-kiss Labeling the helper pods and cleaning them up once completed sounds like a good idea. Though I probably won't have time to work on it until after Thanksgiving, due to the upcoming KubeCon and other projects. Feel free to raise a PR if you need it sooner.
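As a rough sketch of that label-based cleanup idea (not the provisioner's actual implementation; the label key/value and function name are made up, and a recent client-go API is assumed):

```go
package cleanup

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// cleanupHelperPods lists helper pods by a (hypothetical) label and deletes
// those that have finished, whether they succeeded or failed.
func cleanupHelperPods(ctx context.Context, client kubernetes.Interface, namespace string) error {
	pods, err := client.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{
		LabelSelector: "app=local-path-provisioner-helper",
	})
	if err != nil {
		return err
	}
	for _, pod := range pods.Items {
		// Only remove pods that have already completed or failed.
		if pod.Status.Phase == corev1.PodSucceeded || pod.Status.Phase == corev1.PodFailed {
			if err := client.CoreV1().Pods(namespace).Delete(ctx, pod.Name, metav1.DeleteOptions{}); err != nil {
				return err
			}
		}
	}
	return nil
}
```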
Sure, I'll try. Thanks!
@yasker could you maybe already create a new release containing your fix #155? Otherwise all new users will probably hit the same issue with the current release. Big thanks in advance and for all your great work!
I've just merged #156, which uses the PV name as the suffix for the helper pod. I will leave it there for a couple of days for testing, then I can release v0.0.19.
Hi, I'm also hitting this, so I would like to see 0.0.19 ship when ready. Thanks!
Fixed the helper pod race condition for multiple PVCs. Issue: rancher#154 Signed-off-by: Sheng Yang <sheng.yang@rancher.com>
v0.0.19 has been released: https://github.com/rancher/local-path-provisioner/releases/tag/v0.0.19 Notice that the label-based cleanup mechanism is not included, but since the behavior is otherwise the same as before, we can treat the labeled cleanup as an enhancement and close this bug for now. Feel free to file another bug for it.
Hello. I think I have uncovered a bug.
If you provision multiple PVs in rapid succession, the helper pod will only run for the first one.
"First" here means whichever helper pod gets created first, which may or may not be the one for the PV that should be created.
How to reproduce:
On a single-node Kubernetes cluster, apply pvcs.yaml, then pods.yaml, and then check the /opt/local-path-provisioner path on the node. As you can see, all 3 PVs are created, however only one has the correct set of permissions.
The provisioner logs:
Note the two messages regarding
unable to delete the helper pod
. That's because it wasn't created for them; however, there is no creation error message either. From my understanding, all 3 PV provisioning requests are sent very close together, and since the helper pod is named the same each time, it cannot be created multiple times. The first request that gets through creates the pod; the rest fail silently. I'm not entirely sure why the path on the node exists in any case, since the helper pod does not get called.
However, it's pretty clear that only one helper pod runs at a time, and only one instance of the custom provisioning code (such as the part that sets the permissions) is run.
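To illustrate why the collisions are silent (again only a sketch under assumed names, not the provisioner's actual code): with a fixed pod name, the API server rejects every Create call after the first with an AlreadyExists error, and unless that error is checked and surfaced, the later PVs simply never get a helper pod.

```go
package helper

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createHelperPod (hypothetical) creates the helper pod and makes a name
// collision visible instead of letting it pass silently.
func createHelperPod(ctx context.Context, client kubernetes.Interface, namespace string, pod *corev1.Pod) error {
	_, err := client.CoreV1().Pods(namespace).Create(ctx, pod, metav1.CreateOptions{})
	if apierrors.IsAlreadyExists(err) {
		// With a shared, fixed pod name, concurrent provisioning requests land here.
		return fmt.Errorf("helper pod %q already exists, likely reused by another PV: %w", pod.Name, err)
	}
	return err
}
```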