Implement worker polling backoff for REST API client #370
Comments
OK, both this feature and the gRPC one in #366 need further work. Backoff works as expected when the auth strategy is set to NONE, but when the worker is passed an invalid secret, the backoff actually needs to take place in the token provider code. There is backoff logic in there, but it doesn't have test coverage yet.
OK, here is the problem: the OAuth provider makes a debounced token request and throws asynchronously when credentials are invalid. Currently, this doesn't propagate to the worker polling code that ultimately called the token request. As a result, the backoff is never activated in the exception handler, no response is returned in the success handler, and the polling lock is never released, which stalls the worker perpetually. The token debounce was implemented to stop exactly the same scenario in a different part of the system: misconfigured credentials sent to the token endpoint should not be retried immediately. Subsequent requests should back off instead, to avoid a denial of service (DoS) on the token endpoint. The interaction between the two mechanisms needs to be worked out. Valid credentials may be exchanged for a token, but the polling call may still be denied because the token is not valid for Zeebe. Or the credentials may be invalid and no token is returned. So a worker polling call involves two endpoints that need to back off independently.
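The stall described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SDK's actual code: `PollingWorker`, `getToken`, `activateJobs`, and the backoff fields are hypothetical names. It shows one way to make sure a rejected token request reaches the worker's failure handler and that the polling lock is always released.

```typescript
// Hypothetical sketch: all names and default values here are assumptions.
type TokenProvider = () => Promise<string>;
type JobActivator = (token: string) => Promise<string[]>;

class PollingWorker {
  private pollLock = false;
  private getToken: TokenProvider;
  private activateJobs: JobActivator;
  failureCount = 0;
  backoffMs = 0;
  lastError: unknown = undefined;

  constructor(getToken: TokenProvider, activateJobs: JobActivator) {
    this.getToken = getToken;
    this.activateJobs = activateJobs;
  }

  get isLocked(): boolean {
    return this.pollLock;
  }

  async poll(): Promise<string[]> {
    if (this.pollLock) return []; // a poll is already in flight
    this.pollLock = true;
    try {
      // Awaiting inside the try block means a rejected token request
      // lands in the catch handler instead of being lost.
      const token = await this.getToken();
      const jobs = await this.activateJobs(token);
      this.failureCount = 0;
      this.backoffMs = 0;
      return jobs;
    } catch (error) {
      // Failure path: record the error and grow a bounded linear backoff.
      this.lastError = error;
      this.failureCount += 1;
      this.backoffMs = Math.min(this.failureCount * 1000, 15000);
      return [];
    } finally {
      // Released on success and on failure, so the worker cannot stall.
      this.pollLock = false;
    }
  }
}
```

A caller would wait `backoffMs` before scheduling the next `poll()`. The key design choice is the `finally` block: the lock release no longer depends on which handler runs.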
OK, the two backoffs compound. The token endpoint backoff is linear. The backoff on the worker poll is also linear, but with a steeper curve. In the case where the error is due to not being able to get a token from the token endpoint, the backoff was unbounded. The worker poll backoff is bounded by a configurable setting, and I have now bounded the token endpoint backoff to 15s. So in the case that a token cannot be secured, the following happens: the OAuthTokenProvider throws a failure and starts backing off. The worker catches this throw, throws a polling failure, and starts backing off. The token provider will back off by up to 15s, and the worker will back off by up to 15s (by default). So by default, in a failure state of invalid credentials, the combined backoff will be up to 30s. I have added log warning messages in both the token provider and the worker to let the user know what is going on.
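To make the arithmetic concrete, here is a small sketch of two bounded linear backoffs and their worst-case sum. The slopes (1s and 2s per consecutive failure) and the function names are assumptions for illustration, since the comment does not state them; only the 15s caps come from the discussion above.

```typescript
// Illustrative only: the step sizes are assumed, the 15s caps are from the issue.
const TOKEN_CAP_MS = 15000;
const WORKER_CAP_MS = 15000;

// Linear backoff: grows by stepMs per consecutive failure, clamped to capMs.
function linearBackoffMs(failures: number, stepMs: number, capMs: number): number {
  return Math.min(Math.max(failures, 0) * stepMs, capMs);
}

// When the token request fails, both layers back off, so the delays add up.
function combinedBackoffMs(failures: number): number {
  const tokenDelay = linearBackoffMs(failures, 1000, TOKEN_CAP_MS);
  const workerDelay = linearBackoffMs(failures, 2000, WORKER_CAP_MS); // steeper curve
  return tokenDelay + workerDelay;
}
```

With these assumed slopes, the worker reaches its cap after 8 failures and the token provider after 15, at which point the combined delay stays at 30s.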
A recent customer support issue highlighted the scenario of a worker with expired or misconfigured credentials hosing the gateway with unrelenting poll requests. See #366.
As a consequence, I implemented backoff on 16 UNAUTHENTICATED for the gRPC worker.
This now needs to be implemented for the REST API as well.
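As a rough sketch of what a shared check could look like across both transports: the `PollError` shape and `isAuthFailure` helper below are hypothetical, and treating HTTP 401 as the REST counterpart of gRPC `16 UNAUTHENTICATED` is an assumption about how the REST API reports invalid credentials.

```typescript
// Hypothetical helper: the error shapes are assumptions for illustration.
const GRPC_UNAUTHENTICATED = 16; // gRPC status code named in this issue
const HTTP_UNAUTHORIZED = 401; // assumed REST equivalent

type PollError =
  | { transport: "grpc"; code: number }
  | { transport: "rest"; statusCode: number };

// Returns true when a poll failed for authentication reasons,
// which is the case where the worker should back off.
function isAuthFailure(err: PollError): boolean {
  return err.transport === "grpc"
    ? err.code === GRPC_UNAUTHENTICATED
    : err.statusCode === HTTP_UNAUTHORIZED;
}
```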