KIP-601: Change the authentication timeout to 30 seconds #1332

Closed
aloknnikhil opened this issue Apr 25, 2022 · 8 comments

Comments

@aloknnikhil

Describe the bug
The default SASL authentication timeout is 1 second. This is not a good default, since it results in repeated reconnects until the broker can service the auth request in under 1 second. Can we bump this up to 30 seconds instead? That is more in line with the official Kafka clients.

To Reproduce

  • Attempt to connect to a broker that is heavily loaded / throttled.

Expected behavior

  • The client should wait 30 seconds before timing out an auth request and attempting to connect again. Alternatively, provide a deferred retry approach that starts with a small wait time and waits progressively longer with each retry.

Observed behavior

  • The client waits only 1 second, and any slowness (in the network or the broker) causes the client to attempt another connection.
@Nevon
Collaborator

Nevon commented Apr 26, 2022

What's the property of the corresponding Java client that governs this? I can't find any reference to it in the docs, nor could I find anything for librdkafka.

@lbradstreet

The Kafka Java clients do not have a specific authentication timeout. Instead, we have a connection setup timeout, which covers TCP connection setup as well as the SSL/SASL handshake. The default is 10 seconds, with the connection setup timeout increasing to a maximum of 30 seconds. See:

librdkafka also has a similar config, socket.connection.setup.timeout.ms (https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md), with a default of 30 seconds. This appears to be equivalent to connectionTimeout in KafkaJS (https://kafka.js.org/docs/configuration#connection-timeout), which has a default of 1 second:

connectionTimeout = 1000,

@Nevon
Collaborator

Nevon commented May 2, 2022

socket.connection.setup.timeout.ms would not be equivalent to connectionTimeout, since socket.connection.setup.timeout.ms includes the SASL authentication as well. In KafkaJS, connectionTimeout is just the socket connection timeout - nothing to do with authentication. The equivalent would be connectionTimeout + authenticationTimeout. Both of those currently default to 1000ms, and I do agree that it's probably reasonable to increase the default for authenticationTimeout.
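To make the relationship concrete, here is a minimal sketch of a client configuration that sets both values. `connectionTimeout` and `authenticationTimeout` are the KafkaJS client options discussed in this comment; the broker address, the client ID, and the `setupBudgetMs` helper are hypothetical, and the 1000 ms values are the defaults quoted above.

```javascript
// Hypothetical helper, not part of KafkaJS: the rough equivalent of
// socket.connection.setup.timeout.ms is the socket connection timeout
// plus the SASL authentication timeout.
function setupBudgetMs(config) {
  return config.connectionTimeout + config.authenticationTimeout;
}

// Placeholder values; only the two timeout option names come from KafkaJS.
const clientConfig = {
  clientId: 'example-app',            // hypothetical
  brokers: ['broker-1.example:9092'], // hypothetical
  connectionTimeout: 1000,            // ms, socket connection only
  authenticationTimeout: 1000,        // ms, SASL authentication only
};

console.log(setupBudgetMs(clientConfig)); // 2000

// With kafkajs installed, the object would be passed to the client constructor:
// const { Kafka } = require('kafkajs');
// const kafka = new Kafka(clientConfig);
```

Raising `authenticationTimeout` alone lengthens the overall budget without changing how long the client waits for the socket itself.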

@aloknnikhil
Author

Yes. I think we can start by bumping that up. If you think implementing a deferred retry policy is easy, we can start with a smaller timeout of say 5 seconds and then gradually increase it up to 30 seconds. But if a static retry policy is easier, let's do 30 seconds. Wdyt @Nevon?
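As a rough illustration of the deferred policy described here, the sketch below computes a per-attempt authentication timeout that starts at 5 seconds and is capped at 30 seconds. The `authTimeoutForAttempt` helper is hypothetical and not part of KafkaJS, and the doubling factor is an assumption; the comment only says the timeout should increase gradually.

```javascript
// Hypothetical helper, not part of KafkaJS: returns the authentication
// timeout (ms) for a given 0-based attempt, doubling from `initialMs`
// and never exceeding `maxMs`.
function authTimeoutForAttempt(attempt, initialMs = 5000, maxMs = 30000) {
  return Math.min(initialMs * 2 ** attempt, maxMs);
}

// Timeouts for the first four attempts.
const schedule = [0, 1, 2, 3].map((attempt) => authTimeoutForAttempt(attempt));
console.log(schedule); // [ 5000, 10000, 20000, 30000 ]
```

A static policy is the degenerate case where `initialMs` equals `maxMs`.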

@Nevon Nevon changed the title Change the authentication timeout to 30 seconds KIP-601: Change the authentication timeout to 30 seconds May 3, 2022
@Nevon
Collaborator

Nevon commented May 3, 2022

I actually already bumped it to a static 10s, and forgot to reference this issue. My reasoning is in #1340, but essentially if we set it to 30s now it'll be more difficult to implement a retry policy later, because we'd need to lower the initial timeout value, which could cause issues for some folks. We also have to factor in the connectionTimeout, so in reality the total timeout value is 11s by default (1s connectionTimeout + 10s authenticationTimeout).

I'm gearing up to release v2.0.0, so I didn't want to introduce a major change at this point, like a retry policy with exponentially longer timeouts, so this seemed like a reasonable compromise. Do you agree that this issue can be closed with that change?

@ijuma

ijuma commented Dec 21, 2022

@Nevon I think you can close this issue given the increase from 1s to 10s.

@Nevon Nevon closed this as completed Dec 21, 2022
@ronakheliwal

ronakheliwal commented Aug 24, 2023

Hi @Nevon, we are using kafkajs with Confluent Cloud. They have cluster roll activities where they might take a broker out of the cluster for maintenance.

During one such cluster roll we saw a lot of the following errors:

  • Connection error: Client network socket disconnected before secure TLS connection was established
  • Failed to connect to seed broker, trying another broker from the list: Connection timeout

As per the mechanics of a roll, the connections to the broker being bounced were dropped
(which is expected). Subsequently, coupled with the high connection count and low client side
connection timeout, the cloud cluster started experiencing a connection storm (an influx of
backlogged connections that overwhelmed the brokers).

We are using the latest version, 2.2.4, with the default values of connectionTimeout=1sec and authenticationTimeout=10sec.
Confluent support is suggesting a connection establishment timeout of 30sec.

Since kafkajs has separate timeouts for connection and authentication, what values of connectionTimeout and authenticationTimeout do you suggest for an overall connection establishment timeout of 30sec?

Looking forward to your suggestions here. Thanks in advance :)
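The constraint in this question can be written down as a small sketch: the two values only need to sum to the overall budget. The `splitSetupBudget` helper is hypothetical, and the one-third share given to the socket connection is an arbitrary assumption for illustration, not a recommendation from the maintainers or from Confluent.

```javascript
// Hypothetical helper, not part of KafkaJS: split a total
// connection-establishment budget (ms) between the two client options.
// `connectionShare` is the fraction given to the socket connection.
function splitSetupBudget(totalMs, connectionShare = 1 / 3) {
  const connectionTimeout = Math.round(totalMs * connectionShare);
  return {
    connectionTimeout,
    // Whatever is left of the budget goes to SASL authentication.
    authenticationTimeout: totalMs - connectionTimeout,
  };
}

console.log(splitSetupBudget(30000));
// { connectionTimeout: 10000, authenticationTimeout: 20000 }
```

The resulting object can be spread into a KafkaJS client configuration alongside `clientId` and `brokers`.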

@0xPT

0xPT commented Sep 10, 2024

You guys should update the docs to reflect the updated information.

https://kafka.js.org/docs/configuration#connection-timeout

6 participants