[Improvement] Race condition for being selectable when some nodes in the cluster are down #2520
Matiszak added a commit to Matiszak/StackExchange.Redis that referenced this issue (Aug 16, 2023):
…scovery (StackExchange#2520) Sending tracers is not necessary because we just connected to the nodes a few seconds ago. It causes problems because sent tracers do not have enough time to respond.
I made PR #2525, which seems to fix the problem perfectly in my case. Additionally, I found two more scenarios that manifest this issue:
Context:
We have a cluster of 3 masters and 3 replicas (1 replica per master). We are simulating a situation in which part of the cluster is down (in this example we shut down the 3 replicas). Our .NET clients have all 6 nodes in their configuration, and the option "resolveDns=true" is set. (resolveDns is not crucial here; the nodes know each other by IP, so I set it to avoid duplicate connections by hostname and by IP after discovery.)
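For illustration, a client configuration string matching the setup described above might look roughly like the following. This is only a sketch: the host names and ports are hypothetical placeholders, and connectTimeout is written out explicitly even though 5000 ms is already the default.

```
redis-node-1:6379,redis-node-2:6379,redis-node-3:6379,redis-node-4:6379,redis-node-5:6379,redis-node-6:6379,resolveDns=true,connectTimeout=5000
```

All six endpoints are listed, including the three replicas that are shut down during the test, so the client attempts to connect to the unresponsive nodes as well.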
When some of the nodes in the cluster are down, they are unresponsive, so they exhaust the default connection timeout of 5000 milliseconds. They exhaust even much higher connection timeouts (tens of seconds). In previous versions of StackExchange.Redis (<2.2.xx) this behavior caused problems: after .Connect(), some of the nodes were randomly unselectable with reason DidNotRespond. Upgrading to 2.6.xx mostly solved this (for which I'm glad) thanks to the queueing feature. However, I noticed that the underlying issue is still there and might cause problems in some scenarios, which is why I'm posting this. I found one use case in which it manifests, but you might be aware of more.
Issue:
sendTracerIfConnected: true in server.OnConnectedAsync(log, sendTracerIfConnected: true, autoConfigureIfConnected: reconfigureAll);
Reproduction code:
Exception:
Log: