Detecting cluster failover by application #13359

Closed Locked Answered by michaelklishin
dev4342345235 asked this question in Questions
@dev4342345235 none of those. GET /api/aliveness-test/{vhost} is a no-op. The other two health checks have nothing to do with whether nodes are up or not, as their descriptions suggest.

When a majority of nodes is down, all quorum queue and stream operations will fail, and so will nearly all client operations in general when Khepri is used. This should be a good enough indication.
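To make "operations will fail" concrete, here is a minimal Python sketch of how an application could notice a sustained streak of failed operations without polling cluster state. It is only an illustration: `FailureWindow`, `attempt`, and the threshold values are hypothetical names and numbers, not part of any RabbitMQ client library, and `operation` stands in for whatever publish or consume call your client exposes.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FailureWindow:
    """Tracks a streak of consecutive failures (hypothetical helper)."""
    threshold: int = 5                 # assumed: failures needed before flagging
    min_duration: float = 30.0         # assumed: seconds the streak must last
    clock: Callable[[], float] = time.monotonic
    _streak: int = field(default=0, init=False)
    _first_failure_at: Optional[float] = field(default=None, init=False)

    def record_success(self) -> None:
        # Any successful operation ends the streak.
        self._streak = 0
        self._first_failure_at = None

    def record_failure(self) -> None:
        # Remember when the current streak started.
        if self._streak == 0:
            self._first_failure_at = self.clock()
        self._streak += 1

    def looks_unavailable(self) -> bool:
        # True only if failures are both numerous and sustained over time.
        if self._streak < self.threshold or self._first_failure_at is None:
            return False
        return self.clock() - self._first_failure_at >= self.min_duration


def attempt(window: FailureWindow, operation: Callable[[], None]) -> bool:
    """Run one client operation and record its outcome in the window."""
    try:
        operation()
    except Exception:
        window.record_failure()
        return False
    window.record_success()
    return True
```

Requiring both a count and a duration is a design choice to avoid treating a brief network blip, which ordinary connection recovery handles, as a lost majority. What to do once `looks_unavailable()` returns true (alerting, or switching clusters) is left open here.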

I don't subscribe to the opinion that applications should be monitoring cluster state. Monitoring systems should. Having a majority of nodes down and switching between clusters is not something you will do every day; it's not at all comparable to client connection recovery.

That's a job for the dedicated monito…

Replies: 1 comment 4 replies

Answer selected by dev4342345235