Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[alertmanager] Add config to disable clustering #460

Merged
merged 3 commits into from
Apr 19, 2023
Merged

[alertmanager] Add config to disable clustering #460

merged 3 commits into from
Apr 19, 2023

Conversation

ionphractal
Copy link
Contributor

When only one alertmanager instance is deployed, alertmanager_cluster_messages_queued will rise over time until it hits 4k queue limit resulting in dropping alerts (prometheus/alertmanager#1814). This PR enables operators to disable clustering for alertmanager.

@psycofdj
Copy link
Contributor

Thanks @ionphractal for your PR.

The alertmanager documentation states that empty string should be given to --cluster.listen-address to disable clustering.

Could you rework this PR in order to stick with this approach ? Remove the introduced alertmanager.cluster.enabled and rework alertmanager_ctl template to test for emptyness of alertmanager.cluster.listen_address property.

Also, the tip given by sylr in prometheus/alertmanager#1814 looks suspicious.

Given the following code in alertmanager:

https://github.com/prometheus/alertmanager/blob/258fab7cdd551f2cf251ed0348f0ad7289aee789/cmd/alertmanager/main.go#L255-L270

There should be no difference between --cluster.listen-address= and --cluster.listen-address="". Have you experienced any difference between those ?

@benjaminguttmann-avtq any thoughts on this ?

@ionphractal
Copy link
Contributor Author

Technically it should not make any difference because of the equal "=" sign followed by a space, but for clarity I added the double quotes. I tested it in our system that it works that way as well.

Regarding the alertmanager.cluster.enabled: I chose this approach as this is imho more descriptive than setting alertmanager.cluster.listen_address to empty. But I have no strong opinion on it, so if you like to have it with an empty listen_address, let me know and I'll change it accordingly. @psycofdj

@psycofdj
Copy link
Contributor

@ionphractal normally, I would agree with you. I 100% prefer a clear disable option than relying on a particular value that encodes this behaviour.

However, bosh releases are not software per say but only a packaging system. In this particular case, I prefer transparency and stick with design decisions made by underlying software when possible (even if I consider these as "poor" decisions).

Since both approaches has pros and cons, lets settle this by asking @benjaminguttmann-avtq his opinion and go in his direction.

@benjaminguttmann-avtq
Copy link
Contributor

I can understand both arguments but I would prefer the clear 'enabled' statement then having an empty string, because that could in theory happen by accident whereas the other one is a clear decision.
@ionphractal maybe you can add an ops file as well that can be used to disable the cluster mode for people to easily use that?

@psycofdj
Copy link
Contributor

so be it, lets go for alertmanager.cluster.enabled and an ops-file to disable clustering

@ionphractal
Copy link
Contributor Author

Here you go :)

@psycofdj
Copy link
Contributor

thanks @ionphractal

@psycofdj psycofdj merged commit a79d31a into cloudfoundry:master Apr 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants