Reconsider design of restart events issued when rolling Kafka Pods #10958
Comments
I have never used them directly, but my guess is that Kube events are useful for some users (as you said, we have some, even if not that many). Even the idea about integrating self-healing could rely on events in the future.
I was thinking of the Kafka, KafkaConnect, KafkaMirrorMaker2, etc. resources.
I would also vote for option 1 of having them use the custom resource as the main object they reference. I think it is useful to be able to see the restart as an event rather than having to always look through the operator logs.
Triaged on 9.1.2025: We should go with option 1 and keep this issue open.
I've started taking a look at this and as a result have changed my mind on the approach. The suggestion was to change the main object the events reference from the Pod to the custom resource. The fields for the Event API are described in the Kubernetes docs; today we populate a subset of them.

Given those fields it is possible to filter events. For example, I was able to filter the restart events from the command line.
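As a sketch of the kind of filtering that works (the reason string and reporting component value below are assumptions, so check the events actually emitted in your cluster):

```sh
# Restart events are currently attached to Pods, so filter on the Pod kind.
# Note: the events.k8s.io/v1 field 'reportingController' surfaces as
# 'reportingComponent' when read through the core v1 events API.
kubectl get events \
  --field-selector involvedObject.kind=Pod,reportingComponent=strimzi.io/cluster-operator

# Or filter on a specific restart reason (the reason string is an assumption):
kubectl get events --field-selector reason=PodHasOldRevision
```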
I am unsure why the field comes back under a different name in the output. Although I do think that changing the main object the events reference would still be worthwhile.
I guess that it was proposed to use the `regarding` field for the custom resource, and putting the Pod into the `related` field. Maybe we could stick with that approach.
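For illustration, a minimal sketch of what an events.k8s.io/v1 Event could look like with that split; every name and value here is hypothetical, not something the operator emits today:

```yaml
apiVersion: events.k8s.io/v1
kind: Event
metadata:
  generateName: my-cluster-restart-
  namespace: myproject
# Main object: the custom resource the event is about.
regarding:
  apiVersion: kafka.strimzi.io/v1beta2
  kind: Kafka
  name: my-cluster
  namespace: myproject
# Secondary object: the Pod that was actually rolled.
related:
  apiVersion: v1
  kind: Pod
  name: my-cluster-kafka-0
  namespace: myproject
reason: PodHasOldRevision          # hypothetical reason string
action: RollingUpdate              # hypothetical action string
type: Normal
note: Pod was restarted because it had an old revision.
reportingController: strimzi.io/cluster-operator   # assumed controller name
reportingInstance: strimzi-cluster-operator-pod    # placeholder instance name
eventTime: "2025-01-09T10:00:00.000000Z"
```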
The Event API has some additional fields as well. Indeed, not all of the fields are filterable. But for me this is mainly a question of where I think people are looking -> are people really interested in the Pod level? Are they filtering for events of typically at least 6 different pods (3 controllers + 3 brokers)? Or are they more interested in what is happening with the operand itself, which might include events for all Pods and more?
Just to mention, with the UTO proposal we decided to get rid of all KafkaTopic events and no one complained. The reasoning was that most people use the resource status, operator logs, and metrics.
@fvaleri Yeah, I know. But for the broker pods, there are some people using them. Likely not many, but we get questions, complaints, and so on about them. So they are not completely unused. That said, the main complaint is that most of the restart events are "Pod revision changed", which is not a good user-facing restart reason, and of course the possible change I suggested here does absolutely nothing about that.
Currently, when the Kafka pods are rolled, we issue Kubernetes Events describing the reason for the restart. This is done only for the Kafka, Connect, and MM2 node restarts. The events are issued with the Pods as the main objects.

This approach has several issues. For example, many restarts are reported as `Pod has old revision`, which means that the Pod definition has changed -> but the root cause of the change could be, for example, an updated listener certificate or something similar.

I think we should consider the future of the events used by the operator, and two options for how to deal with them come to my mind. Option 1 is to issue the events with the custom resource as the main object (the `regarding` field). That would make it easier to find the events, as the custom resource will have only our events and not the events related to the Pod lifecycle. The Pod might be referenced as the `related` resource if needed. Issuing the events to the custom resource might also make it easier to consider other situations when we might want to issue events.
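As a rough illustration of why option 1 would make the events easier to find, a single field selector on the custom resource would then return all operator events for a cluster (the kind and name values below are placeholders for this sketch):

```sh
# Events created via events.k8s.io/v1 expose their 'regarding' object as
# 'involvedObject' when listed through the core v1 events API, so one
# selector would cover the whole cluster ("my-cluster" is a placeholder).
kubectl get events --field-selector involvedObject.kind=Kafka,involvedObject.name=my-cluster
```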