Emit failure resolve event #604
Conversation
This should improve the user experience when facing intermittent issues (e.g. during GitHub clone operations). Nice work @darkowlzz
LGTM
That screenshot is confusing: why is the error shown twice? Are we sending two events now instead of one?
Maybe it could be something like:
The error in the normal event is there to show what the failure was before, in case the recovery took place after some time, giving the user some context about what failed. But now I don't think it's needed. It would be simpler to just say something like:
Happy to have some better messaging.
This event is emitted in
I'm not sure we should inject the previous failure into the metadata. Also, do we send a second message after this with the revision? Does this change mean people will receive two info messages every time SC recovers? For example, right now SC errors out for each HelmChart at startup; I have 10 charts, so 10 error messages and 10 success messages at every upgrade. Will I receive 30 messages every time after this change?
Yes, it'll result in double notifications every time there's a failure.
Force-pushed from ce5f508 to b18a7d2
LGTM
Thanks @darkowlzz
Force-pushed from b18a7d2 to 5921dbf
Force-pushed from 5921dbf to 2a9ac47
When a source fetch failure happens, a warning event is emitted, but when the failure is resolved, no resolved event is emitted. This change introduces a new ResultProcessor, NotifySuccess, for the SummarizeAndPatch helper, which consolidates all the messages to be emitted into a single event. It checks for any recovery from a failure condition and emits an event about the recovery. When there is both a failure recovery and a new artifact, the success event message mentions only the new artifact; the recovery is implied. This avoids mixing event types, keeping the emitted events friendly to Kubernetes-native Events and their event counters.

The gitrepo reconciler is modified to use the new NotifySuccess ResultProcessor. A change in the artifact is checked in reconcile() and a Notification object is returned, populated with the success information. When the Notification is empty/zero and there is no failure recovery, no event is emitted. A failure recovery without any Notification results in an event carrying only the recovery information.

Introduce a new condition, "StorageOperationFailed", for use with any failures due to storage operations. With this, every failure is associated with some condition, which is used to detect a failure recovery and report it as a notification.

Signed-off-by: Sunny <darkowlzz@protonmail.com>
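For illustration, such a processor could look roughly like the sketch below. This is not the actual source-controller code: the `Notification` type, the `NotifySuccess` signature, the hard-coded condition name, and the event messages are all assumptions made for this example.

```go
package summarize

import (
	corev1 "k8s.io/api/core/v1"
	kuberecorder "k8s.io/client-go/tools/record"

	"github.com/fluxcd/pkg/apis/meta"
	"github.com/fluxcd/pkg/runtime/conditions"
)

// Notification carries the success message assembled in reconcile().
// A zero Notification with no failure recovery emits nothing.
type Notification struct {
	Reason  string
	Message string
}

// NotifySuccess consolidates the new-artifact and failure-recovery
// messages into a single Normal event (names are illustrative).
func NotifySuccess(recorder kuberecorder.EventRecorder, oldObj, newObj conditions.Getter, n Notification) {
	// A recovery means the previous status carried the failure
	// condition and the current status no longer does.
	recovered := conditions.Has(oldObj, "StorageOperationFailed") &&
		!conditions.Has(newObj, "StorageOperationFailed")

	switch {
	case n.Message != "":
		// A new artifact implies a successful reconciliation; the
		// recovery is implied, so only the artifact message is sent.
		recorder.Event(newObj, corev1.EventTypeNormal, n.Reason, n.Message)
	case recovered:
		recorder.Event(newObj, corev1.EventTypeNormal, meta.SucceededReason,
			"resolved storage operation failure")
	}
}
```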
Update bucket, helmrepo and helmchart reconcilers to use NotifySuccess ResultProcessor and StorageOperationFailedCondition. Signed-off-by: Sunny <darkowlzz@protonmail.com>
Since the old version of the object is now kept in a separate variable, the old object can be passed to the summarize helper to create the patch helper. Signed-off-by: Sunny <darkowlzz@protonmail.com>
Replace `StorageOperationFailedReason` with granular reasons for the failure. Update docs with `StorageOperationFailedCondition`. Signed-off-by: Sunny <darkowlzz@protonmail.com>
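As a hedged sketch, the API constants might be declared along these lines; the specific granular reason names below are assumptions for illustration, not necessarily the ones this commit introduces:

```go
package v1beta1

const (
	// StorageOperationFailedCondition indicates a failure in a storage
	// operation, e.g. while creating a directory, archiving an artifact
	// or updating a symlink.
	StorageOperationFailedCondition string = "StorageOperationFailed"

	// Granular reasons attached to StorageOperationFailedCondition,
	// replacing the single StorageOperationFailedReason (illustrative
	// names only).
	DirCreationFailedReason      string = "DirectoryCreationFailed"
	ArchiveOperationFailedReason string = "ArchiveOperationFailed"
	SymlinkUpdateFailedReason    string = "SymlinkUpdateFailed"
)
```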
Force-pushed from f13b1a1 to 73d8dff
Summarizing how things eventually changed...
@stefanprodan please have a look at this and see if you agree with it from a UX perspective. I directed the changes this way because I felt that "polluting" the event data with conditional strings would not be beneficial to native Kubernetes Events; it causes issues, for example, with keeping counts of repetitive messages. By emitting the same event again after "recovering" from a failure, we still produce a signal for the user while allowing Kubernetes to keep count.
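To illustrate the counting point: client-go's event correlator deduplicates events that are identical across object, type, reason and message, incrementing the existing Event's count instead of creating a new object. The function name and message string below are made up for this sketch.

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	kuberecorder "k8s.io/client-go/tools/record"
)

// emitTwice shows why stable messages are counter-friendly: repeating
// an identical event bumps the recorded Event's count, while embedding
// variable detail (such as the previous error) in the message would
// produce a distinct Event on every occurrence.
func emitTwice(recorder kuberecorder.EventRecorder, obj runtime.Object) {
	msg := "stored artifact for revision 'main/<digest>'" // made-up message
	recorder.Event(obj, corev1.EventTypeNormal, "Succeeded", msg) // count: 1
	recorder.Event(obj, corev1.EventTypeNormal, "Succeeded", msg) // deduplicated, count: 2
}
```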
Putting this on hold for now. I would like to implement it in another way. |
#624 is ready: a new, simpler implementation. Closing this.
When a source reconciliation failure happens, a warning event is emitted, but when the failure is resolved, no resolved event is emitted, leaving the user uninformed about the status of the operation.

This change checks the object status for failure conditions and, on a successful reconciliation, emits a normal event to notify that the failure is resolved. If a failure is resolved along with a new artifact, a "stored artifact" event is emitted and the failure-resolved message is suppressed; a new artifact event implies that the reconciliation was successful.
When using notification-controller, this appears as: [screenshot omitted]
Console logs: [output omitted]
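A minimal sketch of that check, assuming a helper that compares the object's Ready condition before and after reconciliation; the helper name and the event message are assumptions, not the actual implementation:

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	kuberecorder "k8s.io/client-go/tools/record"

	"github.com/fluxcd/pkg/apis/meta"
	"github.com/fluxcd/pkg/runtime/conditions"
)

// notifyFailureResolved emits a Normal event when a previously failing
// object becomes Ready again. If a new artifact was stored in the same
// reconciliation, the caller would emit the "stored artifact" event
// instead, since a new artifact implies success.
func notifyFailureResolved(recorder kuberecorder.EventRecorder, oldObj, newObj conditions.Getter) {
	if !conditions.IsReady(oldObj) && conditions.IsReady(newObj) {
		recorder.Event(newObj, corev1.EventTypeNormal, meta.SucceededReason,
			"reconciliation failure resolved")
	}
}
```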