-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make disk full errors more prominent #330
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
return stdout, err | ||
} | ||
r.VRec.Eventf(r.Vdb, corev1.EventTypeNormal, events.NodeRestartSucceeded, | ||
"Successfully called 'admintools -t restart_node' and it took %ds", int(elapsedTimeInSeconds)) | ||
return stdout, nil | ||
} | ||
|
||
// logATFailureEvent will log k8s events for the admintools restart error. | ||
func (r *RestartReconciler) logATFailureEvent(atCmd, op string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you going to use this for other admintools commands? Like if there is a need to have the same kind of function for another reconciler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a bunch of parsing we do for create_db. I debated about throwing everything into a new package that they both use. Do you think I should do that cleanup? It will probably make things easier when we move away from admintools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I do like that idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you going to do that on this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I'll include it in this PR. I'm working on this change now.
VRec *VerticaDBReconciler | ||
Vdb *vapi.VerticaDB | ||
PFacts *PodFacts | ||
NumEvents int // Number of events written |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you use this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for the unit test to prove that we recorded the correct number of events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
ctrl "sigs.k8s.io/controller-runtime" | ||
) | ||
|
||
type ATErrors struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a comment for this struct? Same for some functions below.
pkg/mgmterrors/aterrors.go
Outdated
RestartNodesNotDownRequeueWaitTimeInSeconds = 10 | ||
) | ||
|
||
// LogFailureError is called when admintools had attempted an option but |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function's name is LogFailure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. This was an early name of this function. I forget to change this.
This change makes disk full errors more prominent in the operator. The focus is on checking the disk space for the local PV. We handle this with a couple of strategies:
I did the second approach because there are some times when vertica may have trouble starting and the error doesn't surface in the admintools output.