Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose time range options for log collection in VerticaScrutinize CR #885

Merged
merged 33 commits into from
Sep 6, 2024

Conversation

qindotguan
Copy link
Collaborator

This PR defines 3 new parameters in the VerticaScrutinize CR which allow the collection of an arbitrary time range of logs:

  • logAgeHours: maximum age in hours of logs
  • logAgeOldestTime: maximum timestamp for logs
  • logAgeNewestTime: minimum timestamp for logs

Webhook added to validate:

  • logAgeHours cannot be set alongside logAgeOldestTime and logAgeNewestTime.
  • logAgeOldestTime should not be ahead of the current time.
  • logAgeOldestTime should be ahead of logAgeNewestTime.
  • logAgeOldestTime and logAgeNewestTime should be formatted as YYYY-MM-DD HH [+/-XX]

Comment on lines 48 to 70
// +kubebuilder:validation:Optional
// +operator-sdk:csv:customresourcedefinitions:type=spec
// In order to facilitate diagnosing less recent problems, scrutinize should be able to collect an arbitrary time range of logs.
// With no options set, archived logs gathered for each node should be no older than 24 hours, and as recent as now.
// With the oldest time param or log age set, no archives of vertica.log should be older than that time.
// Timestamps should be formatted as: YYYY-MM-DD HH [+/-X], where [] is optional and +X represents X hours ahead of UTC.
LogAgeOldestTime string `json:"logAgeOldestTime,omitempty"`

// +kubebuilder:validation:Optional
// +operator-sdk:csv:customresourcedefinitions:type=spec
// In order to facilitate diagnosing less recent problems, scrutinize should be able to collect an arbitrary time range of logs.
// With no options set, archived logs gathered for each node should be no older than 24 hours, and as recent as now.
// With the newest time param set, no archives of vertica.log should be newer than that time.
// Timestamps should be formatted as: YYYY-MM-DD HH [+/-X], where [] is optional and +X represents X hours ahead of UTC.
LogAgeNewestTime string `json:"logAgeNewestTime,omitempty"`

// +kubebuilder:validation:Optional
// +operator-sdk:csv:customresourcedefinitions:type=spec
// In order to facilitate diagnosing less recent problems, scrutinize should be able to collect an arbitrary time range of logs.
// With no options set, archived logs gathered for each node should be no older than 24 hours, and as recent as now.
// The hours param cannot be set alongside the Time options, and if attempted, should issue an error indicating so.
LogAgeHours int `json:"logAgeHours,omitempty"`

Copy link
Collaborator

@roypaulin roypaulin Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use annotations:

  • vertica.com/scrutinize-log-age-oldest-time
  • vertica.com/scrutinize-log-age-newest-time
  • vertica.com/scrutinize-log-age-hours

see package/meta/annotations.go

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Moved these parameters from spec to annotations.

Comment on lines 129 to 143
// +operator-sdk:csv:customresourcedefinitions:type=status
// +optional
// minimum timestamp for logs (default 24 hours ago)
LogAgeOldestTime string `json:"logAgeOldestTime,omitempty"`

// +operator-sdk:csv:customresourcedefinitions:type=status
// +optional
// maximum timestamp for logs (default none)
LogAgeNewestTime string `json:"logAgeNewestTime,omitempty"`

// +operator-sdk:csv:customresourcedefinitions:type=status
// +optional
// maximum timestamp for logs (default none)
LogAgeHours int `json:"logAgeHours,omitempty"`

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@@ -118,6 +121,37 @@ func (vscr *VerticaScrutinize) CopyAnnotations() map[string]string {
return annotations
}

// GenerateLogAgeTime returns a string in the format of YYYY-MM-DD HH [+/-XX]
func (vscr *VerticaScrutinize) GenerateLogAgeTime(hourOffset time.Duration, timeZone string) string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not need to be exported and does not need a receiver. It can just be a simple function because it is only used for testing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the receiver, but it needs to be exported as it's called in pkg/controllers/vscr/scrutinizepod_reconciler_test.go.

}

// ParseLogAgeTime converts YYYY-MM-DD HH [+/-XX] into time format in UTC
func (vscr *VerticaScrutinize) ParseLogAgeTime(logAgeTime string) time.Time {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return an error if time could not be parsed.

Also, this function doesn't need a receiver.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Added error handling and moved it to api/v1beta1/verticascrutinize_webhook.go as a simple function.

logAgeOldestTime := twentyFourHoursAgo

if vscr.Spec.LogAgeOldestTime != "" {
logAgeOldestTime = vscr.ParseLogAgeTime(vscr.Spec.LogAgeOldestTime)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test specifically for this function. You need to return value is as expected.

Comment on lines 162 to 164
if vscr.Spec.LogAgeHours != 0 && (vscr.Spec.LogAgeOldestTime != "" || vscr.Spec.LogAgeNewestTime != "") {
s.Log.Info("--log-age-hours cannot be set alongside the *-time options")
return cmd
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this, there is already a webhook rule.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Comment on lines 169 to 179
if vscr.Spec.LogAgeHours != 0 {
cmd = append(cmd, "--log-age-hours", strconv.Itoa(vscr.Spec.LogAgeHours))
} else {
if vscr.Spec.LogAgeOldestTime != "" {
cmd = append(cmd, "--log-age-oldest-time", vscr.Spec.LogAgeOldestTime)
}
if vscr.Spec.LogAgeNewestTime != "" {
cmd = append(cmd, "--log-age-newest-time", vscr.Spec.LogAgeNewestTime)
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any of these flags was added in 24.3.0+, we should check that the server version is 24.3.0+ before adding them to the command.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added supported version 24.2 in version.go and check in pkg/controllers/vscr/vdbverify_reconciler.go.

@@ -164,6 +187,7 @@ func (s *ScrutinizeCmdArgs) buildScrutinizeCmdArgs(vdb *v1.VerticaDB) []string {
// container and have scrutinize read the password from the mounted file
cmd = append(cmd, "--password-file", paths.ScrutinizeDBPasswordFile)
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@@ -147,13 +149,34 @@ func (s *ScrutinizePodReconciler) getHostList(pods []corev1.Pod) []string {
}

// buildScrutinizeCmdArgs returns the arguments of vcluster scrutinize command
func (s *ScrutinizeCmdArgs) buildScrutinizeCmdArgs(vdb *v1.VerticaDB) []string {
func (s *ScrutinizePodReconciler) buildScrutinizeCmdArgs(vdb *v1.VerticaDB, vscr *v1beta1.VerticaScrutinize) []string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to pass vscr. You can use s.Vscr

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Updated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a separate test case(folder) and leave collect-scrutinize unchanged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new test folder collect-scrutinize-log-age added for the log age related tests.

Copy link
Collaborator

@roypaulin roypaulin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • add 06-assert to the test
  • Any way to know the specified time range was applied from the collected files?

cmd = append(cmd, "--log-age-hours", strconv.Itoa(vmeta.GetScrutinizeLogAgeHours(s.Vscr.Annotations)))
} else {
if vmeta.GetScrutinizeLogAgeOldestTime(s.Vscr.Annotations) != "" {
cmd = append(cmd, "--log-age-oldest-time", s.Vscr.Annotations[vmeta.ScrutinizeLogAgeOldestTime])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cmd = append(cmd, "--log-age-oldest-time", s.Vscr.Annotations[vmeta.ScrutinizeLogAgeOldestTime])
cmd = append(cmd, "--log-age-oldest-time", vmeta.GetScrutinizeLogAgeOldestTime(s.Vscr.Annotations))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can manually rotate and check the time range of generated vertica.log files. But it's too long to wait because the minimum supported time range of these 3 annotations is 1 hour.

@roypaulin
Copy link
Collaborator

Add a changie entry. Ask Cai what it means.

timeOffsetFormatted := timeOffset.Format("2006-01-02") + " " + strconv.Itoa(timeOffset.Hour())

if timeZone != "" {
timeOffsetFormatted = timeOffsetFormatted + " " + timeZone
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
timeOffsetFormatted = timeOffsetFormatted + " " + timeZone
timeOffsetFormatted = fmt.Sprintf("%s %s", timeOffsetFormatted, timeZone)

// GenerateLogAgeTime returns a string in the format of YYYY-MM-DD HH [+/-XX]
func GenerateLogAgeTime(hourOffset time.Duration, timeZone string) string {
timeOffset := time.Now().Add(hourOffset * time.Hour)
timeOffsetFormatted := timeOffset.Format("2006-01-02") + " " + strconv.Itoa(timeOffset.Hour())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
timeOffsetFormatted := timeOffset.Format("2006-01-02") + " " + strconv.Itoa(timeOffset.Hour())
timeOffsetFormatted := fmt.Sprintf("%s %s", timeOffset.Format("2006-01-02"), strconv.Itoa(timeOffset.Hour()))

Comment on lines 111 to 119
if vinf.IsOlder(v1.ScrutinizeLogAgeVersion) {
ver, _ := s.Vdb.GetVerticaVersionStr()
s.VRec.Eventf(s.Vscr, corev1.EventTypeWarning, events.VclusterOpsScrutinizeNotSupported,
"The server version %s does not support scrutinize with a log time range. The minimum server version it supports is %s.",
ver, v1.ScrutinizeLogAgeVersion)
return s.updateStateAndScrutinizeReadyCondition(ctx, metav1.ConditionFalse,
events.VclusterOpsScrutinizeNotSupported, "NotReady:IncompatibleDB")
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block makes sense only when at least one of the logAge annotations is set. Otherwise we can skip this check.
Don't you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step fails in the e2e tests. Have you tried it locally?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test pass locally but failed in workflow due to insufficient cpu:
logger.go:42: 20:05:53 | collect-scrutinize-log-age | 2024-09-04 19:55:52 +0000 UTC Warning Pod vertica-scrutinize-log-time FailedScheduling 0/1 nodes are available: 1 Insufficient CPU.

Added 26-delete-vscr.yaml and 46-delete-vscr.yaml to remove old vscr to release resources.

@roypaulin
Copy link
Collaborator

Looks good! will approve once all tests pass

@qindotguan qindotguan merged commit 7e73016 into main Sep 6, 2024
33 checks passed
@qindotguan qindotguan deleted the qguan/vscr-expose-time-options branch September 6, 2024 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants