Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix error to create service metrics when run operator-sdk up local #2190

Merged
merged 2 commits into from
Jan 6, 2020
Merged

fix error to create service metrics when run operator-sdk up local #2190

merged 2 commits into from
Jan 6, 2020

Conversation

camilamacedo86
Copy link
Contributor

@camilamacedo86 camilamacedo86 commented Nov 13, 2019

Description of the change:
Skip service metrics when is running out o the cluster

Motivation for the change:

Closes: #1702

LOCAL TEST
Screenshot 2019-12-12 at 00 37 43

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Nov 13, 2019
@joelanford
Copy link
Member

I'm confused about the motivation for this. As far as I can tell, we already skip metrics service creation during up local

operator-sdk up local output from an operator built from master:

...
{"level":"info","ts":1573667590.2471445,"logger":"cmd","msg":"Could not generate and serve custom resource metrics","error":"operator run mode forced to local"}
{"level":"info","ts":1573667590.448851,"logger":"cmd","msg":"Could not create metrics Service","error":"failed to initialize service object for metrics: OPERATOR_NAME must be set"}
{"level":"info","ts":1573667590.6498501,"logger":"cmd","msg":"Could not create ServiceMonitor object","error":"no ServiceMonitor registered with the API"}
{"level":"info","ts":1573667590.6498818,"logger":"cmd","msg":"Install prometheus-operator in your cluster to create ServiceMonitor objects","error":"no ServiceMonitor registered with the API"}
...

Is this just about improving the log messages?

@camilamacedo86
Copy link
Contributor Author

camilamacedo86 commented Nov 13, 2019

Hi @joelanford,

We do not skip actually. We just throw the error which makes the users thinking that something is wrong so the idea here is to keep as it is done for the leader and make clear that it was not created and why instead of throw errors. See:

{"level":"info","ts":1573666348.370976,"logger":"leader","msg":"Skipping leader election; not running in a cluster."}
`

@camilamacedo86
Copy link
Contributor Author

camilamacedo86 commented Nov 13, 2019

HMO: Technically is a very simple change. It is a nit as you can see :-) But very helpful to not cause misunderstanding and improve the user experience. I mean, since the info does not clearly show a bug in the user point of view.

@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 13, 2019
@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 13, 2019
@openshift-ci-robot
Copy link

openshift-ci-robot commented Nov 13, 2019

@camilamacedo86: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/marker 74934d8 link /test marker
ci/prow/e2e-aws-ansible 74934d8 link /test e2e-aws-ansible

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@camilamacedo86 camilamacedo86 added metrics kind/bug Categorizes issue or PR as related to a bug. labels Nov 16, 2019
Copy link
Contributor

@JAORMX JAORMX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! I was a bit confused by this error message 😄

@camilamacedo86 camilamacedo86 requested a review from estroz December 2, 2019 17:16
@camilamacedo86 camilamacedo86 changed the title fix error to create service metrics when run operator-sdk up local wIp : fix error to create service metrics when run operator-sdk up local Dec 2, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 2, 2019
@camilamacedo86 camilamacedo86 changed the title wIp : fix error to create service metrics when run operator-sdk up local WIP : fix error to create service metrics when run operator-sdk up local Dec 2, 2019
@openshift-ci-robot openshift-ci-robot removed the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Dec 3, 2019
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 3, 2019
@camilamacedo86 camilamacedo86 requested a review from estroz December 4, 2019 12:53
@openshift-ci-robot openshift-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Dec 9, 2019
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 12, 2019
Copy link
Member

@estroz estroz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 12, 2019
@camilamacedo86 camilamacedo86 added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 14, 2019
Comment on lines +153 to +154
// Add the Metrics Service
addMetrics(ctx, cfg, namespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like addMetrics() is a small wrapper around serveCRMetrics() that does some extra logging and error handling, right?

IMO, another function in the main.go template for this is a bit much, and I think we could add the extra error handling in line in the body of the main function.

If I understand correctly, it seems like this PR could be reduced to:

        if err = serveCRMetrics(cfg); err != nil {
-               log.Info("Could not generate and serve custom resource metrics", "error", err.Error())
+               if errors.Is(err, k8sutil.ErrRunLocal) {
+                       log.Info("Skipping CR metrics server creation; not running in a cluster.")
+               } else {
+                       log.Info("Could not generate and serve custom resource metrics", "error", err.Error())
+               }
        }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logs/msgs came from more than one point in the original code (See serveCRMetrics, CreateMetricsService, CreateServiceMonitors). The idea was centralized/encapsulate the logic to add the metrics and do its improvement.

{"level":"info","ts":1563485392.3758454,"logger":"cmd","msg":"Could not generate and serve custom resource metrics","error":"operator run mode forced to local"}
{"level":"info","ts":1563485392.4013908,"logger":"cmd","msg":"Could not create metrics Service","error":"failed to initialize service object for metrics: OPERATOR_NAME must be set"}
{"level":"info","ts":1563485392.4066937,"logger":"cmd","msg":"Could not create ServiceMonitor object","error":"no ServiceMonitor registered with the API"}

In this way, just the above suggestion still not attending the expectations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see now. The diff was confusing me :)

CHANGELOG.md Outdated
@@ -17,6 +17,7 @@

### Bug Fixes

- Skip metrics creation when running the operator locally. ([#2190](https://github.com/operator-framework/operator-sdk/pull/2190))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already skip metrics creation when running locally, right? This PR is about logging a more informative message to help the user understand that skipping metrics creation is not an error, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current msgs/loggings give an impression that it is an error when is not.
I improved the description and move to changed section in the changelog.

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 6, 2020
Copy link
Member

@joelanford joelanford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 6, 2020
@camilamacedo86 camilamacedo86 merged commit 81c6c8a into operator-framework:master Jan 6, 2020
@camilamacedo86 camilamacedo86 deleted the fix-metrics-local branch January 6, 2020 17:15
qu1queee added a commit to qu1queee/build that referenced this pull request Feb 19, 2021
This code was generated as part of the scaffold when using the sdk. This
code was intended to mitigate a user error observed when using
"operator-sdk up local" cmd.

See operator-framework/operator-sdk#2190 for
more information.

We do not longer need this code. The metrics endpoint for prometheus is
initialized by controller-runtime in the background.

Signed-off-by: Matthias Diester <matthias.diester@de.ibm.com>
qu1queee added a commit to qu1queee/build that referenced this pull request Feb 19, 2021
This code was generated as part of the scaffold when using the sdk. This
code was intended to mitigate a user error observed when using
"operator-sdk up local" cmd.

See operator-framework/operator-sdk#2190 for
more information.

We do not longer need this code. The metrics endpoint for prometheus is
initialized by controller-runtime in the background.

Signed-off-by: Matthias Diester <matthias.diester@de.ibm.com>
qu1queee added a commit to qu1queee/build that referenced this pull request Feb 22, 2021
This code was generated as part of the scaffold when using the sdk. This
code was intended to mitigate a user error observed when using
"operator-sdk up local" cmd.

See operator-framework/operator-sdk#2190 for
more information.

We do not longer need this code. The metrics endpoint for prometheus is
initialized by controller-runtime in the background.

Signed-off-by: Matthias Diester <matthias.diester@de.ibm.com>
qu1queee added a commit to qu1queee/build that referenced this pull request Feb 22, 2021
This code was generated as part of the scaffold when using the sdk. This
code was intended to mitigate a user error observed when using
"operator-sdk up local" cmd.

See operator-framework/operator-sdk#2190 for
more information.

We do not longer need this code. The metrics endpoint for prometheus is
initialized by controller-runtime in the background.

Signed-off-by: Matthias Diester <matthias.diester@de.ibm.com>
qu1queee added a commit to qu1queee/build that referenced this pull request Feb 22, 2021
This code was generated as part of the scaffold when using the sdk. This
code was intended to mitigate a user error observed when using
"operator-sdk up local" cmd.

See operator-framework/operator-sdk#2190 for
more information.

We do not longer need this code. The metrics endpoint for prometheus is
initialized by controller-runtime in the background.

Signed-off-by: Matthias Diester <matthias.diester@de.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metrics Errors when running operator-sdk up local
5 participants