-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: creation of monitor service when operator is cluster-scoped #2601
fix: creation of monitor service when operator is cluster-scoped #2601
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this should go into the version migration guide, since it requires users to change a scaffolded file.
New changes are detected. LGTM label has been removed. |
Hi @estroz, Note that in this specific scenario I am planning to do more changes in the metrics scaffold see: #2603 and #2606 So, I'd like to add all in the same section in the migration guide. Could we agree in move forward with the following PR in this case? The following PR is: https://github.com/operator-framework/operator-sdk/pull/2625/files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing the correct namespace to addMetrics
instead of interpolating it within the function is a better way to handle cluster-scoped operator metrics issues, as @shawn-hurley proposed.
update project mockdata with the current scaffold CHANGELOG CHANGELOG
Done changes requested by @estroz. Dimissing his review because he is in PTO
HI @estroz and @shawn-hurley, Your suggestion here #2601 (comment) and: Cannot be done Following the reasons: 1 - The Operator namespace is not required before we call the addMetrics, so, we have not this info before that. operator-sdk/pkg/k8sutil/k8sutil.go Line 65 in 4d5986e
3 - Note that ns that has been passed for it before this change is NOT the ns that we need in the metrics. |
Hi @estroz and @shawn-hurley, I am moving forward with this one and I hope that the #2601 (comment) clarifies why we cannot address your suggestion. However, please feel free to reach me out if you think that some following up pr should be done here. |
_, err = metrics.CreateServiceMonitors(cfg, namespace, services) | ||
|
||
// Get the namespace the operator is currently deployed in. | ||
operatorNs, err := k8sutil.GetOperatorNamespace() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@camilamacedo86 - Shouldn't https://github.com/operator-framework/operator-sdk/pull/2601/files#diff-6438d7f67aed0d3fd6727a1f27133faeR237 be updated to invoke k8sutil.GetWatchNamespace()
if the CRD is cluster-scoped and WATCH_NAMESPACE
is set to ""
? When using cluster-scoped CRDs, the CRs themselves don't live in the namespace returned by k8sutil.OperatorNamespace()
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, the metrics service is created in the same namespace where the operator is deployed. See: service/memcached-operator-metrics
$ kubectl get all -n memcached
NAME READY STATUS RESTARTS AGE
pod/example-memcached-7c4df9b7b4-lzd6j 1/1 Running 0 64s
pod/example-memcached-7c4df9b7b4-wbtkz 1/1 Running 0 64s
pod/example-memcached-7c4df9b7b4-wt6jb 1/1 Running 0 64s
pod/memcached-operator-56f54d84bf-zrtfv 1/1 Running 0 69s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/example-memcached ClusterIP 10.108.124.47 <none> 11211/TCP 63s
service/memcached-operator-metrics ClusterIP 10.108.67.82 <none> 8383/TCP,8686/TCP 66s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/example-memcached 3/3 3 3 65s
deployment.apps/memcached-operator 1/1 1 1 70s
NAME DESIRED CURRENT READY AGE
replicaset.apps/example-memcached-7c4df9b7b4 3 3 3 65s
replicaset.apps/memcached-operator-56f54d84bf 1 1 1 70s
I think you are trying to raise here the scenario where the service/CR
should be created in the namespace where the CR is applied. Am I right? I am raising an issue to check this scenario and see how better we can address it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See; #2707
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. I'm not talking about the creation of the actual Kubernetes Service and ServiceMonitor resources. I'm talking about the way in which the Go code that serves the custom resource metrics tracks which custom resources to report on. After filtering the GVKs, serveCRMetrics
restricts which namespaces should be searched for custom resources:
// Get the namespace the operator is currently deployed in.
operatorNs, err := k8sutil.GetOperatorNamespace()
if err != nil {
return err
}
// To generate metrics in other namespaces, add the values below.
ns := []string{operatorNs}
// Generate and serve custom resource specific metrics.
err = kubemetrics.GenerateAndServeCRMetrics(cfg, ns, filteredGVK, metricsHost, operatorMetricsPort)
if err != nil {
return err
}
The call to k8sutil.GetOperatorNamespace()
, returns the name of the namespace in which the operator is running, but cluster-scoped resources don't belong to a namespace. I think the call to k8sutil.GetOperatorNamespace()
need to be replaced with a call to k8sutil. GetWatchNamespace()
. Only then will the namespace against which to filter CRs be correctly set to ""
when using cluster-scoped CRDs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @robbie-demuth,
Note that it was NOT really changed in this PR. Before that, it was using the OperatorNamespace. See; https://github.com/operator-framework/operator-sdk/blob/v0.15.x/internal/scaffold/cmd.go#L212. So, let's discuss your suggestion and check it via issue #2707
c/c @estroz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. I just commented here because this PR was listed as a fix for #1858 and, without the update to serveCRMetrics
, the issue isn't really resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that issue #1858 can be caused by many raisons and we addressed the diff scenarios in more than one pr. If you check the current scaffold you will see:
// The metrics will be generated from the namespaces which are returned here.
// NOTE that passing nil or an empty list of namespaces in GenerateAndServeCRMetrics will result in an error.
ns, err := kubemetrics.GetNamespacesForMetrics(operatorNs)
if err != nil {
return err
}
// Generate and serve custom resource specific metrics.
err = kubemetrics.GenerateAndServeCRMetrics(cfg, ns, filteredGVK, metricsHost, operatorMetricsPort)
if err != nil {
return err
}
Then, it has a clear msg over what is expected in this place. NOTE that passing nil or an empty list of namespaces in GenerateAndServeCRMetrics will result in an error.
and then, because of this, we cannot use the WATCH_NAMESPACE in this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated one of my projects to call k8sutil.GetWatchNamespace()
instead of k8sutil.GetOperatorNamespace()
and it does work. It's important to note that if WATCH_NAMESPACE
is set to ""
, the list of namespaces passed to GenerateAndServeCRMetrics
won't be nil or empty. Instead it'll be equivalent to []string{""}
Description of the change:
ServiceMonitor
creation when the operator is cluster-scoped and the environment variableWATCH_NAMESPACE
has not as value the namespace where the operator was deployed.Motivation for the change:
Closes: #2602
See: #2494 (comment)