Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add GrafanaLibraryPanel CRD and reconciler #1858

Merged
merged 17 commits into from
Feb 27, 2025

Conversation

diurnalist
Copy link
Contributor

@diurnalist diurnalist commented Feb 11, 2025

this partially implements the feature proposed in #1611, which allows the operator to manage library panels just as it manages dashboards.

this is a large PR but i have made an attempt at breaking it into smaller pieces, first adding the CRD types, then the reconciler implementation, then a few refactor commits to make things cleaner and re-use or update existing abstractions.

my first approach was to copy the Dashboard reconciler, and I actually had that finished 😅 before realizing that the newer controllers were probably better to take inspiration from, as they seem to use more shared and modern patterns. so i rewrote it to share more with the alert* reconcilers.

library panels are a bit weird in that they have the concept of update vs. create, which we have to understand in the reconciler. otherwise they are very similar to dashboards and i worked to port most of the logic that dashboards have around setting custom UIDs and otherwise falling back to what is defined on the content model JSON.

this does not implement anything fancy around linking library panels to dashboards via references. i think that's an area of future improvement for this, so you can loosely couple dashboard and library panels when both are configured as code.

@github-actions github-actions bot added the documentation Issues relating to documentation, missing, non-clear etc. label Feb 11, 2025
@diurnalist diurnalist force-pushed the f/library-panels-2 branch 3 times, most recently from 4e48b30 to 7488d2e Compare February 12, 2025 02:15
@diurnalist diurnalist marked this pull request as ready for review February 12, 2025 02:22
@diurnalist
Copy link
Contributor Author

i did not add unit tests for the reconciler, because it was not clear to me how to effectively implement them, and there were no other examples in the repo of tests for reconcilers. i am continuing to do some testing in my own cluster and will think about unit tests we could implement.

@diurnalist
Copy link
Contributor Author

howdy @weisdd @NissesSenap - just wanted to ping on this, i know it is a larger PR. please let me know if you would prefer breaking it into smaller PRs rather than reviewing commit-by-commit. i'm highly motivated to finish the work so that others in the community can take advantage of this feature and i can remove the need for our custom fork of the operator!

@theSuess
Copy link
Member

Hey @diurnalist - I'll take a look today. We haven't had a chance to look at this yet due to us working on other projects as well 😅

Copy link
Member

@theSuess theSuess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough PR!

The overall structure looks good, but I see potential for simplifying some operations as some of the patterns you replicated are only there because of legacy reasons.

If you have any questions, feel free to @ me directly!

Copy link
Contributor Author

@diurnalist diurnalist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weisdd @theSuess thanks very much for the review! re: other work demands, totally understand :)

i will try to address all the feedback today or tomorrow.

diurnalist and others added 9 commits February 19, 2025 10:23
this adds a new GrafanaLibraryPanel CRD, updates all the auto-generated
assets derived from the CRD type, and updates existing RBAC and Helm
definitions to reference the new type.

Co-authored-by: Matt Delaney <matt.delaney@getcruise.com>
this registers the new CRD with the Grafana CRD so that we can track
which library panels have been provisioned with the operator, similar
to how other resources are tracked.

Co-authored-by: Matt Delaney <matt.delaney@getcruise.com>
Co-authored-by: Matt Delaney <matt.delaney@getcruise.com>
much of the content resolving/fetching logic is shared b/w dashboards
and library panels, but there is one awkward thing where library panels
are not currently (to my knowledge) available via grafana.com. rather
than extract just that fetch logic out and only apply it to dashboards,
i've opted to simply disable them as a source. an error will be issued
if a user tries to configure this field. it is not the most graceful
way to handle this, as it's not communicated via the CR validation, but
given the effort required to bend the code to do this "right", and
given the likelihood of somebody trying to configure a library panel
via a grafana.com url (which does not exist right now), we are choosing
the easy button.
to enable dashboards and library panels to track their fetch operations
separately, split up the metrics by resource. this is accomplished by
adding a new interface method to the GrafanaContentResource, and the
returned struct holds references to the relevant metrics, if any.

this also adds an additional guard to the http round tripper so that if
no metric is defined, we don't panic.
since grafana#1856, the api module is separated. this poses problems
with the approach to have the CRs expose which metrics correspond
to their types via the GrafanaContentMetrics() function, because
we create an import loop.

recognizing that we're measuring essentially the same thing b/w
library panels and dashboards, this updates the metrics to indicate
this more explicitly.

the following changes are made to metrics:

  `grafana_operator_dashboard_requests`:
    * renamed to `grafana_operator_content_requests`, but will always have a label
      `kind="GrafanaDashboard"`.
    * renamed `dashboard` label to `resource`

  `grafana_operator_dashboard_revision_requests`:
    * renamed to `grafana_operator_revision_requests`, but will always have a label
      `kind="GrafanaDashboard"`.
    * renamed `dashboard` label to `resource`
@diurnalist
Copy link
Contributor Author

@theSuess i think i've addressed the requested changes! unfortunately i had to add a new commit that changes a couple of metrics due to the api module being isolated, which added some constraints about what it should depend on.

@diurnalist diurnalist requested a review from theSuess February 20, 2025 20:02
Copy link
Member

@theSuess theSuess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two more comments, looks good otherwise! Thanks for being on top of this

to make the dashboard url request metric backwards compatible, this
updates the signature of the roundtripper to be a bit more general,
so we can pass a function that is called at the end of the response
rather than explicitly increment a metric. this allows us to increment
multiple metrics in the case of the dashboard url request, one for the
old and one for the new.
@diurnalist diurnalist requested a review from theSuess February 21, 2025 17:21
@theSuess theSuess added the feature this PR introduces a new feature label Feb 24, 2025
@theSuess
Copy link
Member

I tried my hand at simplifying the metrics logic and reconciler a bit more. Since this now includes code of mine as well, I'd appreciate a review from another maintainer as well

Copy link
Contributor Author

@diurnalist diurnalist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplification looks good :) - i like simply updating the grafana CR status even if there was an error, i guess that's how it should be.

Copy link
Collaborator

@Baarsgaard Baarsgaard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes and some usability considerations, great work on the implementation!

@diurnalist
Copy link
Contributor Author

@Baarsgaard thank you!
@theSuess i can integrate these comments and also fix the lint issues.

@diurnalist
Copy link
Contributor Author

@Baarsgaard thanks for the tips about the error handling. i have moved the validation checks earlier and we are now registering the InvalidSpec condition if those checks fail. also applied cleanup to the other points you mentioned :)

diurnalist and others added 2 commits February 26, 2025 08:34
Co-authored-by: Steffen Baarsgaard <steff.bpoulsen@gmail.com>
@theSuess theSuess added this pull request to the merge queue Feb 27, 2025
Merged via the queue into grafana:master with commit 9bd4baa Feb 27, 2025
15 checks passed
@diurnalist
Copy link
Contributor Author

diurnalist commented Feb 27, 2025

thanks everybody for the detailed reviews and timely feedback ❤️ . i learned many new things about good operator coding practices and patterns.

as a next step i will write up some docs and add more examples to illustrate how this capability can work.

@diurnalist diurnalist deleted the f/library-panels-2 branch February 27, 2025 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Issues relating to documentation, missing, non-clear etc. feature this PR introduces a new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants