Releases · Cloudzero/cloudzero-charts

19 Mar 00:33

github-actions

1.1.0-beta-1

7f74691

1.1.0-beta-1

1.1.0-beta-1 (2025-03-18)

Initial (beta) release of the new CloudZero Aggregator.

Upgrade Steps

Upgrade with:

```
helm upgrade --install -n cloudzero-agent cloudzero-beta -f configuration-example.yaml
```

See the beta installation instructions for further detail.

Bug Fixes

Update nodeSelector settings: The nodeSelector is now available for the initCertJob and initBackfillJob jobs.
nodeSelector, tolerations, and affinity settings moved: These settings have now moved to the insightsController.server section.
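
As a rough illustration of the relocated settings, the override below places node scheduling options under the insightsController.server section. The key path comes from the note above; the label, taint, and architecture values are hypothetical placeholders, so treat this as a sketch rather than a recommended configuration.

```yaml
insightsController:
  server:
    # Schedule only onto Linux nodes (well-known Kubernetes node label).
    nodeSelector:
      kubernetes.io/os: linux
    # Tolerate a hypothetical "dedicated=monitoring" taint.
    tolerations:
      - key: dedicated
        operator: Equal
        value: monitoring
        effect: NoSchedule
    # Require amd64 nodes (illustrative constraint only).
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/arch
                  operator: In
                  values: ["amd64"]
```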

Improvements

CloudZero Aggregator: The CloudZero Aggregator (affectionately known as "The Gator") is a new component that sits between the CloudZero Agent and the CloudZero Platform. The Gator aggregates metrics into a local cache before sending them in larger batches to the CloudZero Platform. This provides substantial improvements in reliability, performance, disaster recovery, user-friendliness, and more.
Reduce scrape interval: The scrape interval was previously set to every 2 minutes; it has been reduced to every 1 minute.

18 Mar 20:51

github-actions

1.0.2

da45fd8

1.0.2

Release 1.0.2 (2025-03-18)

This release fixes an issue with Helm chart templating and improves the sampling rate of the Prometheus agent.

Upgrade Steps

Upgrade using the following command:

```
helm upgrade --install <RELEASE_NAME> cloudzero/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.2
```

Bug Fixes

Node Scheduling Settings Fixed: Fixes an issue in which the initCertJob did not have the option to set nodeSelector, affinity, or tolerations. Additionally, these settings can now be set for each initialization Job individually.
Values File Documentation Fixed: Fixes an issue in which the node scheduling settings for the insightsController were indented to the wrong level.
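
A sketch of what per-Job scheduling settings might look like in an override file. The initCertJob and initBackfillJob names appear in these release notes; their placement at the top level of the values file, and every value shown, are assumptions made for illustration.

```yaml
# Assumed to be top-level keys in the values file.
initCertJob:
  nodeSelector:
    kubernetes.io/os: linux
  tolerations:
    - key: dedicated          # hypothetical taint key
      operator: Exists
      effect: NoSchedule

initBackfillJob:
  # Each initialization Job can carry its own settings.
  nodeSelector:
    kubernetes.io/os: linux
  affinity: {}                # left empty here; set only if needed
```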

Improvements

Default Scrape Interval Set to 60s: The default scrape_interval setting used by the internal Prometheus agent is updated from 120s to 60s. This improvement makes it more likely that the agent captures usage information for short-lived pods.
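
For reference, scrape_interval is a standard Prometheus configuration field. The fragment below is a minimal sketch of the relevant part of a Prometheus configuration; how the chart exposes or renders this value is not stated in these notes.

```yaml
# Prometheus configuration fragment (illustrative)
global:
  # A pod that lives for less than one interval may never be sampled,
  # so a shorter interval captures more short-lived pods.
  scrape_interval: 60s   # previously 120s
```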

11 Mar 17:18

github-actions

1.0.1

8366729

1.0.1

Release 1.0.1 (2025-03-02)

This release fixes two issues relating to template rendering and TLS certificate generation, and adds documentation for Istio-enabled clusters. It also includes fixes and improvements around Prometheus metrics, logging, and SQLite.

Upgrade Steps

Upgrade using the following command:

```
helm upgrade --install <RELEASE_NAME> cloudzero/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.1
```

Bug Fixes

Webhook Resource Names Trimmed Appropriately: Fixes an issue in which the name used by webhook resources adds a suffix after trimming, which can potentially allow resource names that violate Kubernetes naming rules.
Certificate Generation Runs For All Webhook Configuration Changes: Fixes an issue in which the TLS certificate generation initialization Job does not run if a ValidatingWebhookConfiguration is created after initial installation.
Invalid Prometheus Metric Label Name: Fixes an issue where supplying an invalid label name to a Prometheus metric causes a panic.
Utilization of Default Kubernetes Logger: Removes the last use of the default Kubernetes logger, which caused logging levels defined in the configuration to be ignored.

Improvements

Shorter TTL for init-cert Job: The init-cert Job is now cleaned up after 5 seconds, so that repeated installations regenerate certificates as needed.
Improvements to SQLite Testing: The SQLite connection string was edited for improved clarity, and a concurrency test was added.
Various Logging Changes: Some logging messages were downgraded from info to debug.

17 Feb 13:53

github-actions

1.0.0-rc4

33babe6

1.0.0-rc4

Release 1.0.0-rc4 (2025-02-16)

This release makes improvements to the certificate initialization Job so that more invalid states can be rectified. Additionally, annotations can now be added to initialization Jobs. Expiration of both initialization Jobs is now configurable.

Upgrade Steps

Upgrade using the following command:

```
helm upgrade --install <RELEASE_NAME> cloudzero/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-rc4
```

See upgrades.md for full documentation of upgrade behavior as it relates to initialization Jobs.

Improvements

Certificate Initialization Job Checks For More Invalid Conditions: The certificate initialization job now checks for certificates with invalid SAN settings, mismatches between webhook configurations, and mismatches between the webhook caBundle value and the ca.crt value in the TLS secret.
Automatic Job Cleanup Configuration: TTL for both initialization Jobs is now configurable, and defaults to 180 seconds.
Initialization Job Annotation Support: Both initialization Jobs allow the user to set annotations. This was specifically added to make management via ArgoCD easier, as ArgoCD will consider expired Jobs to be OutOfSync with the release source. See upgrades.md for details on recommended annotations.
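
To make the TTL and annotation behavior concrete, here is a sketch of how a rendered initialization Job could look. The ttlSecondsAfterFinished field and the ArgoCD hook annotations are real Kubernetes and ArgoCD features, but the Job name, image, and the particular annotations chosen are assumptions; upgrades.md remains the authority on the recommended annotations.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-init-job                 # hypothetical name
  annotations:
    # Illustrative choice: treat the Job as an ArgoCD hook so that its
    # deletion after completion is not reported as OutOfSync.
    argocd.argoproj.io/hook: Sync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  # The Job is removed 180 seconds after it finishes (the stated default).
  ttlSecondsAfterFinished: 180
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: init
          image: example.invalid/init:latest   # placeholder image
```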

17 Feb 15:50

github-actions

1.0.0

c62afbb

1.0.0

Release 1.0.0 (2025-02-17)

This release introduces native Kubernetes Labels and Annotations support to the CloudZero platform. You can now identify Kubernetes dimensions based on the Labels and Annotations used in your Kubernetes deployments.

New Features

Kubernetes Labels and Annotations: Enhance your ability to categorize and manage resources by leveraging Labels and Annotations directly within the CloudZero platform.

Configuration Changes

To take advantage of these new features, update your Helm chart configuration as outlined below.

Example `example-override-values.yaml` File:

```
# -- UNCHANGED: Cloud Service Provider Account ID
#    This must be a string - even if it is a number in your system.
#    Adding a new line here is an easy workaround.
cloudAccountId: |-
  null

# -- UNCHANGED: The Cluster name
clusterName: null

# -- UNCHANGED: The Cloud Service Provider Region
region: null

# -- UNCHANGED: CloudZero API key. Required if existingSecretName is null.
apiKey: null

# -- UNCHANGED: If set, the agent will use the API key in this Secret to authenticate with CloudZero.
existingSecretName: null

# -- NEW: Flag to deploy the Jetstack.io "cert-manager". Most environments will already have this deployed,
#    so set this to "false" if applicable. Otherwise, setting this to "true" is a quick way to get started.
#    See the README for more information.
cert-manager:
  # -- DEFAULT: enabled (true). Accepts true or false.
  enabled: true

# -- NEW: Service Account used for the Insights Controller
#    The account is required. If you already have an existing account, set the name in the field below.
serviceAccount:
  # -- DEFAULT: create the service account (true). Accepts true or false.
  create: true
  name: ""
  annotations: {}

# -- NEW: Label and Annotation Configuration
insightsController:
  # -- By default (true), a ValidatingAdmissionWebhook will be deployed to record all created labels and annotations.
  #    Accepts true or false.
  enabled: true
  labels:
    # -- DEFAULT: enabled (true). Accepts true or false.
    enabled: true
    # -- This value MUST be set to a list of regular expressions used to gather labels from pods,
    #    deployments, statefulsets, daemonsets, cronjobs, jobs, nodes, and namespaces.
    patterns:
      # List of Go-style regular expressions used to filter desired labels.
      # Caution: The CloudZero system has a limit of 300 labels and annotations,
      # so it is advisable to provide a specific list of required labels.
      - '.*'
  annotations:
    # -- DEFAULT: disabled (false). Accepts true or false.
    enabled: false
    patterns:
      # List of Go-style regular expressions used to filter desired annotations.
      # Caution: The CloudZero system has a limit of 300 labels and annotations,
      # so it is advisable to provide a specific list of required annotations.
      - '.*'
```
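
Because of the 300 label and annotation limit mentioned in the comments, a narrower pattern list is usually preferable to '.*'. The sketch below shows one way to restrict collection; the label names are hypothetical examples, not a recommended set.

```yaml
insightsController:
  labels:
    enabled: true
    patterns:
      # Match only two well-known app labels (Go-style regular expression).
      - '^app\.kubernetes\.io/(name|component)$'
      # Match a hypothetical organization-specific label exactly.
      - '^team$'
  annotations:
    enabled: false
```

Anchoring each expression with ^ and $ avoids accidentally matching longer keys that merely contain the same text.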

Upgrade Instructions

If you have an existing CloudZero Agent deployment, follow these steps to upgrade:

Define the values.yaml Override Configuration:

Ensure your values.yaml override configuration includes the new settings outlined above. Note that some existing values may no longer be necessary.

Update the Helm Chart Repository:

```
helm repo add cloudzero https://cloudzero.github.io/cloudzero-charts
helm repo update
```

Upgrade the Deployment:
```
helm upgrade --install <YOUR_RELEASE_NAME> -n <YOUR_NAMESPACE> cloudzero/cloudzero-agent -f override-values.yaml
```
Replace <YOUR_RELEASE_NAME> with the name you used to release the chart into your environment.

Replace <YOUR_NAMESPACE> with the namespace you used for your deployment.

Deprecations and Breaking Changes

node-exporter Deprecation:

The node-exporter has been deprecated and is no longer used.

External kube-state-metrics Deprecation:

External kube-state-metrics has been deprecated. We now deploy an instance within the CloudZero Agent deployment named cloudzero-state-metrics, which is not discoverable by other monitoring platforms and ensures the necessary configuration is defined for telemetry collection requirements. If you host the images in a private image repository, you can override the following in the values.yaml file:
```
kubeStateMetrics:
  image:
    registry: registry.k8s.io
    repository: kube-state-metrics/kube-state-metrics
```
API Key Management Argument Relocation:
- API key management arguments have moved to the global section.
- Previously, you could pass an apiKey or existingSecretName argument directly to the chart.
- These arguments should now be passed as global.apiKey and global.existingSecretName, respectively.
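
A minimal sketch of the relocated arguments in an override file. The global.apiKey and global.existingSecretName paths come from the list above; the Secret name is a hypothetical placeholder.

```yaml
global:
  # Set the key directly, or leave it null and reference a Secret instead.
  apiKey: null
  # Name of an existing Secret holding the API key (hypothetical name).
  existingSecretName: my-cloudzero-api-key
```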

Security Scan Results

| Image | Scanner | Scan Date | Critical | High | Medium | Low | Negligible |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ghcr.io/cloudzero/cloudzero-insights-controller/cloudzero-insights-controller:0.1.0 | Grype | 2024-12-23 | 0 | 0 | 0 | 0 | 0 |
| ghcr.io/cloudzero/cloudzero-agent-validator/cloudzero-agent-validator:0.10.0 | Grype | 2024-12-23 | 0 | 0 | 0 | 0 | 0 |

Summary of Changes:

Typos and Grammar:
- Corrected "Annotaitons" to "Annotations".
- Ensured consistent use of "Go-style" instead of "golang style".
Clarity and Consistency:
- Enhanced section headings for better readability.
- Clarified comments within the YAML example for better understanding.
- Ensured consistent capitalization of terms like "Labels" and "Annotations".
Formatting:
- Fixed indentation in the kubeStateMetrics YAML snippet.
- Improved bullet points and indentation for better visual structure.
- Ensured code blocks and commands are clearly separated from the text.
Additional Notes:
- Added clearer instructions in the deprecation section for kube-state-metrics.
- Maintained consistent terminology and formatting throughout the document.

14 Feb 17:26

github-actions

1.0.0-rc3

2a9251e

1.0.0-rc3

Release 1.0.0-rc3 (2025-02-13)

This release makes improvements to the upgrade process as it relates to management of the initialization Jobs.

Upgrade Steps

This upgrade should be force installed: users managing the release with Helm directly should include the --force flag when upgrading or, alternatively, uninstall and reinstall the Helm release. Users managing the release with tools such as ArgoCD should choose an upgrade strategy that performs a full replacement.

Upgrade using the following command:

```
helm upgrade --install <RELEASE_NAME> cloudzero/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-rc3 --force
```

See upgrades.md for full documentation of upgrade behavior as it relates to initialization Jobs.

Improvements

Certificate Initialization Job Runs Every Upgrade: The certificate initialization job now runs on every upgrade and does a better job of ensuring that the certificate is generated correctly and is being used. This means that the --force flag used in the helm upgrade command will always create a new certificate. Running helm upgrade without --force will not regenerate the certificate.
Automatic Job Cleanup: Both initialization jobs are now automatically cleaned up after a period of time, which ensures that Jobs are rerun when appropriate.
Certificate Initialization Job ClusterRole: The certificate initialization job now has a dedicated ClusterRole, ClusterRoleBinding, and ServiceAccount. This is done to separate required permissions and only grant PATCH permission to a very narrow resource scope.
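
To illustrate what a narrowly scoped PATCH permission can look like, here is a sketch of a ClusterRole restricted to a single named resource. The resource type, the resource name, and the role name are assumptions; the notes do not state which resources the chart's ClusterRole actually covers.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-init-cert                  # hypothetical name
rules:
  - apiGroups: ["admissionregistration.k8s.io"]
    resources: ["validatingwebhookconfigurations"]   # assumed target
    # resourceNames limits the rule to one specific object.
    resourceNames: ["example-webhook"]     # hypothetical name
    verbs: ["patch"]
```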

12 Feb 16:07

github-actions

1.0.0-rc2

386cf41

1.0.0-rc2

Release 1.0.0-rc2 (2025-02-12)

This release fixes an issue in which the internal TLS certificate could create a SAN field with an incorrect service address.

Upgrade Steps

Upgrade using the following command:

```
helm upgrade --install <RELEASE_NAME> cloudzero/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-rc2
```

Bug Fixes

SAN Field Properly Formatted: Previously, users installing the agent in a non-default namespace who were also using the internal TLS certificate generation may have run into an issue in which the certificate is improperly generated. The template now takes the release namespace into account.

29 Jan 20:55

github-actions

1.0.0-rc1

47b7c46

1.0.0-rc1

Release 1.0.0-rc1 (2025-01-23)

This release contains several improvements from 1.0.0-beta-10:

The name of the initialization Job that gathers information about existing state of a cluster now includes the version of the chart and the image tag used in the Pod.
The initScrapeJob field is deprecated in favor of initBackfillJob. However, this is not a breaking change; initScrapeJob can still be used without issue.
The server.agentMode boolean argument is now provided.
Improvements are made to the resource consumption of the agent-server pod.
Metrics from the agent-server pod are made available for monitoring.

Upgrade Steps

Optionally rename the initScrapeJob field in any override files to initBackfillJob. initBackfillJob is the preferred field, but configurations using initScrapeJob will still work.
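
The rename can be sketched as follows. The enabled field is a hypothetical placeholder used only to give the section some content; substitute whatever settings your override file already contains.

```yaml
# Before (deprecated, still accepted):
# initScrapeJob:
#   enabled: true        # hypothetical field, for illustration only
#
# After (preferred):
initBackfillJob:
  enabled: true          # hypothetical field, for illustration only
```

According to these notes, if both sections are present their values are merged, with initBackfillJob taking precedence.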

Upgrade using the following command:

```
helm upgrade --install <RELEASE_NAME> cloudzero/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-rc1
```

Improvements

Initialization Job Name Changes With Releases: It was previously possible to have failures in release upgrades if the container image used in the Job changed. This is because the image field in a Job spec is immutable. To prevent this, a new Job is created every time the Helm chart version is changed and/or when the image used in the Job is changed. This also ensures that changes to the underlying insights-controller application will be used in the new backfill of existing cluster state data.
Clarified Field Names: The Job used for gathering existing cluster data was previously controlled via a field named initScrapeJob. This is an overloaded term given that this chart also uses the term "scrape job" in the context of Prometheus. This has caused some confusion, so the field is now renamed to initBackfillJob. initScrapeJob is still usable, and values from initScrapeJob are merged with initBackfillJob with the latter having precedence.
Easier Debugging: The server.agentMode field can be toggled to false; by default it is set to true so that the Prometheus server runs in agent mode to keep resource usage manageable. Setting the field to false takes the Prometheus server out of agent mode. This is helpful for debugging issues with the Prometheus agent-server.
Resource Consumption Reduction: The Prometheus scrape job used to gather metrics from the insights-controller pods now restricts the metrics scraped to ones explicitly set in the values.yaml. This means that the internal TSDB must hold less data.
Improved Observability: The agent-server now scrapes itself for metrics and exports them for monitoring by the CloudZero platform. This means that issues within a cluster can be detected much sooner and with greater visibility into the cause of the issue.
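
For the debugging case described under Easier Debugging, a minimal override might look like this. Only the server.agentMode field is taken from these notes.

```yaml
server:
  # Default is true (Prometheus runs in agent mode).
  # Set to false temporarily while debugging the agent-server.
  agentMode: false
```

Since agent mode exists to keep resource usage manageable, it is probably worth switching the field back to true once debugging is finished.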

17 Jan 17:36

github-actions

1.0.0-beta-10

fc93959

1.0.0-beta-10

Release 1.0.0-beta-10 (2025-01-17)

This release adds logic to ensure that the static target used in the env-validator and in the Prometheus configuration always matches the internal Service created by the kube-state-metrics subchart.

Upgrade Steps

Upgrade using the following command:

```
helm upgrade --install <RELEASE_NAME> cloudzero-beta/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-beta-10
```

Improvements

Static Target and KSM Service Always Match: Both the env-validator and the Prometheus agent require an address for a kube-state-metrics Service. By default, the Service name generated by the kube-state-metrics subchart matches the target value generated by the chart.

However, if the user overrides the name of the kube-state-metrics Service using kubeStateMetrics.fullnameOverride, there can be a mismatch between the names. This change attempts to mirror the logic used by the internal kube-state-metrics chart so that the target and Service names will match regardless of user input.
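
The override scenario described above would look roughly like this in a values file. The kubeStateMetrics.fullnameOverride path is taken from the text; the name itself is a hypothetical example.

```yaml
kubeStateMetrics:
  # Custom Service name; with this release the static target used by the
  # env-validator and Prometheus is expected to follow it.
  fullnameOverride: my-cluster-ksm   # hypothetical name
```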

15 Jan 22:32

github-actions

1.0.0-beta-9

2470ca7

1.0.0-beta-9

Release 1.0.0-beta-9 (2025-01-15)

This release adds the ability to set the log level via the insightsController.server.logging.level field. Additionally, the interval at which data is written to the CloudZero platform and the timeout for writing data are configurable via insightsController.server.send_interval and insightsController.server.send_timeout, respectively. The default timeout is increased from 10s to 1m.

The kube-state-metrics subchart section now explicitly includes container image information. This introduces no functional changes; it is intended to make it clearer to the user which images will be used and from where they will be pulled.
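
A sketch of the new server settings in an override file. The three field paths come from the description above, and 1m is the stated default for send_timeout; the log level and send_interval values are illustrative assumptions, since their defaults are not given here.

```yaml
insightsController:
  server:
    logging:
      level: debug        # illustrative; see values.yaml for accepted levels
    send_interval: 1m     # illustrative value; default not stated in these notes
    send_timeout: 1m      # new default (previously 10s)
```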

Upgrade Steps

Upgrade using the following command:

```
helm upgrade --install <RELEASE_NAME> cloudzero-beta/cloudzero-agent -n <NAMESPACE> --create-namespace -f configuration.example.yaml --version 1.0.0-beta-9
```

Bug Fixes

KSM Address: Fixes an issue in which the internal kube-state-metrics service address can be templated incorrectly.

Improvements

More Configurable Server Settings: The log level, remote write interval, and remote write timeout are now configurable in the chart values. See the insightsController.server section in the values.yaml for more details.
Default Setting for Send Timeout: The default remote write timeout is increased to 1m, which allows for backfilling data from larger clusters.
Container Image Information Added: The values passed to the internal kube-state-metrics subchart now explicitly set the container image registry, repository, and tag information for the purposes of documentation.
