Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Certain Kafka metrics may not be available #3325

Closed
calohmn opened this issue Jul 5, 2022 · 0 comments · Fixed by #3326
Closed

Certain Kafka metrics may not be available #3325

calohmn opened this issue Jul 5, 2022 · 0 comments · Fixed by #3326
Assignees
Milestone

Comments

@calohmn
Copy link
Contributor

calohmn commented Jul 5, 2022

After the Hono 2.0 update, certain Kafka metrics sometimes don't seem to be available.
This has been observed for Kafka consumer metrics that have a topic tag, for example kafka.consumer.fetch.manager.records.consumed.total.
The issue is, that for this metric, there is only one meter being registered (per client-id). That means that the number of records consumed is only available for one random topic, meaning a chart for that metric is potentially mostly empty.


Analysis:
The Micrometer KafkaMetrics class registers the meters for the Kafka metrics in the metrics registry.
In the checkAndBindMetrics() method, there is special logic to skip adding meters that have less tags than an already registered meter with the same name (PrometheusMeterRegistry only supports an equal number of tags for meters with the same name - see micrometer-metrics/micrometer#877).
This logic is needed for example for the kafka.consumer.fetch.manager.records.consumed.total metrics. These metrics are observed for each topic (i.e. containing the topic tag) and there is also one metric entry for the overall number (without the topic tag). The meter for the entry without the topic tag shall not be registered (or shall be removed during the next invocation).

This logic for checking the number of tags involves also considering the common tags, added via MeterFilters to all meters.
This list of common tags is only initialized once in KafkaMetrics.

This issue now is that since Hono has moved to using quarkus-micrometer, the initialization of these MeterFilters (done by the Quarkus MicrometerRecorder) is now potentially taking place at a later point in time.

Log output:

12:22:19,157 INFO  [org.ecl.hon.cli.kaf.met.MicrometerKafkaClientMetricsSupport] (main) activating Kafka client metrics support
12:22:19,162 INFO  [org.ecl.hon.com.app.Application] (main) deploying 2 Hono Command Router instances ...
# Kafka client already adds metric here
12:22:19,728 TRACE [org.apa.kaf.com.met.Metrics] (vert.x-eventloop-thread-4) Registered metric named MetricName [name=records-consumed-total, [...], tags={client-id=command-hono-service-command-router-556c49bd4-882kc_0c0777b13b2c_2}]

# Micrometer KafkaMetrics.checkAndBindMetrics seems to have been called here already (no logs)

12:22:20,521 DEBUG [io.qua.mic.run.MicrometerRecorder] (main) Discovered global MeterFilters : [io.micrometer.core.instrument.config.MeterFilter$9@4cfcac13, io.micrometer.core.instrument.config.MeterFilter$9@5c25d0d1, io.micrometer.core.instrument.config.MeterFilter$1@22c8ee48, io.micrometer.core.instrument.config.MeterFilter$9@7845b21a]
12:22:20,534 DEBUG [io.qua.mic.run.MicrometerRecorder] (main) Discovered MeterFilters for class io.micrometer.prometheus.PrometheusMeterRegistry: []

# Kafka client adds another metric here
12:23:20,288 TRACE [org.apa.kaf.com.met.Metrics] (vert.x-kafka-consumer-thread-2) Registered metric named MetricName [name=records-consumed-total, group=consumer-fetch-manager-metrics, description=The total number of records consumed for a topic, tags={client-id=command-hono-service-command-router-556c49bd4-882kc_0c0777b13b2c_2, topic=hono_command_thealthcheck0}]

# Micrometer KafkaMetrics.checkAndBindMetrics for this new metric must have failed (no logs)

That means that the Micrometer KafkaMetrics instance is using an empty common tags list for deciding whether a meter should be registered - if there is already an existing meter with that name. This leads to such meters not being added because the existing meter tag list (containing the common tags) is always longer than the tag list of the traversed metric with the empty common tags list. (The empty KafkaMetrics instance "common tags" list has also been observed in heap dumps here.)

Solution:
The fix would be to have kafkaClientMetrics.bindTo(meterRegistry) be invoked in Hono after Quarkus MicrometerRecorder initialization is done. That bindTo method is invoked in the Hono MicrometerKafkaClientMetricsSupport.registerKafkaConsumer() and registerKafkaProducer() methods.

@calohmn calohmn added this to the 2.1.0 milestone Jul 5, 2022
@calohmn calohmn self-assigned this Jul 5, 2022
calohmn added a commit to bosch-io/hono that referenced this issue Jul 6, 2022
…erRecorder.

Combine Kafka metrics in a MeterBinder bean so that binding of the
metrics to the MeterRegistry is done implicitly via the Quarkus
MicrometerRecorder. This ensures that the binding is done *after*
the MeterFilter beans get added to the registry. This in turn
ensures that the KafkaMetrics are aware of the commonTags
MeterFilters when being bound to the registry.

Signed-off-by: Carsten Lohmann <carsten.lohmann@bosch.io>
calohmn added a commit that referenced this issue Jul 6, 2022
Combine Kafka metrics in a MeterBinder bean so that binding of the
metrics to the MeterRegistry is done implicitly via the Quarkus
MicrometerRecorder. This ensures that the binding is done *after*
the MeterFilter beans get added to the registry. This in turn
ensures that the KafkaMetrics are aware of the commonTags
MeterFilters when being bound to the registry.

Signed-off-by: Carsten Lohmann <carsten.lohmann@bosch.io>
calohmn added a commit that referenced this issue Jul 6, 2022
Signed-off-by: Carsten Lohmann <carsten.lohmann@bosch.io>
calohmn added a commit that referenced this issue Jul 6, 2022
Combine Kafka metrics in a MeterBinder bean so that binding of the
metrics to the MeterRegistry is done implicitly via the Quarkus
MicrometerRecorder. This ensures that the binding is done *after*
the MeterFilter beans get added to the registry. This in turn
ensures that the KafkaMetrics are aware of the commonTags
MeterFilters when being bound to the registry.

Signed-off-by: Carsten Lohmann <carsten.lohmann@bosch.io>
(cherry picked from commit df05c88)
calohmn added a commit that referenced this issue Jul 6, 2022
Signed-off-by: Carsten Lohmann <carsten.lohmann@bosch.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant