-
Notifications
You must be signed in to change notification settings - Fork 814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add Exemplar support for OpenTelemetry tracing #652
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a few things here that don't jive with the OM spec.
README.md
Outdated
In order to do so, just implement the following interfaces: | ||
|
||
* `CounterExemplarSampler` | ||
* `GaugeExemplarSampler` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gauges do not support exemplars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
README.md
Outdated
* `CounterExemplarSampler` | ||
* `GaugeExemplarSampler` | ||
* `HistogramExemplarSampler` | ||
* `SummaryExemplarSampler` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summarys do not support exemplars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
private final double value; | ||
private final long timestampMs; | ||
|
||
public Exemplar(String traceId, String spanId, double value, long timestampMs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the data model of an examplar in OpenMetrics, they can have arbitrary labelsets as long as they're within the character limit. Timestamps are also optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
/** | ||
* Lazy value holder for a double value. | ||
* <p/> | ||
* Counters and Gauges track their current value using a DoubleAdder, so getting the value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would you need the value of a metric when sampling exemplars and thus care that the implementation uses a DoubleAdder? Only the size of the observation itself should matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the CounterExemplarSampler
to this
Exemplar sample(double increment, Value newTotalValue, Exemplar previous);
The default implementation uses the increment
and ignores the newTotalValue
. However, I still kept the newTotalValue
available in the API so that users can access it if needed (e.g. start sampling after the total value exceeded a threshold).
static class SystemClock implements Clock { | ||
@Override | ||
public long currentTimeMillis() { | ||
return System.currentTimeMillis(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the latency impact of calling this on every observation? This tends to be a relatively slow call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the implementation so that this is only called if OpenTelemetry tracing is available.
For the latency, I added a ExemplarsBenchmark
. Result on my local Linux laptop:
Benchmark Mode Samples Score Error Units
i.p.b.ExemplarsBenchmark.testCounter avgt 200 28.252 ± 0.312 ns/op
i.p.b.ExemplarsBenchmark.testCounterWithExemplars avgt 200 46.042 ± 0.781 ns/op
i.p.b.ExemplarsBenchmark.testCounterWithoutExemplars avgt 200 29.322 ± 0.475 ns/op
so about 17ns for the currentTimeMillis()
call.
|
||
writeTimestamp(writer, sample.timestampMs); | ||
} | ||
if (sample.exemplar != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code does not cover all potential valid exemplars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed with the new Exemplar
data model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not quite right, 0 is a valid timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, using Long
now and null
to indicate no timestamp.
/** | ||
* Immutable data class holding an Exemplar. | ||
*/ | ||
public class Exemplar { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest having the definition of Exemplar itself live within simpleclient, as that's where all of the rest of the data model is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ExemplarSampler
implementations use the Exemplar
data model, so if we move the model to simpleclient, we will need to move all ExemplarSampler
implementations to simpleclient as well. Currently it's not much, but if in the future people provide more sophisticated implementations it might be worthwhile to have this in a separate module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is that the core data model should all live in one place, not spread across N modules that have to be pieced together by someone trying to work with this. It's kinda like having an entire module just to handle timestamps.
Personally I wouldn't have the core simpleclient depending on any other modules at all, rather make this something that the user has to add an extra module to enable, rather than silently creating the possibility for dependency hell on upgrade. simpleclient thus far has purposefully avoided any dependencies for that reason, it should always be safe to have simpleclient instrumentation in place.
This is also still not the data model of an OM exemplar, as timestamps are optional.
} | ||
writer.write('\n'); | ||
} | ||
} | ||
writer.write("# EOF\n"); | ||
} | ||
|
||
private static void writeTimestamp(Writer writer, long timestampMs) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
omWriteTimestamp would avoid confusion, as this is OM specific logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
0244904
to
ef6e57f
Compare
Thanks a lot @brian-brazil for reviewing this, I appreciate it a lot. I pushed an update and I'd be happy if you could review it once more. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All public methods should have javadoc.
@@ -49,7 +49,7 @@ | |||
<dependency> | |||
<groupId>io.prometheus</groupId> | |||
<artifactId>simpleclient</artifactId> | |||
<version>0.10.1-SNAPSHOT</version> | |||
<version>${project.version}</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break some build systems, I forget which one but that's the reason all versions are explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted
@@ -0,0 +1,80 @@ | |||
package io.prometheus.benchmark; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The readme should be updated with the new numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd re-run all the benchmarks, so it's apples to apples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment still stands.
|
||
/** | ||
* @param increment the value added to the counter on this event | ||
* @param newTotalValue the new total counter value after adding the increment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would you care about this? Increments should be independent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this. Here's an example that might be a valid use case: A counter representing a buffer size. If the buffer can only grow, a counter should be a valid metric. A user might decide to sample exemplars only when the buffer has exceeded a certain size, thus the exemplar sampler needs the total value.
I'm not saying that this will be common, in most cases users won't need the total value. However, if we remove it from the API there will be no chance to use it even if there are rare cases where it makes sense.
Anyway, I don't have a strong opinion on this. If you still think removing it would be better I'm happy to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interesting thing about a counter is how quickly it goes up, the absolute value doesn't matter.
A counter representing a buffer size
That's a gauge, as you care about the absolute value.
/** | ||
* Immutable data class holding an Exemplar. | ||
*/ | ||
public class Exemplar { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is that the core data model should all live in one place, not spread across N modules that have to be pieced together by someone trying to work with this. It's kinda like having an entire module just to handle timestamps.
Personally I wouldn't have the core simpleclient depending on any other modules at all, rather make this something that the user has to add an extra module to enable, rather than silently creating the possibility for dependency hell on upgrade. simpleclient thus far has purposefully avoided any dependencies for that reason, it should always be safe to have simpleclient instrumentation in place.
This is also still not the data model of an OM exemplar, as timestamps are optional.
* @param labels name/value pairs. Expecting an even number of strings. The combined length of the label names and | ||
* values must not exceed 128 UTF-8 characters. Neither a label name nor a label value may be null. | ||
*/ | ||
public Exemplar(double value, String... labels) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing in labels should be consistent with how it's done in Info, so a map should also be supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the Exemplar
and all classes that depend on it to simpleclient
. Now only the bridges to the tracing libraries live in separate modules. I agree that it should be safe to have simpleclient instrumentation in place, and that dependencies should be avoided. I hope this is the case with the current implementation: simpleclient
does not have any runtime dependencies. OpenTelemetry is available we will load the corresponding SpanContextSupplier
, if not we won't.
We could disable exemplars by default and have the user explicitly enable them with a ExemplarConfig.enableExemplars()
call. Apart from that I don't see how to improve this. We could load SpanContextSuppliers
dynamically through a class loader, but that would yield to the same optional dependency.
I added an Exemplar
constructor with a Map
parameter to be consistent with Info
.
instance.histogramExemplarSampler = instance.noopExemplarSampler; | ||
} | ||
|
||
// Methods for internal use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they're internal, why are they public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the comment. It makes sense to have these as public methods, because a user might want to keep a copy of the original default ExemplarSampler
.
return timestampMs; | ||
} | ||
|
||
private void validateLabels(String... labels) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should also check that the label names are valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a check to make sure label names are unique.
|
||
writeTimestamp(writer, sample.timestampMs); | ||
} | ||
if (sample.exemplar != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not quite right, 0 is a valid timestamp.
I pushed another update to make sure that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may make sense for an exemplar to be provided directly at inc/observe time, as exemplars are not limited to spans/traces.
if (labels[i] == null) { | ||
throw new IllegalArgumentException("labels[" + i + "] is null"); | ||
} | ||
if (i % 2 == 0) { // label names should be unique |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't test that the label names are valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two questions:
- What else would you test apart from not
null
and uniqueness? - Do you see any drawbacks if we sort the exemplar labels by name? That would give them a defined order when the labels are passed as a
HashMap
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That the characters match the labelname regex.
Sorting is fine, but a HashMap doesn't have ordering. You'd want a TreeMap for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added sorting and the check for the label name regex.
@@ -0,0 +1,80 @@ | |||
package io.prometheus.benchmark; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd re-run all the benchmarks, so it's apples to apples.
<dependency> | ||
<groupId>io.opentelemetry</groupId> | ||
<artifactId>opentelemetry-api</artifactId> | ||
<version>${otel.version}</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment applies to all versions, avoid variables. Of course it's possible the feature has been added in the meantime, as that request was some years back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the maven wrapper script so that users have a defined version of maven (specified in .mvn/wrapper/maven-wrapper.properties
) if they call ./mvnw
instead of mvn
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment stands, you can't assume that users are using maven.
@@ -0,0 +1,18 @@ | |||
package io.prometheus.smoketest; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be under client?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved
pom.xml
Outdated
@@ -43,6 +43,7 @@ | |||
</developers> | |||
|
|||
<modules> | |||
<module>tracer_parent</module> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't follow the naming scheme of beginning with simpleclient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to simpleclient_tracer
|
||
/** | ||
* @param increment the value added to the counter on this event | ||
* @param newTotalValue the new total counter value after adding the increment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still here, it should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I removed it
/** | ||
* @param value the value to be observed. | ||
* @param bucketFrom upper boundary of the previous bucket in the histogram. | ||
* May be {@link Double#NEGATIVE_INFINITY} if there is no previous bucket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be
There's no uncertainty here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
README.md
Outdated
|
||
### Enabling And Disabling Exemplars | ||
|
||
By default, Exemplars are enabled if OpenTelemetry tracing is detected. You can disable this globally with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need updating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I re-wrote the exemplars documentation.
private static final ExemplarConfig instance = new ExemplarConfig(); | ||
|
||
private final ExemplarSampler noopExemplarSampler; | ||
private final ExemplarSampler defaultExemplarSampler; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't these be static? They aren't tied to an instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made them static.
} | ||
|
||
/** | ||
* Disable exemplars by default. Exemplars can still be enabled for individual metrics in the metric's builder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually work? By the time it is called, static initialisation of metrics will likely already be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the following to README.md
to clarify this
Note that
ExemplarConfig
is used when a metric is created, so changing the default has no effect on metrics that have already been created.
If we want API to change the ExemplarSampler
for existing metrics, we'd need to make access to the ExemplarSampler
thread safe, which would create unexpected synchronization in multi threaded user applications. I don't see a good way of doing this apart from requiring the user to call ExemplarConfig
before metrics are created.
650aa32
to
64c7c2b
Compare
Thanks again for your review. Apart from addressing your comments, I added
I described this in I did not find the time yet to re-run all benchmarks, I will do that before the merge. I tested all combinations of available / unavailable dependencies and all available OpenTelemetry Java versions. I didn't find any combination that breaks anything. If the OpenTelemetry tracer is detected, Exemplars for OpenTelemetry are enabled by default. I added a few options to disable this
I documented this in I like the idea that if users attach the OpenTelemetry Java agent to get traces, they will also see Exemplars by default. And I don't see why people would use distributed tracing with OpenTelemetry without having corresponding exemplars in their metrics. However, if you think that we should not enable exemplars by default if OpenTelemetry tracing is present, I am happy to turn this around and disable OpenTelemetry exemplars by default. |
README.md
Outdated
The following call will increment a counter, and create an exemplar with the specified `span_id` and `trace_id` labels: | ||
|
||
```java | ||
myCounter.incWithExemplar(Exemplar.SPAN_ID, "abcdef", Exemplar.TRACE_ID, "123456"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd include the literal string here, rather than pointing off to an unspecified class that you have to import to get to a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the TRACE_ID
String from Exemplar
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you forget to push? I'm not seeing a change here.
myHistogram.observeWithExemplar(0.12, Exemplar.SPAN_ID, "abcdef", Exemplar.TRACE_ID, "123456"); | ||
``` | ||
|
||
All methods for observing and incrementing values have `...withExemplar` equivalents. There are versions taking the exemplar labels as a `String...` as shown in the example, as well as versions taking the exemplar labels as a `Map<String, String>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you thought about doing this as part of inc() call? This isn't Go, function overloading is possible and normal in Java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot make it part of the inc()
call if we want to use varargs for the labels: inc(3)
adds 3
without explicitly specifying an exemplar, while incWithExemplar(3)
adds 3
and creates an exemplar without labels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone wanted an explicitly empty exemplar, they could pass an empty map.
@@ -0,0 +1,80 @@ | |||
package io.prometheus.benchmark; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment still stands.
<dependency> | ||
<groupId>io.opentelemetry</groupId> | ||
<artifactId>opentelemetry-api</artifactId> | ||
<version>${otel.version}</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment stands, you can't assume that users are using maven.
private final double value; | ||
private final Long timestampMs; | ||
|
||
private static final Pattern labelNameRegex = Pattern.compile("[a-zA-Z_]\\w*"); // \w is [a-zA-Z_0-9] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment only confuses matters, if you want to say [a-zA-Z_0-9]
then say [a-zA-Z_0-9]
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Renamed
io.prometheus.benchmark
toio.prometheus.client.benchmark
- Regarding Maven variables: I tested the current build with the oldest 3.0 and 3.1 versions available on the Apache mirror (
3.0.5
from 2013-02-23 and3.1.1
from 2013-10-04). I had to downgrade thefailsafe
plugin, but apart from that the build runs without issues with these versions. Does that address your concerns? - Changed the regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that address your concerns?
No, the problem was never maven. It was some other Java build tool that was reading the poms.
* or they can be provided explicitly for individual observations with the {@code ...withExemplar()} methods. | ||
*/ | ||
public static void disableExemplarSamplers() { | ||
enabled = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still racey, you don't know when this could be called relative to when Counters and Histograms are instantiated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a line to the JavaDoc saying that this must be called before a Counter
or Histogram
is created.
I'm not sure what the best trade-off is. From a usability perspective it would be better if this disabled Exemplars also for previously created Counter
or Histogram
metrics. However, in order to implement this we would need to make access to the exemplar sampler thread safe, which might severely slow down multi-threaded use of the metrics. Therefore I think the best option is to require the user to set the default exemplar behavior before creating metrics. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a line to the JavaDoc saying that this must be called before a Counter or Histogram is created.
That's not possible for users to do, as metrics are usually defined statically and thus users have no control over it.
However, in order to implement this we would need to make access to the exemplar sampler thread safe, which might severely slow down multi-threaded use of the metrics.
If you only want to read the current state of the sampler, that should always be fast as you don't need mutexes. It's only when mixing reads and writes that things can get a tad slow.
} | ||
} | ||
|
||
private void assertOldFormat(String line) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Old and New here is a bit confusing, I'd suggest using their actual names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed it.
metrics.contains("\n" + line)); | ||
} | ||
|
||
private String timestampString() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you creating a copy of this function in the tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the copy and used the version from TextFormat
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you forgot to push.
README.md
Outdated
The following call will increment a counter, and create an exemplar with the specified `span_id` and `trace_id` labels: | ||
|
||
```java | ||
myCounter.incWithExemplar(Exemplar.SPAN_ID, "abcdef", Exemplar.TRACE_ID, "123456"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you forget to push? I'm not seeing a change here.
Sorry, I removed the reference to
Ok, then I'll refactor this such that you can change the exemplar sampler for existing metrics. |
I pushed another update. Now changes in Regarding the variables in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks roughly right.
However, if you think there are users out there using such a tool I am happy to replace the variables with the version numbers.
As I said it's a complaint I got years ago, for which I forget the exact details. It's your call.
README.md
Outdated
|
||
### Per Observation Exemplars | ||
|
||
You can explicitly create an exemplar for an individual observation. This takes precedence over the exemplar sampler configured with the metric. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/create/provide/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
@@ -0,0 +1,10 @@ | |||
FROM openjdk:11-jre | |||
RUN mkdir -p /app/ | |||
ADD https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v1.1.0/opentelemetry-javaagent-all.jar /app/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.2.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that the agent release lags a few days behind the library release. Agent is still 1.1.0.
<artifactId>simpleclient</artifactId> | ||
<version>${project.version}</version> | ||
<exclusions> | ||
<!-- uncomment the following to verify manually that simpleclient still works if the tracers are excluded --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be another integration test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I turned this into an integration test. Instead of excluding dependencies in the pom.xml
, the new implementation of ExemplarsOtelIT
copies the individual JARs into the Docker image so that they can be removed in the test without rebuilding the Maven project.
While I was at it, I also renamed the tracer_...
sub-modules to simpleclient_tracer_...
.
/** | ||
* Disable exemplars. | ||
* <p> | ||
* This will disable exemplars for this counter even if exemplars are enabled by default in {@link ExemplarConfig}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads as though it prevents all use of exemplars, whereas it's only the automatic ones that it stops. This could be clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the wording.
assert004Format("no_labels_default_exemplar_total 3.0\n"); | ||
noLabelsDefaultExemplar.inc(2); | ||
// The TestExemplarSampler always produces a new Exemplar. | ||
// The Exemplar with value 3.0 should be replaced with an Exemplar with value 5.0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment isn't right, the exemplar value is 2. It's the counter value that is 5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
benchmark/README.md
Outdated
i.p.b.CounterBenchmark.prometheusSimpleCounterNoLabelsIncBenchmark avgt 5 13.131 ± 1.661 ns/op | ||
i.p.c.b.CounterBenchmark.codahaleCounterIncBenchmark avgt 5 8.063 ± 0.440 ns/op | ||
i.p.c.b.CounterBenchmark.codahaleMeterMarkBenchmark avgt 5 38.625 ± 2.111 ns/op | ||
i.p.c.b.CounterBenchmark.prometheusCounterChildIncBenchmark avgt 5 9.869 ± 0.252 ns/op |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incidentally, we can remove these old Java client benchmarks. They've not been relevant in many years.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean remove from README.md
, or remove the implementations as well? Are there benchmarks that are still relevant apart from the ExemplarsBenchmark
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementations for the old java client. The benchmarks are still relevant, that client isn't however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I removed the old client from the benchmarks.
I removed the obsolete benchmarks, and I bumped the OpenTelemetry agent version as 1.2.0 was released today. What do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave it a deeper review, and found a few nits.
OTEL_EXEMPLARS.md
Outdated
request_duration_histogram_bucket{path="/god-of-fire",le="0.004"} 4.0 # {trace_id="043cd631811e373e4180a678c06b128e",span_id="cd122e457d2ca5b0"} 0.0033 1618261159.027 | ||
``` | ||
|
||
Note that this is an example application for a unit test, so durations don't represent real durations, and some example metrics might not make sense in the real world. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/unit test/demonstration/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
OTEL_EXEMPLARS.md
Outdated
|
||
## Disabling OpenTelemetry Exemplars | ||
|
||
If you use OpenTelemetry tracing but don't want Exemplars, you can disable OpenTelemetries in multiple ways. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do not
Contractions are harder to deal with for non-native speakers apparently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
OTEL_EXEMPLARS.md
Outdated
### Disable OpenTelemetry Exemplars at Runtime | ||
|
||
If your application uses OpenTelemetry tracing, but you want to disable OpenTelemetry at runtime without changing code, | ||
start your application with the `otelExemplars` system property: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be namespaced a bit more, so that it's tied to this client. As-is I'd presume from the name that it was an OTel internal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefixed with io.prometheus
.
OTEL_EXEMPLARS.md
Outdated
java -DotelExemplars=inactive -jar my-application.jar | ||
``` | ||
|
||
Alternatively, you can set the environment variable `OTEL_EXEMPLARS=inactive`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend sticking with one way of doing things to avoid confusion. System properties are the way to do this in Java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the environment variable.
} | ||
OpenTelemetrySpanContextSupplier test = new OpenTelemetrySpanContextSupplier(); | ||
test.getSpanId(); | ||
test.getSpanId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be getTraceId?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point. Fixed.
} | ||
OpenTelemetryAgentSpanContextSupplier test = new OpenTelemetryAgentSpanContextSupplier(); | ||
test.getSpanId(); | ||
test.getSpanId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Thanks a lot for reviewing again. I pushed another update. What do you think? |
👍 |
602a5a0
to
603a9df
Compare
Thanks a lot for your great review! |
New Section in README:
Exemplars are a feature of the OpenMetrics format that allows applications to link metrics
to example traces. Starting with version 0.11.0,
client_java
automatically adds Exemplars for applications thatare instrumented with OpenTelemetry distributed tracing. No code change is required.
Running an Exemplars Example
If you want to see this in action, you can run the example from the
ExemplarsClientJavaIT
:Now you have a Spring REST service running on http://localhost:8080/hello that is instrumented with the OpenTelemetry Java agent.
In order to get metrics in OpenMetrics format, run
You should see metrics with Exemplars, for example in the
request_duration_histogram
metric:Note that this is an example application for a unit test, so durations don't represent real durations, and some example metrics might not make sense in the real world.
Enabling And Disabling Exemplars
By default, Exemplars are enabled if OpenTelemetry tracing is detected. You can disable this globally with:
If you want to enable Exemplars only for a single metric, use
withExemplarSampler()
in the metric builder.Likewise, you can disable Exemplars for a single metric like this:
See
ExemplarTest
for more examples.Implement Your Own Exemplar Sampling Algorithm
The
DefaultExemplarSampler
is very simple: It samples a new Exemplar if the current one is older than 7 seconds.You might want to implement your own Exemplar sampler that provides more interesting Exemplars for your application.
In order to do so, just implement the following interfaces:
CounterExemplarSampler
GaugeExemplarSampler
HistogramExemplarSampler
SummaryExemplarSampler
You can set your implementations as default like this:
Implement Support for Other Tracing Vendors
Version 0.11.0 implements support for OpenTelemetry. If you are a vendor for another distributed tracer,
please create a pull request adding support for your system. The idea is that copy-and-paste the
tracer_otel
module and modify it for your needs. If this turns out to be too simplistic, please createa GitHub issue and let us know what you need.
Implementation Notes
This PR also introduces integration tests with testcontainers, which is a framework for running tests against Docker containers. Benefits:
ExemplarsClientJavaIT
where we attach the OpenTelemetry Java agent to the system under test.JavaVersionsIT
that can be used to ensure the library is compatible with Java 6 or IBM's OpenJ9.That means
docker
must be installed on the system where you runmvn verify
.