Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling the ZipkinV2 span error tag conversion to OTLP #2214

Closed
ericmustin opened this issue Nov 25, 2020 · 8 comments
Closed

Handling the ZipkinV2 span error tag conversion to OTLP #2214

ericmustin opened this issue Nov 25, 2020 · 8 comments
Assignees
Labels
area:receiver bug Something isn't working priority:p1 High release:required-for-ga Must be resolved before GA release

Comments

@ericmustin
Copy link
Contributor

Describe the bug
This is a continuation of a discussion in the collector contrib repo issues, here open-telemetry/opentelemetry-collector-contrib#1644 (comment). To summarize:

The zipkin translation code used to rely on the error tag, see: https://github.com/open-telemetry/opentelemetry-collector/pull/1002/files#diff-e8c9e4b66e4631dea1b8be1094a17e31ec8037ba033074ce310341bbdf9860a3L495

but now relies on different criteria, depending on whether using zipkinv2 or zipkinv1 format:

  • zipkinv1 translation looks to rely on error tag: here

  • but v2 does not: here

A user is using the signalfx-nodejs-tracing client, it appears they export zipkin in v2 format according to their documentation:

The SignalFx-Tracing Library for JavaScript is a fork of the DataDog APM JavaScript Tracer that has been modified to provide Zipkin v2 JSON formatting, B3 trace propagation functionality, and properly annotated trace data for handling by SignalFx Microservices APM.

So, it would appear the issue is with either the sfx node tracing client definition of the zipkinv2 format, or the collector's zipkinv2 => internal translation logic.

Steps to reproduce
Simply send errors generated by the sfx node client to an otel collector. Here's an example client and otel config
https://gist.github.com/ericmustin/3a66bf36840aadd4ccd58ebf8e7278c7

What did you expect to see?
These should be translated to have a Span.Status() representing an error

What did you see instead?
No span status representing an error

What version did you use?
v0.15 of collector

What config did you use?
See reproduction gist

Additional context

I woud like to determine next steps here and narrow down whether:

  1. The OpenTelemetry translation of zipkinv2 spans to OTLP is semantically correct and no changes need to be made within the OpenTelemetry-Collector

  2. The OpenTelemetry translation of zipkinv2 spans to OTLP is Incorrect and needs to be changed to account for the error tag.

  3. The SignalFx-Nodejs-Tracing client is not correctly exporting traces in Zipkinv2 format

As @owais is a maintainer of both this library and the SignalFx Library, his input is greatly appreciated here, along with @rmfitzpatrick who is quite active on both libraries as well. If there are no plans to make any changes to either the collector translation logic or the vendor client as well, that is OK too but it would be good to know so that myself and @MrAlias can make any determinations around our own vendor exporter implementations in opentelemetry-collector-contrib, for how we want to handle this special case.

Hope that helps. Please ping if any Qs

@ericmustin ericmustin added the bug Something isn't working label Nov 25, 2020
@tigrannajaryan
Copy link
Member

The OpenTelemetry translation of zipkinv2 spans to OTLP is Incorrect and needs to be changed to account for the error tag.

@ericmustin I believe this ^^^^ is the right approach.

@owais please confirm that the "error" tag is indeed what SignalFx client libraries set when exporting in Zipkin format.

@owais
Copy link
Contributor

owais commented Dec 2, 2020

I can confirm. SFX node library may set the error tag to any string value to represent a failed operation. If the error tag is present, it should translate to failed status in Otel IMO.

@tigrannajaryan
Copy link
Member

@ericmustin we have the confirmation. I believe the next step is to fix zipkin to internal pdata translation to account for error tag. Will you be able to submit a fix?

@ericmustin
Copy link
Contributor Author

@tigrannajaryan sure i can give it a shot, i don't claim to be the cosmos on zipkin but i'll try to push something up in the next day or so

@ericmustin
Copy link
Contributor Author

ericmustin commented Dec 2, 2020

One question...It looks like semantic conventions to reflect this are being added in this PR, which is still open open-telemetry/opentelemetry-specification#1257 ... @tigrannajaryan am i ok to make a PR to the collector accounting for the error tag before the semantic convention PR gets merged?

@tigrannajaryan
Copy link
Member

@ericmustin you are right, let's wait for open-telemetry/opentelemetry-specification#1257 to be finalized.

@andrewhsu
Copy link
Member

from the sig mtg, assigning to eric

@tigrannajaryan
Copy link
Member

Fixed by #2253

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:receiver bug Something isn't working priority:p1 High release:required-for-ga Must be resolved before GA release
Projects
None yet
Development

No branches or pull requests

4 participants