Switch v1 collector pipeline to v2 Writer #6491

Merged
merged 12 commits into from
Jan 12, 2025

Conversation

yurishkuro
Member

@yurishkuro yurishkuro commented Jan 6, 2025

Which problem is this PR solving?

  • Part of #6487
  • Part of #6474

Description of the changes

  • Swap v1 spanWriter for v2 traceWriter in collector pipeline
  • Currently the traceWriter is provided via the v1 adapter, so it is always a v1 writer underneath
  • Since only the v1 spans entry point is currently implemented, the additional data transformations have no performance impact
  • However, as soon as the OTLP entry point is used (e.g. via the OTLP receiver), the ptrace.Traces batch will be handled by the exporterhelper queue as a single item (not broken into individual spans) and then passed directly to the writer as a batch. Since the writer is implemented via the adapter, the batch will be converted to spans and written one span at a time. There will be no additional data transformations on this path either.

How was this change tested?

  • CI

Outstanding

  • Invoking proper preprocessing, like sanitizers and collector tags, on the OTLP path
  • Adequate metrics parity, ideally same as v1 collector
  • Test coverage, including passing a v2-like (mock) writer that cannot be downgraded to v1
    • Idea: parameterize some tests (ideally those that also validate pre-processing) to execute both v1 and v2 write paths

Follow-up PRs

  • Enable v2 write path from OTLP and Zipkin receivers (they currently explicitly downgrade to v1). This will also allow adding better unit tests.

Signed-off-by: Yuri Shkuro <github@ysh.us>

codecov bot commented Jan 6, 2025

Codecov Report

Attention: Patch coverage is 86.23853% with 15 lines in your changes missing coverage. Please review.

Project coverage is 96.25%. Comparing base (97a9f06) to head (29f70eb).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
cmd/collector/app/span_processor.go 83.14% 9 Missing and 6 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #6491       +/-   ##
===========================================
+ Coverage   50.22%   96.25%   +46.02%     
===========================================
  Files         188      372      +184     
  Lines       11403    21360     +9957     
===========================================
+ Hits         5727    20559    +14832     
+ Misses       5218      610     -4608     
+ Partials      458      191      -267     
Flag Coverage Δ
badger_v1 10.65% <16.66%> (-0.03%) ⬇️
badger_v2 2.78% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v1-manual 16.55% <16.66%> (-0.03%) ⬇️
cassandra-4.x-v2-auto 2.71% <0.00%> (-0.01%) ⬇️
cassandra-4.x-v2-manual 2.71% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v1-manual 16.55% <16.66%> (-0.03%) ⬇️
cassandra-5.x-v2-auto 2.71% <0.00%> (-0.01%) ⬇️
cassandra-5.x-v2-manual 2.71% <0.00%> (-0.01%) ⬇️
elasticsearch-6.x-v1 20.23% <16.66%> (-0.02%) ⬇️
elasticsearch-7.x-v1 20.30% <16.66%> (-0.02%) ⬇️
elasticsearch-8.x-v1 20.46% <16.66%> (-0.03%) ⬇️
elasticsearch-8.x-v2 2.78% <0.00%> (-0.01%) ⬇️
grpc_v1 12.18% <16.66%> (-0.03%) ⬇️
grpc_v2 9.04% <0.00%> (-0.01%) ⬇️
kafka-3.x-v1 10.34% <16.66%> (-0.03%) ⬇️
kafka-3.x-v2 2.78% <0.00%> (-0.01%) ⬇️
memory_v2 2.78% <0.00%> (+<0.01%) ⬆️
opensearch-1.x-v1 20.35% <16.66%> (-0.03%) ⬇️
opensearch-2.x-v1 20.34% <16.66%> (-0.03%) ⬇️
opensearch-2.x-v2 2.78% <0.00%> (+<0.01%) ⬆️
tailsampling-processor 0.51% <0.00%> (-0.01%) ⬇️
unittests 95.13% <86.23%> (?)


yurishkuro and others added 7 commits January 6, 2025 17:43
Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <github@ysh.us>
@yurishkuro yurishkuro marked this pull request as ready for review January 11, 2025 21:55
@yurishkuro yurishkuro requested a review from a team as a code owner January 11, 2025 21:55
Collaborator

@mahadzaryab1 mahadzaryab1 left a comment

looks great - just a couple of questions

@@ -116,7 +117,7 @@ func TestNewCollector(t *testing.T) {
  ServiceName: "collector",
  Logger: logger,
  MetricsFactory: baseMetrics,
- SpanWriter: spanWriter,
+ TraceWriter: v1adapter.NewTraceWriter(spanWriter),
Collaborator

instead of wrapping the existing spanWriter, is it possible for spanWriter to be natively of type tracestore.Writer?

Member Author

This will have a large knock-on effect on many tests, which later inspect the data in the v1 model. I don't think it's worth it. In the follow-up PRs where we actually enable the native v2 writer in the handlers, there might be other tests for v2 code paths.

Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <github@ysh.us>
}

sp.queue.StartConsumers(sp.numWorkers, func(item queueItem) {
sp.processItemFromQueue(item)
})

err = sp.otelExporter.Start(context.Background(), sp.telset.Host)
Collaborator

@yurishkuro what's the reason we're starting a new context here rather than propagating it from upstream?

Member Author

same reason, trying to minimize the changes

@mahadzaryab1
Collaborator

we have some missing code coverage - are these code paths not testable?

@yurishkuro
Member Author

there are two forms of missing coverage, one is for errors coming from OTEL constructor, which is quite difficult to induce, and the other is from the writer errors - those will get added once we actually start using the v2 writer.

@yurishkuro yurishkuro merged commit 77c5f2a into jaegertracing:main Jan 12, 2025
53 of 54 checks passed
@yurishkuro yurishkuro deleted the process-otlp branch January 12, 2025 21:30
yurishkuro added a commit that referenced this pull request Jan 13, 2025
## Which problem is this PR solving?
- Context is lost in the process, see
#6491 (comment)

## Description of the changes
- Add Context argument to handler and span processor methods

## How was this change tested?
- CI

---------

Signed-off-by: Yuri Shkuro <github@ysh.us>
ekefan pushed a commit to ekefan/jaeger that referenced this pull request Jan 14, 2025
## Which problem is this PR solving?
- Part of jaegertracing#6487
- Part of jaegertracing#6474

## Description of the changes
- Swap v1 spanWriter for v2 traceWriter in collector pipeline
- Currently the traceWriter is provided via v1 adapter, so it's always
v1 writer underneath
- And since only v1 spans entry point is currently implemented, there is
no performance impact from additional data transformations
- However, as soon as the OTLP entry point is used (e.g. via the OTLP
receiver), the `ptrace.Traces` batch will be handled by the exporterhelper
queue as a single item (not broken into individual spans) and then
passed directly to the writer as a batch. Since the writer is
implemented via the adapter, the batch will be converted to spans and
written one span at a time. There will be no additional data
transformations on this path either.

## How was this change tested?
- CI

## Outstanding
- [x] Invoking proper preprocessing, like sanitizers and collector tags,
on the OTLP path
- [x] Adequate metrics parity, ideally same as v1 collector
- [ ] Test coverage, including passing a v2-like (mock) writer that
cannot be downgraded to v1
- Idea: parameterize some tests (ideally those that also validate
pre-processing) to execute both v1 and v2 write paths

## Follow-up PRs
* Enable v2 write path from OTLP and Zipkin receivers (they currently
explicitly downgrade to v1). This will also allow adding better unit
tests.

---------

Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>