Switch v1 collector pipeline to v2 Writer #6491

yurishkuro · 2025-01-06T04:49:48Z

Which problem is this PR solving?

Part of Use OTEL exporterhelper to implement a 2nd queue in v1 collector for OTLP data #6487
Part of Optimizing the write path for mixed storage v1/v2 state #6474

Description of the changes

- Swap v1 spanWriter for v2 traceWriter in the collector pipeline.
- Currently the traceWriter is provided via the v1 adapter, so it is always a v1 writer underneath.
- Since only the v1 spans entry point is currently implemented, there is no performance impact from additional data transformations.
- However, as soon as the OTLP entry point is utilized (e.g. via the OTLP receiver), the `ptrace.Traces` batch will be handled by the exporterhelper queue as a single item (not broken into individual spans) and then passed directly to the writer as a batch. Since the writer is implemented via the adapter, the batch will be converted to spans and written one span at a time. There will be no additional data transformations on this path either.

How was this change tested?

CI

Outstanding

- [x] Invoking proper preprocessing, like sanitizers and collector tags, on the OTLP path
- [x] Adequate metrics parity, ideally same as v1 collector
- [ ] Test coverage, including passing a v2-like (mock) writer that cannot be downgraded to v1
  - Idea: parameterize some tests (ideally those that also validate pre-processing) to execute both v1 and v2 write paths
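
One possible shape for the parameterization idea is sketched below. Everything here is hypothetical and simplified: `v1Backed`, `nativeV2`, `downgrade`, and `runCase` are names made up for this example, not the collector's real types or helpers. In an actual test file each case would more likely run under `t.Run`; a plain loop is used here only to keep the sketch runnable as a standalone program.

```go
package main

import (
	"context"
	"fmt"
)

// Span is a simplified stand-in for a span (hypothetical).
type Span struct{ Name string }

// SpanWriter mimics a v1 writer; TraceWriter mimics a v2 writer.
type SpanWriter interface {
	WriteSpan(ctx context.Context, span *Span) error
}

type TraceWriter interface {
	WriteTraces(ctx context.Context, spans []Span) error
}

// v1Backed wraps a v1 writer and can be unwrapped ("downgraded") again.
type v1Backed struct{ inner SpanWriter }

func (a *v1Backed) WriteTraces(ctx context.Context, spans []Span) error {
	for i := range spans {
		if err := a.inner.WriteSpan(ctx, &spans[i]); err != nil {
			return err
		}
	}
	return nil
}

// recorder is a toy v1 writer that counts spans.
type recorder struct{ count int }

func (r *recorder) WriteSpan(context.Context, *Span) error { r.count++; return nil }

// nativeV2 implements only TraceWriter, so it cannot be downgraded to v1.
type nativeV2 struct{ count int }

func (n *nativeV2) WriteTraces(_ context.Context, spans []Span) error {
	n.count += len(spans)
	return nil
}

// downgrade reports whether w is backed by a v1 writer and returns it if so.
func downgrade(w TraceWriter) (SpanWriter, bool) {
	a, ok := w.(*v1Backed)
	if !ok {
		return nil, false
	}
	return a.inner, true
}

// writePathCase describes one write path to exercise with the same assertions.
type writePathCase struct {
	name    string
	writer  TraceWriter
	written func() int // number of spans that reached the underlying store
	wantV1  bool       // whether the writer is expected to be downgradable
}

func writePathCases() []writePathCase {
	v1 := &recorder{}
	v2 := &nativeV2{}
	return []writePathCase{
		{name: "v1 via adapter", writer: &v1Backed{inner: v1}, written: func() int { return v1.count }, wantV1: true},
		{name: "native v2 mock", writer: v2, written: func() int { return v2.count }, wantV1: false},
	}
}

// runCase applies identical checks regardless of which write path is in use.
func runCase(tc writePathCase) error {
	if _, ok := downgrade(tc.writer); ok != tc.wantV1 {
		return fmt.Errorf("%s: downgrade=%v, want %v", tc.name, ok, tc.wantV1)
	}
	spans := []Span{{Name: "a"}, {Name: "b"}, {Name: "c"}}
	if err := tc.writer.WriteTraces(context.Background(), spans); err != nil {
		return fmt.Errorf("%s: %w", tc.name, err)
	}
	if got := tc.written(); got != len(spans) {
		return fmt.Errorf("%s: wrote %d spans, want %d", tc.name, got, len(spans))
	}
	return nil
}

func main() {
	for _, tc := range writePathCases() {
		if err := runCase(tc); err != nil {
			fmt.Println("FAIL", err)
			continue
		}
		fmt.Println("ok  ", tc.name)
	}
}
```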

Follow-up PRs

Enable v2 write path from OTLP and Zipkin receivers (they currently explicitly downgrade to v1). This will also allow adding better unit tests.

Signed-off-by: Yuri Shkuro <github@ysh.us>

codecov · 2025-01-06T04:56:39Z

Codecov Report

Attention: Patch coverage is 86.23853% with 15 lines in your changes missing coverage. Please review.

Project coverage is 96.25%. Comparing base (97a9f06) to head (29f70eb).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
cmd/collector/app/span_processor.go	83.14%	9 Missing and 6 partials ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #6491       +/-   ##
===========================================
+ Coverage   50.22%   96.25%   +46.02%     
===========================================
  Files         188      372      +184     
  Lines       11403    21360     +9957     
===========================================
+ Hits         5727    20559    +14832     
+ Misses       5218      610     -4608     
+ Partials      458      191      -267

Flag	Coverage Δ
badger_v1	`10.65% <16.66%> (-0.03%)`	⬇️
badger_v2	`2.78% <0.00%> (-0.01%)`	⬇️
cassandra-4.x-v1-manual	`16.55% <16.66%> (-0.03%)`	⬇️
cassandra-4.x-v2-auto	`2.71% <0.00%> (-0.01%)`	⬇️
cassandra-4.x-v2-manual	`2.71% <0.00%> (-0.01%)`	⬇️
cassandra-5.x-v1-manual	`16.55% <16.66%> (-0.03%)`	⬇️
cassandra-5.x-v2-auto	`2.71% <0.00%> (-0.01%)`	⬇️
cassandra-5.x-v2-manual	`2.71% <0.00%> (-0.01%)`	⬇️
elasticsearch-6.x-v1	`20.23% <16.66%> (-0.02%)`	⬇️
elasticsearch-7.x-v1	`20.30% <16.66%> (-0.02%)`	⬇️
elasticsearch-8.x-v1	`20.46% <16.66%> (-0.03%)`	⬇️
elasticsearch-8.x-v2	`2.78% <0.00%> (-0.01%)`	⬇️
grpc_v1	`12.18% <16.66%> (-0.03%)`	⬇️
grpc_v2	`9.04% <0.00%> (-0.01%)`	⬇️
kafka-3.x-v1	`10.34% <16.66%> (-0.03%)`	⬇️
kafka-3.x-v2	`2.78% <0.00%> (-0.01%)`	⬇️
memory_v2	`2.78% <0.00%> (+<0.01%)`	⬆️
opensearch-1.x-v1	`20.35% <16.66%> (-0.03%)`	⬇️
opensearch-2.x-v1	`20.34% <16.66%> (-0.03%)`	⬇️
opensearch-2.x-v2	`2.78% <0.00%> (+<0.01%)`	⬆️
tailsampling-processor	`0.51% <0.00%> (-0.01%)`	⬇️
unittests	`95.13% <86.23%> (?)`


mahadzaryab1

looks great - just a couple of questions

mahadzaryab1 · 2025-01-12T17:15:01Z

cmd/collector/app/collector_test.go

@@ -116,7 +117,7 @@ func TestNewCollector(t *testing.T) {
 		ServiceName:      "collector",
 		Logger:           logger,
 		MetricsFactory:   baseMetrics,
-		SpanWriter:       spanWriter,
+		TraceWriter:      v1adapter.NewTraceWriter(spanWriter),


instead of wrapping the existing spanWriter, is it possible for spanWriter to be natively of type tracestore.Writer?

This will have a large knock-on effect on many tests, which later introspect the data in the v1 model. I don't think it's worth it. In the follow-up PRs where we actually enable the native v2 writer in the handlers there might be other tests for v2 code paths.

cmd/collector/app/metrics.go

storage_v2/v1adapter/writer.go

Signed-off-by: Yuri Shkuro <github@ysh.us>

mahadzaryab1 · 2025-01-12T20:22:36Z

cmd/collector/app/span_processor.go

 	}

 	sp.queue.StartConsumers(sp.numWorkers, func(item queueItem) {
 		sp.processItemFromQueue(item)
 	})

+	err = sp.otelExporter.Start(context.Background(), sp.telset.Host)


@yurishkuro what's the reason we're starting a new context here rather than propagating it from upstream?

same reason, trying to minimize the changes

mahadzaryab1 · 2025-01-12T20:24:06Z

we have some missing code coverage - are these code paths not testable?

yurishkuro · 2025-01-12T21:30:21Z

there are two forms of missing coverage, one is for errors coming from OTEL constructor, which is quite difficult to induce, and the other is from the writer errors - those will get added once we actually start using the v2 writer.

## Which problem is this PR solving?
- Context is lost in the process, see #6491 (comment)

## Description of the changes
- Add Context argument to handler and span processor methods

## How was this change tested?
- CI

---------

Signed-off-by: Yuri Shkuro <github@ysh.us>


Switch v1 collector pipeline to v2 Writer

7352257

Signed-off-by: Yuri Shkuro <github@ysh.us>

yurishkuro added the changelog:bugfix-or-minor-feature label Jan 6, 2025

yurishkuro and others added 7 commits January 6, 2025 17:43

Merge branch 'main' into process-otlp

280cab1

add-tests

324886e

Signed-off-by: Yuri Shkuro <github@ysh.us>

fix

26a8574

Signed-off-by: Yuri Shkuro <github@ysh.us>

fix

98be3d7

Signed-off-by: Yuri Shkuro <github@ysh.us>

Merge branch 'main' into process-otlp

4fd6e3c

Signed-off-by: Yuri Shkuro <yurishkuro@users.noreply.github.com>

Add metrics

6d5f5cf

Signed-off-by: Yuri Shkuro <github@ysh.us>

fix

69d38d0

Signed-off-by: Yuri Shkuro <github@ysh.us>

yurishkuro marked this pull request as ready for review January 11, 2025 21:55

yurishkuro requested a review from a team as a code owner January 11, 2025 21:55

yurishkuro requested a review from joe-elliott January 11, 2025 21:55

dosubot bot added area/storage v2 labels Jan 11, 2025

yurishkuro added 2 commits January 11, 2025 20:28

Merge branch 'main' into process-otlp

d17b6a9

Merge branch 'main' into process-otlp

35072c9

mahadzaryab1 reviewed Jan 12, 2025

yurishkuro added 2 commits January 12, 2025 14:18

add test

942899c

Signed-off-by: Yuri Shkuro <github@ysh.us>

use constant

29f70eb

Signed-off-by: Yuri Shkuro <github@ysh.us>

mahadzaryab1 reviewed Jan 12, 2025

mahadzaryab1 approved these changes Jan 12, 2025

yurishkuro merged commit 77c5f2a into jaegertracing:main Jan 12, 2025
53 of 54 checks passed

yurishkuro deleted the process-otlp branch January 12, 2025 21:30

yurishkuro mentioned this pull request Jan 12, 2025

Pass Context through span processors #6534

Merged
