Reduce num default tvu threads from 8 to 1 #5134
Refresh of #998
Problem
We currently create 8 threads that solely try to pull packets out of the sockets associated with the turbine port. Multiple threads were added to mitigate buffer receive errors. With software improvements, including the use of recvmmsg, 8 threads is overkill; a single thread can read from this port plenty fast.
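As a rough illustration of why one thread can keep up: a batched receive such as recvmmsg pulls many packets out of the socket per syscall, so the reader spends far less time in syscall overhead per packet. The sketch below approximates that behavior with portable std calls; `recv_batch`, the port number, and the buffer sizing are illustrative stand-ins, not agave's actual streamer code.

```rust
use std::io;
use std::net::UdpSocket;

// Hypothetical stand-in for a recvmmsg-style batched receive; agave's real
// implementation lives in its streamer crate and differs in detail.
fn recv_batch(socket: &UdpSocket, bufs: &mut [[u8; 1232]]) -> io::Result<usize> {
    // Block for the first packet, then drain whatever else is already queued
    // without blocking -- approximating recvmmsg's many-packets-per-syscall
    // behavior with portable std calls.
    socket.set_nonblocking(false)?;
    socket.recv_from(&mut bufs[0])?;
    socket.set_nonblocking(true)?;
    let mut count = 1;
    for buf in bufs[1..].iter_mut() {
        match socket.recv_from(buf) {
            Ok(_) => count += 1,
            Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => break,
            Err(e) => return Err(e),
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    let socket = UdpSocket::bind("0.0.0.0:8002")?; // illustrative port only
    let mut bufs = [[0u8; 1232]; 64]; // 1232 bytes = max UDP packet data size
    loop {
        // A single reader thread draining up to 64 packets per pass.
        let n = recv_batch(&socket, &mut bufs)?;
        let _batch = &bufs[..n]; // hand the batch to shred sigverify
    }
}
```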
Summary of Changes
The value was already configurable with a hidden CLI arg; simply decrease the default from 8 to 1:

agave/validator/src/cli/thread_args.rs, line 308 at 18b49da
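For context, a hypothetical sketch of what a hidden clap arg with the new default could look like; the arg name and structure here are illustrative, not the actual contents of `thread_args.rs`:

```rust
use clap::{Arg, Command};

fn main() {
    // Hypothetical sketch of a hidden thread-count arg; the real definition
    // in validator/src/cli/thread_args.rs differs in detail.
    let matches = Command::new("agave-validator")
        .arg(
            Arg::new("tvu_receive_threads")
                .long("tvu-receive-threads")
                .hide(true) // hidden: not shown in --help
                .value_parser(clap::value_parser!(usize))
                .default_value("1"), // previously "8"
        )
        .get_matches();
    let threads: usize = *matches.get_one("tvu_receive_threads").unwrap();
    println!("tvu receive threads: {threads}");
}
```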
Testing
For a basic sanity check, I ran `bench-streamer`. With the default settings of 4 producers / 1 receiver, I see that the receiver can pull > 900k packets / second.

With this known, I then set up my node to generate additional load against itself on the TVU port. Since we're only exercising the node's ability to pull packets out of the socket buffer, I crafted the packets such that the shred sigverify pipeline would throw them out before doing an actual sigverify. The graph (not reproduced here) shows the following:
- `shred_sigverify.num_packets` - I divided by two to get packets / second (2 second metric interval)
- `shred_sigverify.num_discards_pre` - divided by two again
- `net-stats-validator.rcvbuf_errors_delta` - I multiplied by 100k

So, my node is receiving ~375k packets per second at this port with 0 dropped packets. The max number of unique shreds per second can be derived from the max number of shreds per block; see the sketch below:
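A back-of-the-envelope version of that bound, under my own assumptions (32_768 max data shreds per slot, roughly as many coding shreds, 400 ms target slot time):

```rust
// Back-of-the-envelope bound; the constants are my assumptions, not quoted
// from this PR.
const MAX_DATA_SHREDS_PER_SLOT: u64 = 32_768;
const SLOTS_PER_SECOND: f64 = 2.5; // 400 ms target slot time

fn main() {
    let max_shreds_per_slot = 2 * MAX_DATA_SHREDS_PER_SLOT; // data + coding
    let max_shreds_per_second = max_shreds_per_slot as f64 * SLOTS_PER_SECOND;
    // ~163,840 unique shreds / second, so the measured ~375k packets / second
    // with zero drops clears this bound with plenty of headroom.
    println!("max unique shreds / second: {max_shreds_per_second}");
}
```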
My guess is that the node can handle more, too; I'll push it a bit harder tomorrow. Lastly, it should be noted that I'm doing the load gen on the same machine, so the load gen is "stealing resources" from the validator in some sense.
Performance Gains
Going from 8 threads to 1 is obviously a win if performance stays flat; improving perf on top of that is an added win.
Shred Sigverify
At the top of the funnel, I see a reduction in mean `shred_sigverify.elapsed_micros` (graph around 2025/03/04 06:00 not reproduced here).

The blue node was spending less time here before the purple node got this branch; now the purple node is comparable to blue. The blue node serves as a nice comparison point, but in raw numbers, this looks like a 15-20% improvement with this branch.
Shred Insertion
I see a drop in the total amount of time spent in shred insertion; this is because we're now calling `Blockstore::insert_shreds()` fewer times (with more shreds per call), so we pay the per-call overhead less often (a toy cost model of this is sketched below). My node is unstaked and thus not sending shreds to anyone, but there might be some gains there too. There are also some other minor residual gains, like in `WindowService`, but that is shrinking an already small number.
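As a toy illustration of the batching effect, assume each call pays a fixed overhead plus a per-shred cost; both numbers below are invented purely for illustration, not measured from `Blockstore`:

```rust
// Toy model: each insert_shreds() call pays a fixed overhead (locks, column
// writes, metadata bookkeeping) plus a per-shred cost. Illustrative numbers.
fn insertion_cost_us(total_shreds: u64, shreds_per_call: u64) -> u64 {
    const CALL_OVERHEAD_US: u64 = 50; // hypothetical fixed cost per call
    const PER_SHRED_US: u64 = 2; // hypothetical marginal cost per shred
    let calls = total_shreds.div_ceil(shreds_per_call);
    calls * CALL_OVERHEAD_US + total_shreds * PER_SHRED_US
}

fn main() {
    // Many receiver threads -> many small batches; one thread -> fewer,
    // larger batches, so the fixed overhead is paid far less often.
    println!("batches of 8:  {} us", insertion_cost_us(64_000, 8));
    println!("batches of 64: {} us", insertion_cost_us(64_000, 64));
}
```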