Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add preHandler hook to fastify #6210

Merged
merged 1 commit into from
Dec 21, 2023
Merged

Conversation

twoeths
Copy link
Contributor

@twoeths twoeths commented Dec 20, 2023

Motivation

There are some delays publishing blocks, in the example below the publishBlockV2 comes first but get executed the last:

Dec-16 00:53:13.845[rest]            ^[[34mdebug^[[39m: Req req-adfi x.x.x.x publishBlockV2
Dec-16 00:53:15.025[rest]            ^[[34mdebug^[[39m: Req req-adfj x.x.x.x produceAttestationData
Dec-16 00:53:15.025[rest]            ^[[34mdebug^[[39m: Res req-adfj produceAttestationData - 200
Dec-16 00:53:15.029[rest]            ^[[34mdebug^[[39m: Req req-adfk x.x.x.x produceAttestationData
Dec-16 00:53:15.029[rest]            ^[[34mdebug^[[39m: Res req-adfk produceAttestationData - 200
Dec-16 00:53:15.040[rest]            ^[[34mdebug^[[39m: Req req-adfl x.x.x.x produceAttestationData
Dec-16 00:53:15.041[rest]            ^[[34mdebug^[[39m: Res req-adfl produceAttestationData - 200
Dec-16 00:53:15.077[rest]            ^[[34mdebug^[[39m: Req req-adfm x.x.x.x submitPoolAttestations
Dec-16 00:53:15.097[rest]            ^[[34mdebug^[[39m: Req req-adfn x.x.x.x submitPoolAttestations
Dec-16 00:53:15.115[rest]            ^[[34mdebug^[[39m: Req req-adfo x.x.x.x submitPoolAttestations
Dec-16 00:53:15.213[rest]            ^[[34mdebug^[[39m: Res req-adfo submitPoolAttestations - 200
Dec-16 00:53:15.224[rest]            ^[[34mdebug^[[39m: Res req-adfn submitPoolAttestations - 200
Dec-16 00:53:15.231[rest]            ^[[34mdebug^[[39m: Res req-adfm submitPoolAttestations - 200
Dec-16 00:53:15.394[chain]           ^[[34mdebug^[[39m: Consensus validated while publishing block broadcastValidation=consensus, blockRoot=0x85a5568c31207c70baa001f56fb5cba0e5b9ff1c29492887d798f0a8d0249b0d, bodyRoot=0xa9e3ba8bfb722730c9578a333b97e97e9a73951708262cb470febf2f617dcef0, blockLocallyProduced=true, slot=7988664

I suspect the delay could come from vc - bn latency posting the payload as publishBlockV2 is the only api with big request body

this is also caught by getResponseTime() api of fastify

Screenshot 2023-12-20 at 19 54 43

Description

  • Add preHandler hook to fastify to make it clear the delay comes from vc-bn latency or the api handler itself (which contains deserializing the request body + api impl). This hook is called right before our api handler. Log looks like:
Dec-20 08:37:38.762[rest]            debug: Req req-bu 127.0.0.1 submitPoolAttestations
Dec-20 08:37:38.762[rest]            debug: Exec req-bu 127.0.0.1 submitPoolAttestations
Dec-20 08:37:38.932[rest]            debug: Res req-bu submitPoolAttestations - 200

this should help us separate streaming time + fastify steps from our own handler time

@twoeths twoeths requested a review from a team as a code owner December 20, 2023 08:54
Copy link

codecov bot commented Dec 20, 2023

Codecov Report

Merging #6210 (643d93f) into unstable (6d1dc61) will not change coverage.
The diff coverage is n/a.

Additional details and impacted files
@@            Coverage Diff            @@
##           unstable    #6210   +/-   ##
=========================================
  Coverage     85.63%   85.63%           
=========================================
  Files           172      172           
  Lines         14372    14372           
  Branches        880      880           
=========================================
  Hits          12308    12308           
  Misses         2041     2041           
  Partials         23       23           

@nflaig
Copy link
Member

nflaig commented Dec 20, 2023

to make it clear the delay comes from vc-bn latency or the api handler itself

Can't we see the latency between vc-bn already if we look at the Req log from onRequest handler? Or do you suspect that the body is not fully streamed when this is called?

At least in their docs there is a note about the body being undefined, I am not sure though if this implies that it might not yet be fully streamed

Notice: in the onRequest hook, request.body will always be undefined, because the body parsing happens before the preValidation hook.

In the preParsing hook we know for sure though that it is streamed as the docs state you can modify it

If you are using the preParsing hook, you can transform the request payload stream before it is parsed. It receives the request and reply objects as other hooks, and a stream with the current request payload.

Depends on what we wanna measure I think preHandler hook it correct but it does not add latency information but as I understand it would give us information how much time it took on the fastify side to do Parsing and Validation internally.

Their docs has a great diagram to show the request lifecycle

image

Copy link
Contributor

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 682e612 Previous: 6d1dc61 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 582.48 us/op 477.06 us/op 1.22
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 124.12 us/op 85.441 us/op 1.45
BLS verify - blst-native 1.3818 ms/op 1.3061 ms/op 1.06
BLS verifyMultipleSignatures 3 - blst-native 2.8925 ms/op 2.7523 ms/op 1.05
BLS verifyMultipleSignatures 8 - blst-native 6.4597 ms/op 6.1292 ms/op 1.05
BLS verifyMultipleSignatures 32 - blst-native 23.409 ms/op 22.356 ms/op 1.05
BLS verifyMultipleSignatures 64 - blst-native 45.997 ms/op 44.101 ms/op 1.04
BLS verifyMultipleSignatures 128 - blst-native 90.639 ms/op 86.021 ms/op 1.05
BLS deserializing 10000 signatures 967.93 ms/op 942.18 ms/op 1.03
BLS deserializing 100000 signatures 9.3176 s/op 9.4869 s/op 0.98
BLS verifyMultipleSignatures - same message - 3 - blst-native 1.4096 ms/op 1.3654 ms/op 1.03
BLS verifyMultipleSignatures - same message - 8 - blst-native 1.5688 ms/op 1.5460 ms/op 1.01
BLS verifyMultipleSignatures - same message - 32 - blst-native 2.5444 ms/op 3.4229 ms/op 0.74
BLS verifyMultipleSignatures - same message - 64 - blst-native 3.6117 ms/op 3.6201 ms/op 1.00
BLS verifyMultipleSignatures - same message - 128 - blst-native 5.8437 ms/op 8.2203 ms/op 0.71
BLS aggregatePubkeys 32 - blst-native 26.885 us/op 28.839 us/op 0.93
BLS aggregatePubkeys 128 - blst-native 104.96 us/op 106.33 us/op 0.99
getAttestationsForBlock 59.466 ms/op 63.800 ms/op 0.93
getSlashingsAndExits - default max 309.98 us/op 184.97 us/op 1.68
getSlashingsAndExits - 2k 529.27 us/op 609.71 us/op 0.87
proposeBlockBody type=full, size=empty 7.4657 ms/op 5.1596 ms/op 1.45
isKnown best case - 1 super set check 598.00 ns/op 322.00 ns/op 1.86
isKnown normal case - 2 super set checks 634.00 ns/op 320.00 ns/op 1.98
isKnown worse case - 16 super set checks 645.00 ns/op 314.00 ns/op 2.05
CheckpointStateCache - add get delete 6.4320 us/op 6.2910 us/op 1.02
validate api signedAggregateAndProof - struct 2.9043 ms/op 2.8432 ms/op 1.02
validate gossip signedAggregateAndProof - struct 2.9669 ms/op 2.8490 ms/op 1.04
validate gossip attestation - vc 640000 1.4485 ms/op 1.4318 ms/op 1.01
batch validate gossip attestation - vc 640000 - chunk 32 175.48 us/op 174.64 us/op 1.00
batch validate gossip attestation - vc 640000 - chunk 64 153.38 us/op 187.19 us/op 0.82
batch validate gossip attestation - vc 640000 - chunk 128 152.98 us/op 163.44 us/op 0.94
batch validate gossip attestation - vc 640000 - chunk 256 145.80 us/op 142.80 us/op 1.02
pickEth1Vote - no votes 1.3920 ms/op 1.3629 ms/op 1.02
pickEth1Vote - max votes 12.261 ms/op 12.663 ms/op 0.97
pickEth1Vote - Eth1Data hashTreeRoot value x2048 20.628 ms/op 22.618 ms/op 0.91
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 32.581 ms/op 28.098 ms/op 1.16
pickEth1Vote - Eth1Data fastSerialize value x2048 709.67 us/op 773.62 us/op 0.92
pickEth1Vote - Eth1Data fastSerialize tree x2048 8.9347 ms/op 5.1535 ms/op 1.73
bytes32 toHexString 753.00 ns/op 622.00 ns/op 1.21
bytes32 Buffer.toString(hex) 307.00 ns/op 308.00 ns/op 1.00
bytes32 Buffer.toString(hex) from Uint8Array 520.00 ns/op 537.00 ns/op 0.97
bytes32 Buffer.toString(hex) + 0x 311.00 ns/op 323.00 ns/op 0.96
Object access 1 prop 0.21200 ns/op 0.19200 ns/op 1.10
Map access 1 prop 0.16200 ns/op 0.14600 ns/op 1.11
Object get x1000 7.8870 ns/op 7.0620 ns/op 1.12
Map get x1000 0.89400 ns/op 0.83900 ns/op 1.07
Object set x1000 74.734 ns/op 61.327 ns/op 1.22
Map set x1000 54.321 ns/op 50.037 ns/op 1.09
Return object 10000 times 0.25700 ns/op 0.24570 ns/op 1.05
Throw Error 10000 times 4.0101 us/op 4.1298 us/op 0.97
fastMsgIdFn sha256 / 200 bytes 3.4830 us/op 3.4300 us/op 1.02
fastMsgIdFn h32 xxhash / 200 bytes 369.00 ns/op 341.00 ns/op 1.08
fastMsgIdFn h64 xxhash / 200 bytes 414.00 ns/op 384.00 ns/op 1.08
fastMsgIdFn sha256 / 1000 bytes 11.959 us/op 11.927 us/op 1.00
fastMsgIdFn h32 xxhash / 1000 bytes 515.00 ns/op 475.00 ns/op 1.08
fastMsgIdFn h64 xxhash / 1000 bytes 491.00 ns/op 468.00 ns/op 1.05
fastMsgIdFn sha256 / 10000 bytes 108.84 us/op 105.40 us/op 1.03
fastMsgIdFn h32 xxhash / 10000 bytes 2.1230 us/op 1.9800 us/op 1.07
fastMsgIdFn h64 xxhash / 10000 bytes 1.4680 us/op 1.3670 us/op 1.07
send data - 1000 256B messages 22.588 ms/op 20.883 ms/op 1.08
send data - 1000 512B messages 31.194 ms/op 28.669 ms/op 1.09
send data - 1000 1024B messages 45.264 ms/op 43.133 ms/op 1.05
send data - 1000 1200B messages 43.207 ms/op 39.017 ms/op 1.11
send data - 1000 2048B messages 50.318 ms/op 46.233 ms/op 1.09
send data - 1000 4096B messages 40.913 ms/op 42.571 ms/op 0.96
send data - 1000 16384B messages 127.34 ms/op 125.18 ms/op 1.02
send data - 1000 65536B messages 598.34 ms/op 471.39 ms/op 1.27
enrSubnets - fastDeserialize 64 bits 1.7040 us/op 1.2250 us/op 1.39
enrSubnets - ssz BitVector 64 bits 572.00 ns/op 406.00 ns/op 1.41
enrSubnets - fastDeserialize 4 bits 256.00 ns/op 161.00 ns/op 1.59
enrSubnets - ssz BitVector 4 bits 587.00 ns/op 409.00 ns/op 1.44
prioritizePeers score -10:0 att 32-0.1 sync 2-0 128.70 us/op 101.50 us/op 1.27
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 162.62 us/op 126.45 us/op 1.29
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 203.36 us/op 164.05 us/op 1.24
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 360.78 us/op 301.89 us/op 1.20
prioritizePeers score 0:0 att 64-1 sync 4-1 387.28 us/op 374.16 us/op 1.04
array of 16000 items push then shift 1.7319 us/op 1.6712 us/op 1.04
LinkedList of 16000 items push then shift 9.9640 ns/op 8.9480 ns/op 1.11
array of 16000 items push then pop 119.00 ns/op 102.48 ns/op 1.16
LinkedList of 16000 items push then pop 9.6010 ns/op 9.0580 ns/op 1.06
array of 24000 items push then shift 2.7846 us/op 2.4286 us/op 1.15
LinkedList of 24000 items push then shift 9.8570 ns/op 9.1320 ns/op 1.08
array of 24000 items push then pop 169.05 ns/op 139.60 ns/op 1.21
LinkedList of 24000 items push then pop 9.3660 ns/op 8.8290 ns/op 1.06
intersect bitArray bitLen 8 6.8540 ns/op 6.7790 ns/op 1.01
intersect array and set length 8 89.991 ns/op 65.353 ns/op 1.38
intersect bitArray bitLen 128 35.804 ns/op 35.439 ns/op 1.01
intersect array and set length 128 1.1351 us/op 876.97 ns/op 1.29
bitArray.getTrueBitIndexes() bitLen 128 1.7580 us/op 1.4800 us/op 1.19
bitArray.getTrueBitIndexes() bitLen 248 3.1560 us/op 2.6880 us/op 1.17
bitArray.getTrueBitIndexes() bitLen 512 6.5350 us/op 4.9400 us/op 1.32
Buffer.concat 32 items 1.0800 us/op 958.00 ns/op 1.13
Uint8Array.set 32 items 2.0990 us/op 1.7480 us/op 1.20
Set add up to 64 items then delete first 5.3215 us/op 4.3900 us/op 1.21
OrderedSet add up to 64 items then delete first 6.8080 us/op 5.4993 us/op 1.24
Set add up to 64 items then delete last 5.5471 us/op 4.6051 us/op 1.20
OrderedSet add up to 64 items then delete last 7.3912 us/op 5.7723 us/op 1.28
Set add up to 64 items then delete middle 5.6061 us/op 4.6056 us/op 1.22
OrderedSet add up to 64 items then delete middle 8.9238 us/op 7.6316 us/op 1.17
Set add up to 128 items then delete first 11.123 us/op 9.3975 us/op 1.18
OrderedSet add up to 128 items then delete first 15.724 us/op 12.052 us/op 1.30
Set add up to 128 items then delete last 11.228 us/op 9.0872 us/op 1.24
OrderedSet add up to 128 items then delete last 14.899 us/op 11.612 us/op 1.28
Set add up to 128 items then delete middle 11.109 us/op 9.0175 us/op 1.23
OrderedSet add up to 128 items then delete middle 20.741 us/op 17.378 us/op 1.19
Set add up to 256 items then delete first 22.976 us/op 18.308 us/op 1.25
OrderedSet add up to 256 items then delete first 32.249 us/op 25.208 us/op 1.28
Set add up to 256 items then delete last 22.555 us/op 18.327 us/op 1.23
OrderedSet add up to 256 items then delete last 30.537 us/op 24.033 us/op 1.27
Set add up to 256 items then delete middle 22.701 us/op 18.729 us/op 1.21
OrderedSet add up to 256 items then delete middle 54.737 us/op 46.639 us/op 1.17
transfer serialized Status (84 B) 2.0650 us/op 1.8300 us/op 1.13
copy serialized Status (84 B) 1.8030 us/op 1.5300 us/op 1.18
transfer serialized SignedVoluntaryExit (112 B) 2.0300 us/op 1.8290 us/op 1.11
copy serialized SignedVoluntaryExit (112 B) 1.7590 us/op 1.4770 us/op 1.19
transfer serialized ProposerSlashing (416 B) 2.4820 us/op 1.9540 us/op 1.27
copy serialized ProposerSlashing (416 B) 2.5230 us/op 1.8490 us/op 1.36
transfer serialized Attestation (485 B) 3.4820 us/op 2.7240 us/op 1.28
copy serialized Attestation (485 B) 3.4390 us/op 1.9560 us/op 1.76
transfer serialized AttesterSlashing (33232 B) 3.6750 us/op 2.2170 us/op 1.66
copy serialized AttesterSlashing (33232 B) 10.241 us/op 5.6350 us/op 1.82
transfer serialized Small SignedBeaconBlock (128000 B) 3.6440 us/op 2.5720 us/op 1.42
copy serialized Small SignedBeaconBlock (128000 B) 26.298 us/op 17.332 us/op 1.52
transfer serialized Avg SignedBeaconBlock (200000 B) 3.5310 us/op 2.9370 us/op 1.20
copy serialized Avg SignedBeaconBlock (200000 B) 41.503 us/op 22.174 us/op 1.87
transfer serialized BlobsSidecar (524380 B) 4.5280 us/op 2.9330 us/op 1.54
copy serialized BlobsSidecar (524380 B) 138.65 us/op 98.144 us/op 1.41
transfer serialized Big SignedBeaconBlock (1000000 B) 5.6500 us/op 3.6200 us/op 1.56
copy serialized Big SignedBeaconBlock (1000000 B) 456.81 us/op 166.44 us/op 2.74
pass gossip attestations to forkchoice per slot 4.1766 ms/op 4.5102 ms/op 0.93
forkChoice updateHead vc 100000 bc 64 eq 0 732.45 us/op 729.77 us/op 1.00
forkChoice updateHead vc 600000 bc 64 eq 0 5.2914 ms/op 5.1815 ms/op 1.02
forkChoice updateHead vc 1000000 bc 64 eq 0 7.5767 ms/op 7.1304 ms/op 1.06
forkChoice updateHead vc 600000 bc 320 eq 0 4.6168 ms/op 4.3376 ms/op 1.06
forkChoice updateHead vc 600000 bc 1200 eq 0 4.8418 ms/op 4.4175 ms/op 1.10
forkChoice updateHead vc 600000 bc 7200 eq 0 6.0432 ms/op 5.2179 ms/op 1.16
forkChoice updateHead vc 600000 bc 64 eq 1000 12.311 ms/op 11.418 ms/op 1.08
forkChoice updateHead vc 600000 bc 64 eq 10000 13.175 ms/op 12.130 ms/op 1.09
forkChoice updateHead vc 600000 bc 64 eq 300000 29.290 ms/op 15.993 ms/op 1.83
computeDeltas 500000 validators 300 proto nodes 8.2497 ms/op 6.8080 ms/op 1.21
computeDeltas 500000 validators 1200 proto nodes 9.2408 ms/op 6.7258 ms/op 1.37
computeDeltas 500000 validators 7200 proto nodes 7.0534 ms/op 6.6348 ms/op 1.06
computeDeltas 750000 validators 300 proto nodes 11.613 ms/op 10.297 ms/op 1.13
computeDeltas 750000 validators 1200 proto nodes 12.376 ms/op 10.017 ms/op 1.24
computeDeltas 750000 validators 7200 proto nodes 11.662 ms/op 10.345 ms/op 1.13
computeDeltas 1400000 validators 300 proto nodes 23.437 ms/op 25.738 ms/op 0.91
computeDeltas 1400000 validators 1200 proto nodes 22.578 ms/op 19.752 ms/op 1.14
computeDeltas 1400000 validators 7200 proto nodes 21.887 ms/op 20.283 ms/op 1.08
computeDeltas 2100000 validators 300 proto nodes 33.742 ms/op 29.409 ms/op 1.15
computeDeltas 2100000 validators 1200 proto nodes 32.504 ms/op 29.607 ms/op 1.10
computeDeltas 2100000 validators 7200 proto nodes 33.198 ms/op 30.475 ms/op 1.09
computeProposerBoostScoreFromBalances 500000 validators 4.8450 ms/op 4.1322 ms/op 1.17
computeProposerBoostScoreFromBalances 750000 validators 4.5823 ms/op 3.9190 ms/op 1.17
computeProposerBoostScoreFromBalances 1400000 validators 4.5201 ms/op 3.8170 ms/op 1.18
computeProposerBoostScoreFromBalances 2100000 validators 4.6291 ms/op 3.9122 ms/op 1.18
altair processAttestation - 250000 vs - 7PWei normalcase 4.5444 ms/op 3.9577 ms/op 1.15
altair processAttestation - 250000 vs - 7PWei worstcase 5.8506 ms/op 5.1160 ms/op 1.14
altair processAttestation - setStatus - 1/6 committees join 224.72 us/op 219.97 us/op 1.02
altair processAttestation - setStatus - 1/3 committees join 372.32 us/op 390.65 us/op 0.95
altair processAttestation - setStatus - 1/2 committees join 428.56 us/op 548.78 us/op 0.78
altair processAttestation - setStatus - 2/3 committees join 606.75 us/op 696.29 us/op 0.87
altair processAttestation - setStatus - 4/5 committees join 808.05 us/op 863.14 us/op 0.94
altair processAttestation - setStatus - 100% committees join 871.03 us/op 1.1470 ms/op 0.76
altair processBlock - 250000 vs - 7PWei normalcase 14.229 ms/op 11.988 ms/op 1.19
altair processBlock - 250000 vs - 7PWei normalcase hashState 57.566 ms/op 44.336 ms/op 1.30
altair processBlock - 250000 vs - 7PWei worstcase 45.214 ms/op 41.404 ms/op 1.09
altair processBlock - 250000 vs - 7PWei worstcase hashState 118.67 ms/op 105.32 ms/op 1.13
phase0 processBlock - 250000 vs - 7PWei normalcase 4.6559 ms/op 3.0582 ms/op 1.52
phase0 processBlock - 250000 vs - 7PWei worstcase 36.884 ms/op 33.862 ms/op 1.09
altair processEth1Data - 250000 vs - 7PWei normalcase 777.34 us/op 777.41 us/op 1.00
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 21.315 us/op 12.498 us/op 1.71
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 92.462 us/op 91.770 us/op 1.01
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 39.618 us/op 24.942 us/op 1.59
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 24.907 us/op 17.359 us/op 1.43
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 297.31 us/op 274.59 us/op 1.08
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 1.8774 ms/op 2.6342 ms/op 0.71
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 2.5115 ms/op 3.6711 ms/op 0.68
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 2.3122 ms/op 2.9309 ms/op 0.79
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 5.4653 ms/op 8.1992 ms/op 0.67
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 3.6954 ms/op 4.1331 ms/op 0.89
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 8.2213 ms/op 15.054 ms/op 0.55
Tree 40 250000 create 874.77 ms/op 995.28 ms/op 0.88
Tree 40 250000 get(125000) 249.31 ns/op 309.45 ns/op 0.81
Tree 40 250000 set(125000) 3.0175 us/op 1.3746 us/op 2.20
Tree 40 250000 toArray() 29.816 ms/op 25.730 ms/op 1.16
Tree 40 250000 iterate all - toArray() + loop 30.813 ms/op 28.900 ms/op 1.07
Tree 40 250000 iterate all - get(i) 87.413 ms/op 81.290 ms/op 1.08
MutableVector 250000 create 23.741 ms/op 15.328 ms/op 1.55
MutableVector 250000 get(125000) 7.7370 ns/op 6.9470 ns/op 1.11
MutableVector 250000 set(125000) 698.05 ns/op 360.51 ns/op 1.94
MutableVector 250000 toArray() 5.1864 ms/op 4.4648 ms/op 1.16
MutableVector 250000 iterate all - toArray() + loop 5.4355 ms/op 4.9187 ms/op 1.11
MutableVector 250000 iterate all - get(i) 1.7133 ms/op 1.5712 ms/op 1.09
Array 250000 create 4.9271 ms/op 4.3060 ms/op 1.14
Array 250000 clone - spread 3.3987 ms/op 1.4250 ms/op 2.39
Array 250000 get(125000) 2.8730 ns/op 1.2840 ns/op 2.24
Array 250000 set(125000) 6.1730 ns/op 5.4580 ns/op 1.13
Array 250000 iterate all - loop 192.39 us/op 179.22 us/op 1.07
effectiveBalanceIncrements clone Uint8Array 300000 83.954 us/op 60.477 us/op 1.39
effectiveBalanceIncrements clone MutableVector 300000 1.5440 us/op 476.00 ns/op 3.24
effectiveBalanceIncrements rw all Uint8Array 300000 240.62 us/op 202.56 us/op 1.19
effectiveBalanceIncrements rw all MutableVector 300000 238.29 ms/op 113.56 ms/op 2.10
phase0 afterProcessEpoch - 250000 vs - 7PWei 139.36 ms/op 118.56 ms/op 1.18
phase0 beforeProcessEpoch - 250000 vs - 7PWei 56.796 ms/op 42.520 ms/op 1.34
altair processEpoch - mainnet_e81889 685.27 ms/op 558.74 ms/op 1.23
mainnet_e81889 - altair beforeProcessEpoch 111.96 ms/op 103.09 ms/op 1.09
mainnet_e81889 - altair processJustificationAndFinalization 30.715 us/op 22.220 us/op 1.38
mainnet_e81889 - altair processInactivityUpdates 8.8470 ms/op 6.3538 ms/op 1.39
mainnet_e81889 - altair processRewardsAndPenalties 63.813 ms/op 71.975 ms/op 0.89
mainnet_e81889 - altair processRegistryUpdates 6.4830 us/op 4.2240 us/op 1.53
mainnet_e81889 - altair processSlashings 2.0260 us/op 834.00 ns/op 2.43
mainnet_e81889 - altair processEth1DataReset 1.2060 us/op 1.0800 us/op 1.12
mainnet_e81889 - altair processEffectiveBalanceUpdates 2.6610 ms/op 1.4898 ms/op 1.79
mainnet_e81889 - altair processSlashingsReset 8.2740 us/op 4.7620 us/op 1.74
mainnet_e81889 - altair processRandaoMixesReset 10.504 us/op 6.6130 us/op 1.59
mainnet_e81889 - altair processHistoricalRootsUpdate 2.2590 us/op 960.00 ns/op 2.35
mainnet_e81889 - altair processParticipationFlagUpdates 5.9780 us/op 2.9530 us/op 2.02
mainnet_e81889 - altair processSyncCommitteeUpdates 1.4370 us/op 1.0280 us/op 1.40
mainnet_e81889 - altair afterProcessEpoch 137.77 ms/op 125.94 ms/op 1.09
capella processEpoch - mainnet_e217614 3.9218 s/op 2.8289 s/op 1.39
mainnet_e217614 - capella beforeProcessEpoch 729.20 ms/op 582.69 ms/op 1.25
mainnet_e217614 - capella processJustificationAndFinalization 35.616 us/op 23.099 us/op 1.54
mainnet_e217614 - capella processInactivityUpdates 31.300 ms/op 26.894 ms/op 1.16
mainnet_e217614 - capella processRewardsAndPenalties 560.35 ms/op 497.18 ms/op 1.13
mainnet_e217614 - capella processRegistryUpdates 56.201 us/op 44.724 us/op 1.26
mainnet_e217614 - capella processSlashings 2.0260 us/op 1.1020 us/op 1.84
mainnet_e217614 - capella processEth1DataReset 1.3490 us/op 1.2310 us/op 1.10
mainnet_e217614 - capella processEffectiveBalanceUpdates 9.0020 ms/op 7.7432 ms/op 1.16
mainnet_e217614 - capella processSlashingsReset 10.097 us/op 8.4870 us/op 1.19
mainnet_e217614 - capella processRandaoMixesReset 12.114 us/op 10.168 us/op 1.19
mainnet_e217614 - capella processHistoricalRootsUpdate 2.0650 us/op 1.5930 us/op 1.30
mainnet_e217614 - capella processParticipationFlagUpdates 4.1300 us/op 4.2690 us/op 0.97
mainnet_e217614 - capella afterProcessEpoch 428.96 ms/op 371.10 ms/op 1.16
phase0 processEpoch - mainnet_e58758 714.69 ms/op 667.97 ms/op 1.07
mainnet_e58758 - phase0 beforeProcessEpoch 230.53 ms/op 235.66 ms/op 0.98
mainnet_e58758 - phase0 processJustificationAndFinalization 34.588 us/op 35.726 us/op 0.97
mainnet_e58758 - phase0 processRewardsAndPenalties 73.674 ms/op 73.694 ms/op 1.00
mainnet_e58758 - phase0 processRegistryUpdates 23.930 us/op 28.223 us/op 0.85
mainnet_e58758 - phase0 processSlashings 1.8060 us/op 1.4660 us/op 1.23
mainnet_e58758 - phase0 processEth1DataReset 1.2950 us/op 1.0360 us/op 1.25
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 2.4353 ms/op 1.5936 ms/op 1.53
mainnet_e58758 - phase0 processSlashingsReset 8.3410 us/op 6.6120 us/op 1.26
mainnet_e58758 - phase0 processRandaoMixesReset 10.771 us/op 7.7460 us/op 1.39
mainnet_e58758 - phase0 processHistoricalRootsUpdate 1.2090 us/op 1.2040 us/op 1.00
mainnet_e58758 - phase0 processParticipationRecordUpdates 9.2910 us/op 7.1980 us/op 1.29
mainnet_e58758 - phase0 afterProcessEpoch 122.30 ms/op 101.39 ms/op 1.21
phase0 processEffectiveBalanceUpdates - 250000 normalcase 2.8709 ms/op 1.4472 ms/op 1.98
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 2.9948 ms/op 1.6279 ms/op 1.84
altair processInactivityUpdates - 250000 normalcase 53.232 ms/op 23.822 ms/op 2.23
altair processInactivityUpdates - 250000 worstcase 41.285 ms/op 29.260 ms/op 1.41
phase0 processRegistryUpdates - 250000 normalcase 19.592 us/op 14.071 us/op 1.39
phase0 processRegistryUpdates - 250000 badcase_full_deposits 485.35 us/op 488.15 us/op 0.99
phase0 processRegistryUpdates - 250000 worstcase 0.5 188.60 ms/op 173.78 ms/op 1.09
altair processRewardsAndPenalties - 250000 normalcase 86.556 ms/op 63.537 ms/op 1.36
altair processRewardsAndPenalties - 250000 worstcase 85.240 ms/op 46.339 ms/op 1.84
phase0 getAttestationDeltas - 250000 normalcase 13.756 ms/op 8.8429 ms/op 1.56
phase0 getAttestationDeltas - 250000 worstcase 14.354 ms/op 12.863 ms/op 1.12
phase0 processSlashings - 250000 worstcase 136.50 us/op 138.36 us/op 0.99
altair processSyncCommitteeUpdates - 250000 180.73 ms/op 192.19 ms/op 0.94
BeaconState.hashTreeRoot - No change 299.00 ns/op 321.00 ns/op 0.93
BeaconState.hashTreeRoot - 1 full validator 201.07 us/op 178.86 us/op 1.12
BeaconState.hashTreeRoot - 32 full validator 1.9445 ms/op 2.3780 ms/op 0.82
BeaconState.hashTreeRoot - 512 full validator 21.447 ms/op 22.237 ms/op 0.96
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 216.37 us/op 219.61 us/op 0.99
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 3.0019 ms/op 2.7165 ms/op 1.11
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 43.900 ms/op 37.704 ms/op 1.16
BeaconState.hashTreeRoot - 1 balances 185.32 us/op 181.68 us/op 1.02
BeaconState.hashTreeRoot - 32 balances 1.7369 ms/op 1.6211 ms/op 1.07
BeaconState.hashTreeRoot - 512 balances 18.588 ms/op 18.226 ms/op 1.02
BeaconState.hashTreeRoot - 250000 balances 319.01 ms/op 267.26 ms/op 1.19
aggregationBits - 2048 els - zipIndexesInBitList 33.508 us/op 35.996 us/op 0.93
byteArrayEquals 32 83.230 ns/op 96.030 ns/op 0.87
Buffer.compare 32 60.280 ns/op 59.988 ns/op 1.00
byteArrayEquals 1024 3.0074 us/op 2.2495 us/op 1.34
Buffer.compare 1024 83.389 ns/op 76.409 ns/op 1.09
byteArrayEquals 16384 48.690 us/op 35.053 us/op 1.39
Buffer.compare 16384 337.40 ns/op 300.11 ns/op 1.12
byteArrayEquals 123687377 347.18 ms/op 293.76 ms/op 1.18
Buffer.compare 123687377 12.162 ms/op 12.497 ms/op 0.97
byteArrayEquals 32 - diff last byte 116.16 ns/op 93.576 ns/op 1.24
Buffer.compare 32 - diff last byte 66.667 ns/op 65.595 ns/op 1.02
byteArrayEquals 1024 - diff last byte 3.1692 us/op 2.5168 us/op 1.26
Buffer.compare 1024 - diff last byte 88.433 ns/op 84.889 ns/op 1.04
byteArrayEquals 16384 - diff last byte 50.364 us/op 36.875 us/op 1.37
Buffer.compare 16384 - diff last byte 409.24 ns/op 290.18 ns/op 1.41
byteArrayEquals 123687377 - diff last byte 263.84 ms/op 298.64 ms/op 0.88
Buffer.compare 123687377 - diff last byte 6.6264 ms/op 8.4752 ms/op 0.78
byteArrayEquals 32 - random bytes 5.4380 ns/op 7.0780 ns/op 0.77
Buffer.compare 32 - random bytes 62.513 ns/op 69.703 ns/op 0.90
byteArrayEquals 1024 - random bytes 5.5320 ns/op 6.3340 ns/op 0.87
Buffer.compare 1024 - random bytes 62.358 ns/op 66.205 ns/op 0.94
byteArrayEquals 16384 - random bytes 5.3740 ns/op 6.3840 ns/op 0.84
Buffer.compare 16384 - random bytes 60.587 ns/op 66.744 ns/op 0.91
byteArrayEquals 123687377 - random bytes 8.5800 ns/op 9.7400 ns/op 0.88
Buffer.compare 123687377 - random bytes 65.730 ns/op 78.030 ns/op 0.84
regular array get 100000 times 45.385 us/op 47.955 us/op 0.95
wrappedArray get 100000 times 47.250 us/op 47.620 us/op 0.99
arrayWithProxy get 100000 times 14.717 ms/op 14.973 ms/op 0.98
ssz.Root.equals 55.622 ns/op 59.232 ns/op 0.94
byteArrayEquals 59.168 ns/op 57.182 ns/op 1.03
Buffer.compare 11.397 ns/op 13.645 ns/op 0.84
shuffle list - 16384 els 7.3157 ms/op 7.5244 ms/op 0.97
shuffle list - 250000 els 102.55 ms/op 109.35 ms/op 0.94
processSlot - 1 slots 16.315 us/op 21.271 us/op 0.77
processSlot - 32 slots 2.9625 ms/op 3.8978 ms/op 0.76
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 61.770 ms/op 64.092 ms/op 0.96
getCommitteeAssignments - req 1 vs - 250000 vc 2.5564 ms/op 2.8730 ms/op 0.89
getCommitteeAssignments - req 100 vs - 250000 vc 3.7642 ms/op 4.1820 ms/op 0.90
getCommitteeAssignments - req 1000 vs - 250000 vc 4.4790 ms/op 4.5294 ms/op 0.99
findModifiedValidators - 10000 modified validators 604.39 ms/op 724.79 ms/op 0.83
findModifiedValidators - 1000 modified validators 468.55 ms/op 582.17 ms/op 0.80
findModifiedValidators - 100 modified validators 443.85 ms/op 641.92 ms/op 0.69
findModifiedValidators - 10 modified validators 429.09 ms/op 610.73 ms/op 0.70
findModifiedValidators - 1 modified validators 424.30 ms/op 604.92 ms/op 0.70
findModifiedValidators - no difference 610.94 ms/op 572.41 ms/op 1.07
compare ViewDUs 5.3141 s/op 5.5454 s/op 0.96
compare each validator Uint8Array 1.7339 s/op 2.2423 s/op 0.77
compare ViewDU to Uint8Array 1.3674 s/op 1.5191 s/op 0.90
migrate state 1000000 validators, 24 modified, 0 new 851.68 ms/op 1.0276 s/op 0.83
migrate state 1000000 validators, 1700 modified, 1000 new 1.2744 s/op 1.3630 s/op 0.93
migrate state 1000000 validators, 3400 modified, 2000 new 1.5709 s/op 1.7555 s/op 0.89
migrate state 1500000 validators, 24 modified, 0 new 907.10 ms/op 999.44 ms/op 0.91
migrate state 1500000 validators, 1700 modified, 1000 new 1.1868 s/op 1.4079 s/op 0.84
migrate state 1500000 validators, 3400 modified, 2000 new 1.2977 s/op 1.7825 s/op 0.73
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 4.3400 ns/op 6.4000 ns/op 0.68
state getBlockRootAtSlot - 250000 vs - 7PWei 972.77 ns/op 877.00 ns/op 1.11
computeProposers - vc 250000 9.7746 ms/op 11.522 ms/op 0.85
computeEpochShuffling - vc 250000 103.11 ms/op 116.55 ms/op 0.88
getNextSyncCommittee - vc 250000 158.53 ms/op 194.93 ms/op 0.81
computeSigningRoot for AttestationData 26.312 us/op 35.178 us/op 0.75
hash AttestationData serialized data then Buffer.toString(base64) 2.3588 us/op 2.5995 us/op 0.91
toHexString serialized data 1.0968 us/op 1.8692 us/op 0.59
Buffer.toString(base64) 210.98 ns/op 298.21 ns/op 0.71

by benchmarkbot/action

@twoeths
Copy link
Contributor Author

twoeths commented Dec 20, 2023

Can't we see the latency between vc-bn already if we look at the Req log from onRequest handler? Or do you suspect that the body is not fully streamed when this is called?

no, we can't see the latency between vc-bn right now. The getResponseTime is set at https://github.com/fastify/fastify/blob/bc8743a2f225c92ad68cf2ad11bcd6930249fb69/lib/reply.js#L458

we can see that kReplyStartTime is set even before onRequest hook https://github.com/fastify/fastify/blob/bc8743a2f225c92ad68cf2ad11bcd6930249fb69/lib/route.js#L503

Or do you suspect that the body is not fully streamed when this is called?

yes the body is not streamed when kReplyStartTime is set

Depends on what we wanna measure I think preHandler hook it correct but it does not add latency information but as I understand it would give us information how much time it took on the fastify side to do Parsing and Validation internally.

up until now the most suspected thing is we have gc or a heavy apis jump in the middle of publishBlockV2(). I want to make sure it's not the case (or my assumption is wrong) before we dig into more detail on which step may cause the issue by logging right before our own handler, i.e preHandler step

@nflaig
Copy link
Member

nflaig commented Dec 20, 2023

no, we can't see the latency between vc-bn right now.

But looking at the timestamp when the vc send the request and the Req timestamp on the beacon node should be pretty accurate to see the network latency between the vc <> bn. Otherwise I don't see any way how we could measure that, there is no hook before onRequest.

up until now the most suspected thing is we have gc or a heavy apis jump in the middle of publishBlockV2()

Yes, this definitely makes sense, just to want to make sure we are looking at the right data

In case of these two logs, the time between first and second log is how long it took fastify to parse (JSON.parse) the payload and validate it.

Dec-20 08:37:38.762[rest]            debug: Req req-bu 127.0.0.1 submitPoolAttestations
Dec-20 08:37:38.762[rest]            debug: Exec req-bu 127.0.0.1 submitPoolAttestations

Especially parsing step could be an issue as it's done the main thread which might be busy.

@twoeths twoeths merged commit bef0ef6 into unstable Dec 21, 2023
@twoeths twoeths deleted the tuyen/fastify_pre_handler branch December 21, 2023 01:16
twoeths added a commit that referenced this pull request Dec 22, 2023
@wemeetagain
Copy link
Member

🎉 This PR is included in v1.14.0 🎉

ensi321 pushed a commit to ensi321/lodestar that referenced this pull request Jan 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants