Record empty responses when retrying a peer task #5509

fab-10 · 2023-05-29T10:35:31Z

PR description

AbstractRetryingPeerTask can already tell whether the result of an attempt is empty, but it only uses this information to decide whether the result is a partial one. The emptiness information is also useful for demoting the peer, and eventually disconnecting it if it sends too many useless responses, which is what this PR does.
While working on this PR, I found opportunities to improve the code and simplify the writing of retrying tasks, so I refactored and documented it: a class extending AbstractRetryingPeerTask should no longer set the final task result itself, but instead implement emptyResult and successfulResult to report the status of the request. Setting the final task result is now always the responsibility of AbstractRetryingPeerTask, which removes the different approaches used before.
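
To illustrate the division of responsibility described above, here is a minimal, synchronous sketch. It is not the actual Besu code: RetryingTaskSketch, HeadersTaskSketch, executeOnce and the uselessResponses counter are hypothetical names, and the real task is asynchronous and records the useless response on the peer rather than on a local counter. Only emptyResult and successfulResult are taken from this PR.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Simplified, synchronous stand-in for a retrying base task; not the real Besu class. */
abstract class RetryingTaskSketch<T> {
  private final CompletableFuture<T> result = new CompletableFuture<>();
  private final int maxRetries;
  private int uselessResponses = 0;

  RetryingTaskSketch(final int maxRetries) {
    this.maxRetries = maxRetries;
  }

  /** Hypothetical hook: performs one attempt against the current peer. */
  protected abstract T executeOnce();

  /** Subclasses only report the status of a response. */
  protected abstract boolean emptyResult(T peerResult);

  protected abstract boolean successfulResult(T peerResult);

  /** The base class is the only place that completes the final result. */
  public CompletableFuture<T> run() {
    for (int attempt = 0; attempt < maxRetries && !result.isDone(); attempt++) {
      final T peerResult = executeOnce();
      if (emptyResult(peerResult)) {
        // Stand-in for recording a useless response against the peer.
        uselessResponses++;
      } else if (successfulResult(peerResult)) {
        result.complete(peerResult);
      }
    }
    if (!result.isDone()) {
      result.completeExceptionally(new IllegalStateException("Max retries reached"));
    }
    return result;
  }

  public int uselessResponses() {
    return uselessResponses;
  }
}

/** Example subclass: it never touches the result future. */
final class HeadersTaskSketch extends RetryingTaskSketch<List<String>> {
  private final Iterator<List<String>> cannedResponses;

  HeadersTaskSketch(final int maxRetries, final List<List<String>> cannedResponses) {
    super(maxRetries);
    this.cannedResponses = cannedResponses.iterator();
  }

  @Override
  protected List<String> executeOnce() {
    // Replays canned responses in place of a real network request.
    return cannedResponses.hasNext() ? cannedResponses.next() : List.of();
  }

  @Override
  protected boolean emptyResult(final List<String> peerResult) {
    return peerResult.isEmpty();
  }

  @Override
  protected boolean successfulResult(final List<String> peerResult) {
    return !peerResult.isEmpty();
  }
}
```

With this shape, a new retrying task only has to describe what "empty" and "successful" mean for its response type.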

relates to #5415 and #5271

Tests

Checkpoint Sync
Snap Sync
Fast Sync
Full sync

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

github-actions · 2023-05-29T10:35:42Z

I thought about documentation and added the doc-change-required label to this PR if updates are required.
I have considered running ./gradlew acceptanceTestNonMainnet locally if my PR affects non-mainnet modules.
I thought about the changelog and included a changelog update if required.
If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

macfarla

nice refactor. one q on a change from getMaxPeers to getPeerLowerBound

macfarla · 2023-05-30T01:23:02Z

...n/java/org/hyperledger/besu/ethereum/eth/manager/task/AbstractRetryingSwitchingPeerTask.java

@@ -136,7 +125,7 @@ private void refreshPeers() {
    // If we are at max connections, then refresh peers disconnecting one of the failed peers,
    // or the least useful

-    if (peers.peerCount() >= peers.getMaxPeers()) {
+    if (peers.peerCount() >= peers.getPeerLowerBound()) {


should this be upperBound?

I want to force Besu to actively search for new peers, and my understanding is that this happens when the number of peers is below the lower bound

maybe add a comment (or edit the existing comment) because the code is now doing something slightly different
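
For readers following this thread, the selection rule in the existing code comment (disconnect one of the failed peers, or the least useful) combined with the new lower-bound check could be sketched as below. This is only an illustration under assumptions: PeerRefreshSketch, Peer, the usefulness score and selectPeerToDisconnect are hypothetical and do not come from the Besu code base.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class PeerRefreshSketch {
  /** Hypothetical peer view: an id plus a usefulness score (higher is better). */
  static final class Peer {
    final String id;
    final int usefulness;

    Peer(final String id, final int usefulness) {
      this.id = id;
      this.usefulness = usefulness;
    }
  }

  /**
   * Picks a peer to disconnect only when the peer count has reached the lower bound,
   * preferring a peer that already failed and falling back to the least useful one.
   */
  static Optional<Peer> selectPeerToDisconnect(
      final List<Peer> connected, final Set<String> failedPeerIds, final int peerLowerBound) {
    if (connected.size() < peerLowerBound) {
      // Below the lower bound nothing is disconnected.
      return Optional.empty();
    }
    final Optional<Peer> failed =
        connected.stream().filter(p -> failedPeerIds.contains(p.id)).findFirst();
    if (failed.isPresent()) {
      return failed;
    }
    return connected.stream().min(Comparator.comparingInt((Peer p) -> p.usefulness));
  }
}
```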

matkt · 2023-05-30T06:56:50Z

...n/java/org/hyperledger/besu/ethereum/eth/sync/tasks/RetryingGetHeaderFromPeerByHashTask.java

@@ -90,24 +89,21 @@ protected CompletableFuture<List<BlockHeader>> executeTaskOnCurrentPeer(final Et
                  referenceHash,
                  peer,
                  peerResult.getResult());
-              if (peerResult.getResult().isEmpty()) {


Will this change the behavior of the code that could listen to the exception?

This exception was intercepted by AbstractRetryingSwitchingPeerTask as a retryable error, but now that the emptiness check is centralized in AbstractRetryingPeerTask, there is no longer any need for this custom handling here.

matkt · 2023-05-30T07:07:48Z

...h/src/main/java/org/hyperledger/besu/ethereum/eth/manager/task/AbstractRetryingPeerTask.java

-                if (!isEmptyResponse.test(peerResult)) {
-                  retryCount = 0;
+                if (successfulResult(peerResult)) {
+                  result.complete(peerResult);


I wonder how it worked before without this success part. Is this something that is done automatically and that we don't really need?

Before, each extending task had to set the final result itself, using result.complete. Another thing I found a bit confusing is that the value returned by a task is not the final result until result.complete is called, so I tried to centralize that part here and make extending tasks easier to write.

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

matkt

LGTM. Waiting for the test results

siladu · 2023-06-01T07:42:23Z

SGTM. Do we demote the peer as soon as we get one empty response from it? If so, are there not cases where we receive an empty result but still want to retry with the same peer?
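
One way to read "too many useless responses" from the PR description is a threshold: each empty response counts against the peer, and only repeated ones lead to a disconnect. The sketch below shows that idea only. UselessResponseTrackerSketch and its threshold parameter are hypothetical, and this thread does not confirm the values or the exact mechanism Besu uses.

```java
/** Hypothetical per-peer counter; the threshold value is an illustration, not Besu's. */
final class UselessResponseTrackerSketch {
  private final int threshold;
  private int uselessResponses = 0;

  UselessResponseTrackerSketch(final int threshold) {
    this.threshold = threshold;
  }

  /** Records one empty response and reports whether the peer should now be disconnected. */
  boolean recordUselessResponse() {
    uselessResponses++;
    return uselessResponses >= threshold;
  }
}
```

Under this reading, a single empty response demotes the peer without disconnecting it, so the same peer can still be retried until the threshold is reached.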

garyschulte · 2023-12-07T17:39:56Z

...java/org/hyperledger/besu/ethereum/eth/manager/snap/RetryingGetAccountRangeFromPeerTask.java

+
+  @Override
+  protected boolean emptyResult(final AccountRangeMessage.AccountRangeData data) {
+    return data.accounts().isEmpty() && data.proofs().isEmpty();


An empty response for SnapProtocol/v1 is not necessarily a reason to demote a peer. If we are requesting a range that is outside of the 128-block snap range, an empty response is in-protocol. We should add additional criteria to only demote peers that give an empty range that is within 128 blocks of head.

Otherwise we might end up with snap sync performance regression by dropping peers from which we ask for old ranges (while we have an old pivot block)
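
The extra criterion suggested here could look roughly like the following. It is a sketch under assumptions: SnapEmptyResponsePolicySketch, shouldDemoteForEmptyRange and the way the requested block and chain head numbers are obtained are hypothetical, and the exact boundary handling (inclusive or exclusive) would need to be checked against the protocol.

```java
final class SnapEmptyResponsePolicySketch {
  /** Number of recent blocks for which a peer is expected to serve snap data. */
  static final long SNAP_SERVING_WINDOW = 128;

  /**
   * Demote only when the response is empty AND the requested state is recent enough
   * that the peer was expected to be able to serve it.
   */
  static boolean shouldDemoteForEmptyRange(
      final boolean responseEmpty, final long requestedBlockNumber, final long chainHeadNumber) {
    if (!responseEmpty) {
      return false;
    }
    final long age = chainHeadNumber - requestedBlockNumber;
    // Outside the window an empty response is treated as in-protocol, so no demotion.
    return age >= 0 && age <= SNAP_SERVING_WINDOW;
  }
}
```

With an old pivot block, the age exceeds the window and the peer is left alone, which avoids the regression described above.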

garyschulte · 2023-12-07T17:40:41Z

...java/org/hyperledger/besu/ethereum/eth/manager/snap/RetryingGetStorageRangeFromPeerTask.java

+
+  @Override
+  protected boolean emptyResult(final StorageRangeMessage.SlotRangeData peerResult) {
+    return peerResult.proofs().isEmpty() && peerResult.slots().isEmpty();


same here, an empty response might be a signal that we are asking for old ranges.

macfarla · 2024-04-11T02:45:50Z

Blocked by #6609 - without that we would disconnect peers, because we are trying peers that do not serve snap data.

fab-10 · 2024-09-11T08:33:25Z

@Matilda-Clerke this could be superseded by #7602 too

Matilda-Clerke · 2024-09-11T23:15:26Z

@fab-10 I haven't included any feedback regarding peer performance or quality in the refactor yet, so it should be possible to ensure we provide feedback for empty responses too. I'll take a good look at this PR to make sure I understand the intended behaviour.

Record empty responses when retrying a peer task

1378afc

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

fab-10 mentioned this pull request May 29, 2023

Use retry switching peer for world state download tasks #5508

Closed

fab-10 force-pushed the demote-peer-with-empty-responses-when-retrying branch from da95d50 to 063c129 Compare May 29, 2023 14:28

Review and consolidate the way result is set in retrying tasks

a20c5ed

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

fab-10 force-pushed the demote-peer-with-empty-responses-when-retrying branch from 063c129 to a20c5ed Compare May 29, 2023 16:16

fab-10 marked this pull request as ready for review May 29, 2023 16:38

macfarla reviewed May 30, 2023

matkt reviewed May 30, 2023

fab-10 added 2 commits May 30, 2023 11:04

Merge branch 'main' into demote-peer-with-empty-responses-when-retrying

82051cb

Fix: always use the root cause to check the errors

c695c32

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

fab-10 mentioned this pull request May 30, 2023

Make all tasks switching tasks #5271

Closed

matkt reviewed May 30, 2023

Merge branch 'main' into demote-peer-with-empty-responses-when-retrying

ea4a469

fab-10 marked this pull request as draft June 5, 2023 10:45

fab-10 mentioned this pull request Dec 5, 2023

consider peer reputation score when deciding to disconnect #6187

Merged

garyschulte reviewed Dec 7, 2023

macfarla assigned pinges Sep 2, 2024

jframe closed this Sep 19, 2024
