Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chunks] Move chunk completion out of ShardsManager #7622

Merged
merged 1 commit into from
Sep 24, 2022

Conversation

robin-near
Copy link
Contributor

@robin-near robin-near commented Sep 16, 2022

  • Move handling of chunk completion to a ClientActor message (ShardsManagerResponse), because this will eventually be async. So no more passing DoneApplyingChunksCallback. Add support of this message to TestEnv; tests that rely on chunk part messaging now need to manually handle such messages (because the TestEnv doesn't have real actors). Fixed tests that need this.
  • Move ReedSolomonWrapper inside ShardsManager. However Client still has a copy because it actually needs to encode the parts to produce a chunk - at least the merkle root is needed.
  • Does not touch ChainStore yet. Leaving that till later.
  • Minor refactoring to push chunk part processing logic from Client into ShardsManager (otherwise Client needs to fetch header and stuff).

@robin-near robin-near changed the title Draft for moving stuff out of ShardsManager [chunks] Move chunk completion out of ShardsManager Sep 19, 2022
@robin-near robin-near marked this pull request as ready for review September 19, 2022 22:36
@robin-near robin-near requested a review from a team as a code owner September 19, 2022 22:36
Copy link
Contributor

@mzhangmzz mzhangmzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good! I have two comments:

  1. It might be better to achieve the callback of on_chunk_complete not through actor messages. My concern of changing that to actor messages is that it shares the same queue as all other messages that ClientActor receives and if the queue is long, the callback is delayed. You can still have a ClientActorAdapterForShardsManager trait which has on_chunk_complete function and let client actor implement that function.
  2. Maybe we should send the whole chunk instead of only chunk header to client and move the logic of saving chunks from ShardsManager to client.

prev_block_hash: &CryptoHash,
apply_chunks_done_callback: DoneApplyChunkCallback,
) {
pub fn check_incomplete_chunks(&mut self, prev_block_hash: &CryptoHash) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should move this function to ShardsManger too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just don't wanna do too much in this PR yet. So... kinda pausing at an arbitrary point.

@@ -1760,6 +1812,23 @@ impl ShardsManager {
Ok(result)
}

pub fn process_partial_encoded_chunk_response(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function is not used right now

@mzhangmzz mzhangmzz requested a review from matklad September 20, 2022 21:12
@mzhangmzz
Copy link
Contributor

Adding @matklad as a reviewer too

@robin-near
Copy link
Contributor Author

Thanks! How would we invoke on_chunk_complete not through actor messages? We need a &mut Client for that call...

@matklad
Copy link
Contributor

matklad commented Sep 21, 2022

Do we have some broader context here? What is the end-goal of the refactor and such?

ChunkCompleted(ShardChunkHeader),
}

pub trait ClientAdapterForShardsManager: MsgRecipient<ShardsManagerResponse> {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's maybe hide actix from the API here?

diff --git a/chain/chunks/src/client.rs b/chain/chunks/src/client.rs
index ed2b2ba58..e207c0cd1 100644
--- a/chain/chunks/src/client.rs
+++ b/chain/chunks/src/client.rs
@@ -2,12 +2,18 @@ use actix::Message;
 use near_network::types::MsgRecipient;
 use near_primitives::sharding::ShardChunkHeader;
 
+pub trait ClientAdapterForShardsManager {
+    fn did_complete_chunk(&self, chunk_header: ShardChunkHeader);
+}
+
 #[derive(Message)]
 #[rtype(result = "()")]
 pub enum ShardsManagerResponse {
     ChunkCompleted(ShardChunkHeader),
 }
 
-pub trait ClientAdapterForShardsManager: MsgRecipient<ShardsManagerResponse> {}
-
-impl<A: MsgRecipient<ShardsManagerResponse>> ClientAdapterForShardsManager for A {}
+impl<A: MsgRecipient<ShardsManagerResponse>> ClientAdapterForShardsManager for A {
+    fn did_complete_chunk(&self, chunk_header: ShardChunkHeader) {
+        self.do_send(ShardsManagerResponse::ChunkCompleted(chunk_header))
+    }
+}
diff --git a/chain/chunks/src/lib.rs b/chain/chunks/src/lib.rs
index 990d7cbee..b376f353d 100644
--- a/chain/chunks/src/lib.rs
+++ b/chain/chunks/src/lib.rs
@@ -85,7 +85,7 @@ use std::sync::Arc;
 use std::time::{Duration, Instant};
 
 use chrono::DateTime;
-use client::{ClientAdapterForShardsManager, ShardsManagerResponse};
+use client::ClientAdapterForShardsManager;
 use near_primitives::time::Utc;
 use rand::seq::IteratorRandom;
 use rand::seq::SliceRandom;
@@ -1967,7 +1967,7 @@ impl ShardsManager {
         self.encoded_chunks.remove_from_cache_if_outside_horizon(chunk_hash);
         self.requested_partial_encoded_chunks.remove(chunk_hash);
         debug!(target: "chunks", "Completed chunk {:?}", chunk_hash);
-        self.client_adapter.do_send(ShardsManagerResponse::ChunkCompleted(header));
+        self.client_adapter.did_complete_chunk(header);
     }
 
     /// Send the parts of the partial_encoded_chunk that are owned by `self.me` to the

In generally we want to move away from it, so better if stuff doesn't call actix APIs directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @matklad; @mzhangmzz, is this the same thing you were talking about? It doesn't solve the fact that this will go on the same client actor queue though. I don't see how to avoid that...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me do this for the next PR; I need to change the API anyway.

@mzhangmzz
Copy link
Contributor

Do we have some broader context here? What is the end-goal of the refactor and such?

@matklad The end goal is to move ShardsManager to its own thread(actor)

@matklad
Copy link
Contributor

matklad commented Sep 21, 2022

Is there some issue on that, so that we can mention it in the RP description, to make sure it gets surfaced in git blame in the future?

@mzhangmzz
Copy link
Contributor

Is there some issue on that, so that we can mention it in the RP description, to make sure it gets surfaced in git blame in the future?

Good point. @robin-near do you mind creating a github and a jira issue to track this project?

@mzhangmzz
Copy link
Contributor

Could you also run nayduck tests? Since this PR touches quite a few things

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move ShardsManager chunk management logic to a separate Actix thread
3 participants