
feat(katana): dedup fork request #2001

Merged: 10 commits merged on Dec 19, 2024

Conversation

cwkang1998 (Contributor) commented May 25, 2024

Description

Implements request deduplication by keeping a HashSet of the requests. This also requires adding a key to the pending_requests vector so that any pending request that was popped can be added back to the queue.

The deduplication key was originally implemented as a String, but was later changed to an enum to constrain the possible types.

I considered IndexMap to avoid adding an extra data structure, but since that would require adding a dependency and the current approach achieves the same result, I kept this implementation.
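To make the mechanism easier to follow when reading the review below, here is a minimal sketch of the bookkeeping involved. It is not the code in this PR: the types are simplified placeholders, and tokio's oneshot channel stands in for the channel types used in backend.rs.

use std::collections::HashMap;
use tokio::sync::oneshot;

// Placeholder types for illustration only; the real code uses Katana's own types.
type ContractAddress = u64;
type BackendResponse = u64;

// One variant per request type; used as the deduplication key.
#[derive(Eq, Hash, PartialEq, Clone, Copy, Debug)]
enum RequestKey {
    Nonce(ContractAddress),
    // ... one variant per remaining request type
}

#[derive(Default)]
struct DedupState {
    // Every caller waiting on the same key shares a single in-flight request.
    waiters: HashMap<RequestKey, Vec<oneshot::Sender<BackendResponse>>>,
    // Keys of requests whose RPC future still has to be driven; keyed so a
    // popped future can later be matched back to its waiters.
    pending: Vec<RequestKey>,
}

impl DedupState {
    fn register(&mut self, key: RequestKey, tx: oneshot::Sender<BackendResponse>) {
        let entry = self.waiters.entry(key).or_default();
        if entry.is_empty() {
            // First request for this key: schedule the actual RPC call.
            self.pending.push(key);
        }
        // A duplicate request only adds another waiter; no second RPC is issued.
        entry.push(tx);
    }
}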

Related issue

Closes #1465.

Tests

  • Yes
  • No, because they aren't needed
  • No, because I need help

The tests for the changes I've made pass. However, some seemingly unrelated tests are failing, and I'm not sure what the exact issue is. Hopefully it's a setup issue and it's only failing on my machine.

Added to documentation?

  • README.md
  • Dojo Book
  • No documentation needed

I'm not sure documentation is needed for this in the Dojo Book, but I can add it if necessary. For now, only code comments are added for explanation.

Checklist

  • I've formatted my code (scripts/prettier.sh, scripts/rust_fmt.sh, scripts/cairo_fmt.sh)
  • I've linted my code (scripts/clippy.sh, scripts/docs.sh)
  • I've commented my code
  • I've requested a review after addressing the comments

Summary by CodeRabbit

  • New Features

    • Introduced a unified response structure for backend requests, enhancing request handling.
    • Implemented a deduplication mechanism for concurrent requests to improve efficiency.
  • Bug Fixes

    • Improved error management with enhanced error handling strategies.
  • Tests

    • Expanded tests to verify the deduplication functionality for request handling.

codecov bot commented May 25, 2024

Codecov Report

Attention: Patch coverage is 96.35535% with 16 lines in your changes missing coverage. Please review.

Project coverage is 55.91%. Comparing base (4671d9c) to head (ec163c5).
Report is 105 commits behind head on main.

Files with missing lines | Patch % | Lines
...ana/storage/provider/src/providers/fork/backend.rs | 96.35% | 16 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2001      +/-   ##
==========================================
- Coverage   57.41%   55.91%   -1.50%     
==========================================
  Files         406      439      +33     
  Lines       51330    56147    +4817     
==========================================
+ Hits        29470    31395    +1925     
- Misses      21860    24752    +2892     


@kariy kariy self-assigned this May 26, 2024
@glihm glihm added the katana This issue is related to Katana label May 27, 2024
@kariy kariy marked this pull request as draft May 28, 2024 13:43
@glihm glihm modified the milestone: 1.0.0 Jun 19, 2024
glihm (Collaborator) commented Jul 9, 2024

@cwkang1998 sorry for the delay on this one. A lot of changes have happened since. @kariy, any recommendations on this issue and how it should be tackled?

cwkang1998 (Contributor, Author):
Hey, sorry for being so slow on this as well! I think I'm still missing some implementation details, so I'll get an update to this PR within the week. If there are any changes I need to make, please let me know and I'll do my best to implement them as soon as possible.

kariy (Member) commented Jul 10, 2024

Hey, sorry for being so slow on this as well! I think I'm still missing some implementation details, so I'll get an update to this PR within the week. If there are any changes I need to make, please let me know and I'll do my best to implement them as soon as possible.

Nope, nothing has changed from what we discussed before.

This PR is purely an optimization; it's very isolated from changes made to other parts of the codebase and isn't tied to the system's architecture itself.

So the current changes are already good. We're only missing the part where we wait for the response of the initial request and reuse it for the deduped ones.
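A minimal sketch of that missing piece, not the code that was eventually merged: once the single in-flight future for a key resolves, the response is cloned to every waiter registered under that key, and the entry is removed so a later identical request starts a fresh call. The generic helper below assumes tokio's oneshot channel as a stand-in for the channel types in backend.rs.

use std::collections::HashMap;
use tokio::sync::oneshot;

// Fan one finished response out to every sender deduplicated onto `key`,
// then drop the entry so future identical requests trigger a new RPC call.
fn fan_out<K, R>(waiters: &mut HashMap<K, Vec<oneshot::Sender<R>>>, key: &K, res: R)
where
    K: std::hash::Hash + Eq,
    R: Clone,
{
    if let Some(senders) = waiters.remove(key) {
        for tx in senders {
            // A caller may have gone away in the meantime; ignore the error rather than panic.
            let _ = tx.send(res.clone());
        }
    }
}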

kariy (Member) commented Nov 19, 2024

@cwkang1998 No pressure, but do you think you can continue this PR? Otherwise I can take it over as necessary.

@cwkang1998 cwkang1998 marked this pull request as ready for review December 1, 2024 00:22
cwkang1998 (Contributor, Author):
@cwkang1998 No pressure, but do you think you can continue this PR? Otherwise I can take it over as necessary.

Just added the changes required to make the result return as expected. Sorry for such a long delay.

Will attempt to add more tests to this.

coderabbitai bot commented Dec 1, 2024

Walkthrough

Ohayo, sensei! The pull request introduces significant enhancements to the backend.rs file within the Katana project. Key modifications include the creation of a BackendResponse enum for unified response handling, updated error management using Arc, and the implementation of a deduplication mechanism for incoming requests. The request handling logic has been refactored to prevent duplicate processing of identical requests, ensuring efficient use of resources. Method signatures have been updated to reflect these changes, and tests have been expanded to verify the new deduplication functionality.

Changes

File: crates/katana/storage/provider/src/providers/fork/backend.rs
  • Added the BackendResponse enum.
  • Updated BackendError to include UnexpectedReceiveResult(Arc<anyhow::Error>).
  • Refactored request methods to return BackendResponse.
  • Implemented request deduplication logic.
  • Updated the type alias for BackendRequestFuture.

Assessment against linked issues

Objective: Implement request deduplication (#1465)


coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (5)
crates/katana/storage/provider/src/providers/fork/backend.rs (5)

138-140: Effective use of data structures for request management

The addition of request_dedup_map and pending_requests to the Backend struct appropriately manages deduplicated requests and pending futures. Ensure that these structures are managed efficiently to handle high loads without performance degradation.

Consider monitoring the size of these collections and implementing cleanup strategies if necessary to prevent potential memory bloat.


203-306: Sensei, refactor duplicated code in handle_requests to improve maintainability

The deduplication logic in the handle_requests method is repeated for each request type, leading to code duplication. This can make future maintenance and updates more error-prone.

Consider abstracting the common deduplication logic into a helper function or macro that can be reused for all request types. This will reduce code repetition and simplify future modifications.


357-370: Ensure efficient cloning of BackendResponse to optimize performance

Cloning BackendResponse for each waiting sender may introduce performance overhead, especially if the response contains large data.

Consider wrapping the BackendResponse in an Arc to allow for cheap cloning and shared ownership:

-    sender.send(res.clone()).expect(
+    let shared_res = Arc::new(res);
+    sender.send(Arc::clone(&shared_res)).expect(

This change reduces the cost of cloning responses and improves performance when dealing with multiple senders.


363-366: Handle potential errors when sending responses to prevent panics

Using .expect when sending responses can cause the application to panic if a sender is dropped or encounters an error.

Modify the code to handle send errors gracefully:

-    sender.send(res.clone()).expect(
+    if let Err(e) = sender.send(res.clone()) {
+        error!(target: LOG_TARGET, "Failed to send response for {:?}: {:?}", fut_key, e);
+    }

This logs the error without panicking, ensuring the backend remains robust even if some receivers have disconnected.


842-874: Improve test reliability by synchronizing threads instead of using thread::sleep

In the test get_nonce_request_should_be_deduplicated, thread::sleep(Duration::from_secs(1)) is used to wait for requests to complete. This approach can lead to flaky tests due to timing issues.

Consider using synchronization primitives like channels or Arc<AtomicUsize> counters to synchronize the completion of threads:

let counter = Arc::new(AtomicUsize::new(0));
let h1 = handle.clone();
let c1 = counter.clone();
thread::spawn(move || {
    h1.get_nonce(felt!("0x1").into()).expect(ERROR_SEND_REQUEST);
    c1.fetch_add(1, Ordering::SeqCst);
});
// Repeat for other threads...

// Wait until counter reaches the expected value
while counter.load(Ordering::SeqCst) < EXPECTED_COUNT {
    std::thread::yield_now();
}

This ensures the test waits for all spawned threads to complete without relying on arbitrary sleep durations.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4671d9c and 2631344.

📒 Files selected for processing (1)
  • crates/katana/storage/provider/src/providers/fork/backend.rs (19 hunks)
🔇 Additional comments (5)
crates/katana/storage/provider/src/providers/fork/backend.rs (5)

38-45: Ohayo, sensei! Well-structured BackendResponse enum enhances response handling

The introduction of the BackendResponse enum provides a unified structure for handling various backend responses. This improves code maintainability and clarity by standardizing the response types.


54-60: Consistency in error handling using Arc<anyhow::Error>

Wrapping errors with Arc<anyhow::Error> in the BackendError enum ensures thread-safe error propagation across asynchronous contexts. This is crucial for concurrent environments to prevent data races and maintain error integrity.
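As a rough illustration of this point (only the UnexpectedReceiveResult variant is taken from the change summary above; everything else here is assumed): anyhow::Error is not Clone, so wrapping it in Arc is what allows the whole error enum to derive Clone and be delivered to every waiter of a deduplicated request.

use std::sync::Arc;

#[derive(Debug, thiserror::Error, Clone)]
pub enum BackendError {
    // Arc makes the otherwise non-Clone anyhow::Error cheaply cloneable, so the
    // same failure can be sent to each waiter of a deduplicated request.
    #[error("unexpected result from backend: {0}")]
    UnexpectedReceiveResult(Arc<anyhow::Error>),
    // ... other variants elided
}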


Line range hint 84-105: Updated method signatures align with unified response handling

The changes in the method signatures of BackendRequest variants to return (BackendRequest, OneshotReceiver<BackendResponse>) align with the new unified BackendResponse enum. This streamlines response handling and reduces code complexity.
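A rough sketch of that shape (not the actual constructors in backend.rs; the types are simplified and tokio's oneshot channel stands in for the OneshotSender/OneshotReceiver aliases): building a request yields both the request to enqueue and the receiver the caller awaits for the unified BackendResponse.

use tokio::sync::oneshot;

// Simplified placeholder types for illustration only.
type ContractAddress = u64;

enum BackendResponse {
    Nonce(u64),
    // ... other response kinds
}

enum BackendRequest {
    Nonce(ContractAddress, oneshot::Sender<BackendResponse>),
    // ... other request kinds
}

// Build a request together with the receiver the caller will wait on.
fn nonce_request(address: ContractAddress) -> (BackendRequest, oneshot::Receiver<BackendResponse>) {
    let (tx, rx) = oneshot::channel();
    (BackendRequest::Nonce(address, tx), rx)
}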


117-127: Introduction of BackendRequestIdentifier for efficient deduplication

Using the BackendRequestIdentifier enum to uniquely identify requests enables efficient deduplication of identical requests. This optimization can significantly reduce redundant network calls and improve overall performance.


703-708: Comprehensive handling of provider errors in handle_not_found_err

The handle_not_found_err function converts specific StarknetError variants to Option::None, following the provider API conventions.

coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)
crates/katana/storage/provider/src/providers/fork/backend.rs (3)

47-60: Consider adding more context to error messages

While the error handling is generally good, some error messages could be more descriptive to aid in debugging:

 #[derive(Debug, thiserror::Error, Clone)]
 pub enum BackendError {
-    #[error("failed to send request to backend: {0}")]
+    #[error("failed to send request to backend (channel might be full or closed): {0}")]
     FailedSendRequest(#[from] SendError),
-    #[error("failed to receive result from backend: {0}")]
+    #[error("failed to receive result from backend (sender might have been dropped): {0}")]
     FailedReceiveResult(#[from] RecvError),

369-388: Improve error handling for sender failures

The current implementation uses unwrap_or_else with error logging for sender failures. While this works, we could improve the robustness by:

  1. Tracking failed sends
  2. Cleaning up resources for failed senders

Consider refactoring the response handling:

-sender_vec.iter().for_each(|sender| {
-    sender.send(res.clone()).unwrap_or_else(|_| error!(target: LOG_TARGET, "failed to send result of request {:?} to sender {:?}", fut_key, sender));
-});
+let (successful_senders, failed_senders): (Vec<_>, Vec<_>) = sender_vec
+    .iter()
+    .partition(|sender| sender.send(res.clone()).is_ok());
+
+if !failed_senders.is_empty() {
+    error!(
+        target: LOG_TARGET,
+        "Failed to send results to {} senders for request {:?}",
+        failed_senders.len(),
+        fut_key
+    );
+}

Line range hint 878-1209: Ohayo! Great test coverage with some potential improvements

The test suite is comprehensive and well-structured. A few suggestions to make it even better:

  1. Add test cases for concurrent request failures
  2. Consider parameterized tests to reduce code duplication
  3. Add timing assertions to verify deduplication performance

Example of a parameterized test:

#[test]
fn test_request_deduplication() {
    let test_cases = vec![
        ("nonce", vec![
            Box::new(|h: &BackendHandle| h.get_nonce(felt!("0x1").into())),
            Box::new(|h: &BackendHandle| h.get_nonce(felt!("0x1").into())),
            Box::new(|h: &BackendHandle| h.get_nonce(felt!("0x2").into())),
        ], 2),
        ("class", vec![
            Box::new(|h: &BackendHandle| h.get_class_at(felt!("0x1"))),
            Box::new(|h: &BackendHandle| h.get_class_at(felt!("0x1"))),
            Box::new(|h: &BackendHandle| h.get_class_at(felt!("0x2"))),
        ], 2),
        // Add more cases
    ];

    for (name, requests, expected_count) in test_cases {
        // Your existing test logic here
    }
}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2631344 and 471abb8.

📒 Files selected for processing (1)
  • crates/katana/storage/provider/src/providers/fork/backend.rs (20 hunks)
🔇 Additional comments (2)
crates/katana/storage/provider/src/providers/fork/backend.rs (2)

38-45: Ohayo, sensei! The BackendResponse enum looks good!

The enum provides a clean and type-safe way to handle different response types. The implementation is well-structured and follows Rust best practices.


119-127: Ohayo! Well-designed identifier enum for deduplication

The BackendRequestIdentifier enum is well-structured for request deduplication:

  • Derives appropriate traits for HashMap key usage
  • Covers all request types
  • Uses efficient memory layout with Copy

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
crates/katana/storage/provider/src/providers/fork/backend.rs (2)

339-347: Enhance error handling for response distribution

The error handling when sending responses to multiple senders could be improved. Currently, errors are just logged, which might hide issues.

Consider collecting failed sends and handling them more explicitly:

-                        sender_vec.iter().for_each(|sender| {
-                            sender.send(res.clone()).unwrap_or_else(|_| error!(target: LOG_TARGET, "failed to send result of request {:?} to sender {:?}", fut_key, sender));
-                        });
+                        let failed_sends: Vec<_> = sender_vec.iter()
+                            .filter_map(|sender| {
+                                sender.send(res.clone())
+                                    .err()
+                                    .map(|_| sender)
+                            })
+                            .collect();
+                        
+                        if !failed_sends.is_empty() {
+                            error!(
+                                target: LOG_TARGET,
+                                "Failed to send results to {} senders for request {:?}",
+                                failed_sends.len(),
+                                fut_key
+                            );
+                        }

736-774: Add more detailed documentation for mock RPC server

While the implementation is solid, the documentation could be enhanced to better explain:

  • The purpose of the response timing control
  • The format requirements for the response string
  • Usage examples

Consider adding documentation like this:

/// Helper function to start a TCP server that returns predefined JSON-RPC responses.
/// 
/// # Arguments
/// 
/// * `addr` - The address to bind the server to (e.g., "127.0.0.1:8080")
/// * `response` - A JSON-RPC response string that will be returned for all requests
/// 
/// # Returns
/// 
/// A `SyncSender` that can be used to control when the server sends its response.
/// This allows for testing race conditions and request deduplication timing.
/// 
/// # Example
/// 
/// ```
/// let response = r#"{"jsonrpc":"2.0","result":"0x123","id":1}"#;
/// let sender = start_mock_rpc_server("127.0.0.1:8090".to_string(), response.to_string());
/// 
/// // Wait for client requests to arrive
/// thread::sleep(Duration::from_secs(1));
/// 
/// // Signal server to send response
/// sender.send(()).unwrap();
/// ```
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 471abb8 and fb4b08d.

📒 Files selected for processing (1)
  • crates/katana/storage/provider/src/providers/fork/backend.rs (21 hunks)
🔇 Additional comments (5)
crates/katana/storage/provider/src/providers/fork/backend.rs (5)

38-45: Ohayo, sensei! Well-structured response type implementation

The BackendResponse enum effectively standardizes the response types across different backend requests, making the code more maintainable and type-safe.


54-60: Good use of Arc for error handling

Using Arc for error types allows for efficient shared ownership of error instances, which is particularly important in a concurrent context.


119-127: Well-designed request deduplication identifier

The BackendRequestIdentifier enum provides a clean and type-safe way to identify and deduplicate requests. The derive macros for Eq, Hash, and PartialEq are appropriately used for HashMap keys.


Line range hint 840-1171: Ohayo, sensei! Excellent test coverage for deduplication

The test suite thoroughly covers various deduplication scenarios:

  • Basic request deduplication
  • Cross-function deduplication
  • Same address with different keys
  • Response consistency verification

The use of mock servers and controlled response timing is particularly well done.


272-297: 🛠️ Refactor suggestion

Consider enhancing error handling in deduplication logic

The error handling in the dedup_request function could be improved. Currently, it logs an error and does nothing when the sender vector is not found, which should never happen according to the comment.

Consider either:

  1. Using expect since this is a programming error that should never happen
  2. Or returning a Result to properly handle the error case
-            match self.request_dedup_map.get_mut(&req_key) {
-                Some(sender_vec) => {
-                    sender_vec.push(sender);
-                }
-                None => {
-                    // Log this and do nothing here, as this should never happen.
-                    // If this does happen it is an unexpected bug.
-                    error!(target: LOG_TARGET, "failed to get current request dedup vector");
-                }
-            }
+            // This unwrap is safe because we just checked the entry exists
+            self.request_dedup_map.get_mut(&req_key)
+                .expect("request dedup vector disappeared")
+                .push(sender);

Likely invalid or redundant comment.

kariy (Member) left a comment

Hey @cwkang1998. Awesome work on this! Nice that you wrote tests for all the request types. I just have a few minor comments, but overall the implementation looks fine.

Comment on lines 276 to 278
rpc_call_future: F,
) where
F: FnOnce() -> BoxFuture<'static, BackendResponse>,
kariy (Member):

Is there a reason this needs to be a FnOnce and not just a BoxFuture<'static, BackendResponse>?

cwkang1998 (Contributor, Author):

Now that I think about it, not really. I'm going to change it to just a BoxFuture.

) where
F: FnOnce() -> BoxFuture<'static, BackendResponse>,
{
if let std::collections::hash_map::Entry::Vacant(e) = self.request_dedup_map.entry(req_key)
kariy (Member):

There shouldn't be a conflict, so I'd prefer to do it this way:

Suggested change
if let std::collections::hash_map::Entry::Vacant(e) = self.request_dedup_map.entry(req_key)
if let Entry::Vacant(e) = self.request_dedup_map.entry(req_key)

Comment on lines 345 to 347
sender_vec.iter().for_each(|sender| {
sender.send(res.clone()).unwrap_or_else(|_| error!(target: LOG_TARGET, "failed to send result of request {:?} to sender {:?}", fut_key, sender));
});
kariy (Member):

Let's use structured fields:

Suggested change
sender_vec.iter().for_each(|sender| {
sender.send(res.clone()).unwrap_or_else(|_| error!(target: LOG_TARGET, "failed to send result of request {:?} to sender {:?}", fut_key, sender));
});
sender.send(res.clone()).unwrap_or_else(|error| {
error!(target: LOG_TARGET, key = ?fut_key, %error, "Failed to send result.")
});

Comment on lines 851 to 872
// send requests to the backend
let h1 = handle.clone();
thread::spawn(move || {
h1.get_nonce(felt!("0x1").into()).expect(ERROR_SEND_REQUEST);
});
let h2 = handle.clone();
thread::spawn(move || {
h2.get_nonce(felt!("0x1").into()).expect(ERROR_SEND_REQUEST);
});
// Different request, should be counted
let h3 = handle.clone();
thread::spawn(move || {
h3.get_nonce(felt!("0x2").into()).expect(ERROR_SEND_REQUEST);
});

// wait for the requests to be handled
thread::sleep(Duration::from_secs(1));

// check request are handled
let stats = handle.stats().expect(ERROR_STATS);
assert_eq!(stats, 2, "Backend should only have 2 ongoing requests.")
}
kariy (Member):

Let's do a check before sending the third request to make sure the requests actually get dedup'd. Having it only at the end can be a bit vague because it doesn't indicate whether the 2 ongoing ones are exactly what we're expecting.

Suggested change
// send requests to the backend
let h1 = handle.clone();
thread::spawn(move || {
h1.get_nonce(felt!("0x1").into()).expect(ERROR_SEND_REQUEST);
});
let h2 = handle.clone();
thread::spawn(move || {
h2.get_nonce(felt!("0x1").into()).expect(ERROR_SEND_REQUEST);
});
// Different request, should be counted
let h3 = handle.clone();
thread::spawn(move || {
h3.get_nonce(felt!("0x2").into()).expect(ERROR_SEND_REQUEST);
});
// wait for the requests to be handled
thread::sleep(Duration::from_secs(1));
// check request are handled
let stats = handle.stats().expect(ERROR_STATS);
assert_eq!(stats, 2, "Backend should only have 2 ongoing requests.")
}
// send requests to the backend
let h1 = handle.clone();
thread::spawn(move || {
h1.get_nonce(felt!("0x1").into()).expect(ERROR_SEND_REQUEST);
});
let h2 = handle.clone();
thread::spawn(move || {
h2.get_nonce(felt!("0x1").into()).expect(ERROR_SEND_REQUEST);
});
// wait for the requests to be handled
thread::sleep(Duration::from_secs(1));
// check request are handled
let stats = handle.stats().expect(ERROR_STATS);
assert_eq!(stats, 1, "Backend should only have 1 ongoing requests.")
// Different request, should be counted
let h3 = handle.clone();
thread::spawn(move || {
h3.get_nonce(felt!("0x2").into()).expect(ERROR_SEND_REQUEST);
});
// wait for the requests to be handled
thread::sleep(Duration::from_secs(1));
// check request are handled
let stats = handle.stats().expect(ERROR_STATS);
assert_eq!(stats, 2, "Backend should only have 2 ongoing requests.")
}

kariy (Member) commented Dec 15, 2024

Please run cargo +nightly-2024-08-28 fmt --all and ./scripts/clippy.sh

@cwkang1998 cwkang1998 requested a review from kariy December 18, 2024 19:53
coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
crates/katana/storage/provider/src/providers/fork/backend.rs (3)

120-128: Consider adding more detailed documentation for BackendRequestIdentifier

The enum implementation is solid, but it would benefit from documentation explaining how each variant is used in the deduplication process.

Add documentation like this:

 // Identifier for pending requests.
 // This is used for request deduplication.
+/// Enum representing unique identifiers for different types of backend requests.
+/// Used as keys in the deduplication HashMap to track and combine duplicate requests.
 #[derive(Eq, Hash, PartialEq, Clone, Copy, Debug)]
 enum BackendRequestIdentifier {
+    /// Identifies a nonce request for a specific contract address
     Nonce(ContractAddress),
+    /// Identifies a class request for a specific class hash
     Class(ClassHash),
     // ... similar for other variants
 }

844-1131: Enhance test assertions for better clarity

The deduplication tests are comprehensive, but some assertions could be more descriptive.

Consider improving the assertion messages:

-        assert_eq!(stats, 1, "Backend should have 1 ongoing requests.");
+        assert_eq!(
+            stats,
+            1,
+            "Backend should have exactly 1 ongoing request after deduplicating identical requests"
+        );

Also, consider adding test cases for:

  1. Concurrent requests with different timing
  2. Error cases in deduplication
  3. Edge cases with request cancellation

743-778: Add error handling for mock RPC server

While the implementation works well for testing, it could benefit from more robust error handling.

Consider these improvements:

     pub fn start_mock_rpc_server(addr: String, response: String) -> SyncSender<()> {
         use tokio::runtime::Builder;
         let (tx, rx) = sync_channel::<()>(1);
 
         thread::spawn(move || {
             Builder::new_current_thread().enable_all().build().unwrap().block_on(async move {
-                let listener = TcpListener::bind(addr).await.unwrap();
+                let listener = TcpListener::bind(&addr).await.expect("Failed to bind mock server");
 
                 loop {
-                    let (mut socket, _) = listener.accept().await.unwrap();
+                    let (mut socket, _) = match listener.accept().await {
+                        Ok(socket) => socket,
+                        Err(e) => {
+                            error!(target: LOG_TARGET, %addr, %e, "Mock server accept failed");
+                            continue;
+                        }
+                    };
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fb4b08d and d6da703.

📒 Files selected for processing (1)
  • crates/katana/storage/provider/src/providers/fork/backend.rs (22 hunks)
🔇 Additional comments (4)
crates/katana/storage/provider/src/providers/fork/backend.rs (4)

39-46: Ohayo, sensei! The BackendResponse and BackendError changes look good!

The new BackendResponse enum provides a clean abstraction for different response types, and the Arc wrapping in BackendError enables efficient thread-safe error sharing.

Also applies to: 55-61


336-357: Ohayo, sensei! The response handling implementation is well done!

The code effectively manages multiple senders, uses structured logging for errors, and properly cleans up the deduplication map after processing.


1164-1209: Ohayo, sensei! Excellent test for result consistency!

The test effectively verifies that all deduplicated requests receive the same response, using proper synchronization and clear assertions.


690-695: Clean error handling implementation!

The function effectively converts provider-specific errors into the expected None responses, following Katana's provider API conventions.

Comment on lines +281 to +302
fn dedup_request(
&mut self,
req_key: BackendRequestIdentifier,
sender: OneshotSender<BackendResponse>,
rpc_call_future: BoxFuture<'static, BackendResponse>,
) {
if let Entry::Vacant(e) = self.request_dedup_map.entry(req_key) {
self.pending_requests.push((req_key, rpc_call_future));
e.insert(vec![sender]);
} else {
match self.request_dedup_map.get_mut(&req_key) {
Some(sender_vec) => {
sender_vec.push(sender);
}
None => {
// Log this and do nothing here, as this should never happen.
// If this does happen it is an unexpected bug.
error!(target: LOG_TARGET, "failed to get current request dedup vector");
}
}
}
}
coderabbitai bot:

🛠️ Refactor suggestion

Ohayo, sensei! Improve error handling in dedup_request

The deduplication logic is solid, but the error handling in the None case could be more robust.

Consider this improvement:

-                None => {
-                    // Log this and do nothing here, as this should never happen.
-                    // If this does happen it is an unexpected bug.
-                    error!(target: LOG_TARGET, "failed to get current request dedup vector");
-                }
+                None => {
+                    // This should never happen as we just checked the key exists
+                    error!(
+                        target: LOG_TARGET,
+                        key = ?req_key,
+                        "Bug: Failed to get request dedup vector for existing key"
+                    );
+                    // Recover by treating it as a new request
+                    self.pending_requests.push((req_key, rpc_call_future));
+                    self.request_dedup_map.insert(req_key, vec![sender]);
+                }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn dedup_request(
&mut self,
req_key: BackendRequestIdentifier,
sender: OneshotSender<BackendResponse>,
rpc_call_future: BoxFuture<'static, BackendResponse>,
) {
if let Entry::Vacant(e) = self.request_dedup_map.entry(req_key) {
self.pending_requests.push((req_key, rpc_call_future));
e.insert(vec![sender]);
} else {
match self.request_dedup_map.get_mut(&req_key) {
Some(sender_vec) => {
sender_vec.push(sender);
}
None => {
// Log this and do nothing here, as this should never happen.
// If this does happen it is an unexpected bug.
error!(target: LOG_TARGET, "failed to get current request dedup vector");
}
}
}
}
fn dedup_request(
&mut self,
req_key: BackendRequestIdentifier,
sender: OneshotSender<BackendResponse>,
rpc_call_future: BoxFuture<'static, BackendResponse>,
) {
if let Entry::Vacant(e) = self.request_dedup_map.entry(req_key) {
self.pending_requests.push((req_key, rpc_call_future));
e.insert(vec![sender]);
} else {
match self.request_dedup_map.get_mut(&req_key) {
Some(sender_vec) => {
sender_vec.push(sender);
}
None => {
// This should never happen as we just checked the key exists
error!(
target: LOG_TARGET,
key = ?req_key,
"Bug: Failed to get request dedup vector for existing key"
);
// Recover by treating it as a new request
self.pending_requests.push((req_key, rpc_call_future));
self.request_dedup_map.insert(req_key, vec![sender]);
}
}
}
}

@kariy kariy merged commit d67a5fd into dojoengine:main Dec 19, 2024
13 of 14 checks passed
@cwkang1998 cwkang1998 deleted the feat/katana-dedup branch December 19, 2024 18:27
Labels: contributor, katana (This issue is related to Katana)
Projects: None yet
Development: Successfully merging this pull request may close these issues: Dedup fork requests
3 participants