
feat(logstore): provide more opportunities to read #20546

Open · wants to merge 10 commits into main from kwannoel/pass-chunks
Conversation

@kwannoel (Contributor) commented Feb 20, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Make the writer future sleep briefly after:

  1. Yielding a barrier. This gives the read future a brief chance to read some records.
  2. The chunk buffer becomes 90% full. This gives the read future a brief chance to read some records, and avoids a logstore flush.
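
A minimal sketch of the writer-side pause, assuming a tokio runtime (the function name, parameters, and the 10 ms pause are illustrative, not the actual executor code):

use std::time::Duration;
use tokio::time::sleep;

// Illustrative sketch only; `buffer_len`, `buffer_size`, and the 10 ms pause
// are hypothetical names/values, not the real executor configuration.
async fn maybe_pause_writer(yielded_barrier: bool, buffer_len: usize, buffer_size: usize) {
    const PAUSE: Duration = Duration::from_millis(10);
    // After yielding a barrier, give the read future a brief window to read records.
    if yielded_barrier {
        sleep(PAUSE).await;
    }
    // When the buffer is ~90% full, pause so the reader can drain some records
    // before the writer is forced to flush to the logstore.
    if buffer_len >= buffer_size * 9 / 10 {
        sleep(PAUSE).await;
    }
}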

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

  • My PR needs documentation updates.
Release note

This stack of pull requests is managed by Graphite.

@kwannoel changed the title from "add paused variant" to "feat(logstore): provide more opportunities to read" on Feb 20, 2025
@kwannoel marked this pull request as ready for review on February 20, 2025 08:44
@kwannoel requested a review from wenym1 on February 20, 2025 08:44
@@ -144,6 +150,11 @@ struct FlushedChunkInfo {
}

enum WriteFuture<S: LocalStateStore> {
    Paused {
        duration: Duration,
Contributor
We should store the sleep future instead of the duration. The sleep future is created in the next_event call and only stored in the future returned from next_event. However, the next_event future can easily be dropped because of select!, and every time the read future gets ready, the next_event future is created again and the sleep timer is reset. Under these circumstances, the write future will easily be starved by the read future.
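
A rough sketch of the difference (the enum below is illustrative, not the real WriteFuture): storing the Sleep keeps the deadline across drops of the next_event future, while storing only a Duration restarts the timer every time.

use std::pin::Pin;
use std::time::Duration;
use tokio::time::Sleep;

// Illustrative sketch only, not the actual WriteFuture definition.
enum PauseState {
    // A Duration restarts from zero whenever next_event is recreated,
    // so the write future may never finish its pause under select!.
    ByDuration(Duration),
    // A stored Sleep keeps its deadline across drops of the next_event
    // future, so the elapsed time is not reset.
    BySleep(Pin<Box<Sleep>>),
}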

);
// If buffer 90% full, pause the stream for a while, let downstream do some processing
// to avoid flushing.
if buffer.buffer.len() >= self.buffer_size * 9 / 10 {
Contributor

We'd better determine whether to pause based on the number of rows rather than the item count.

More importantly, we shouldn't always pause the write future when the buffer is almost full, because that way slowness on the reader side will always block the writer, and the upstream and downstream won't be decoupled.

In my rough design, we should only pause when we transition from the clean in-memory state to the flushed state. The clean in-memory state is when no pending data is in storage and all chunks can be retrieved from the buffer without reading from storage; the flushed state is every other circumstance.

More specifically, we may pause for a while only when we were previously in the clean in-memory state, and in either of the following scenarios:

  • when we receive a chunk and will write a chunk to storage
  • when we receive a checkpoint barrier and are going to write all unflushed chunks to storage.

When this happens, we may store the item we received when we paused, and then after the pause sleep, re-apply the item to the buffer and storage.

And when we are not in the clean in-memory state, we don't have to pause.
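
A rough sketch of the two states as described above (the names are made up for illustration):

// Illustrative sketch of the state distinction described above.
enum LogStoreBufferState {
    // No pending data in storage; every chunk can be served from the
    // in-memory buffer without reading from storage.
    CleanInMemory,
    // Some chunks have been flushed and must be read back from storage
    // before the buffer is clean again.
    Flushed,
}

Pausing would then happen only on the CleanInMemory -> Flushed transition, so a slow reader never blocks the writer once the state is already dirty.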

@kwannoel (Contributor Author) commented Feb 25, 2025

Quoting the above: "The clean in-memory state is when no pending data is in storage and all chunks can be retrieved from the buffer without reading from storage; the flushed state is every other circumstance."

Seems like we also have to consider the case where there's historical data being read from the logstore (via read_persisted_log_store), on top of reading from the flushed chunk future.

Not sure if there's a simple way to check, on recovery, whether the logstore has data from the previous checkpointed epoch.

@kwannoel force-pushed the kwannoel/pass-chunks branch 2 times, most recently from af96993 to 8f8cd46, on February 27, 2025 01:55
@kwannoel requested a review from wenym1 on February 27, 2025 02:06
..
} => {
sleep_future.await;
let (opt, stream) = future.await;
Contributor

When we have passed sleep_future.await and get pending at future.await, if this next_event future is dropped and recreated, we will still be at WriteFuture::Paused, and then we will poll the sleep_future again. However, the sleep_future has already completed, and the behavior of polling a completed future is undefined: it may be pending forever, or it may panic.
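
One possible way to avoid re-polling a completed timer (a sketch under assumptions, not necessarily the fix used here) is to keep the sleep in an Option and clear it once it has elapsed:

use std::pin::Pin;
use tokio::time::Sleep;

// Illustrative sketch: keep the pause in an Option so that
// (a) a dropped-and-recreated next_event future resumes the same timer, and
// (b) once the timer has elapsed it is cleared and never polled again.
async fn finish_pause(pause: &mut Option<Pin<Box<Sleep>>>) {
    if let Some(sleep) = pause.as_mut() {
        sleep.await;
        *pause = None;
    }
    // ...then await the inner write future as before...
}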

stream,
write_state,
);
// If buffer full, pause the stream for a while, let downstream do some processing
Contributor

If we are implementing the idea proposed in #20546 (comment), we should not always set the write future to paused when we have flushed data, but only pause once when we transition from the clean state to the flushed state.

The general logic should be: when we are handling a message that is going to trigger a flush to storage (either a chunk when the buffer is full, or a checkpoint barrier when the buffer has unflushed chunks), if the state is currently clean, we may pause for a while and then reapply the message anyway.

The pause is always triggered by an upstream message. Therefore, we can store the upstream message in WriteFuture::Paused, and then when the sleep finishes, we can yield this upstream message to reapply it.

The handling logic should look roughly like:

let mut clean_state = /* check initial clean state */;
select! {
    message = write_future => {
        match message {
            barrier => {
                if clean_state && /* going to flush some unflushed chunk */ {
                    write_future = paused(barrier);
                    clean_state = false;
                } else {
                    // ...handle the barrier
                }
            }
            chunk => {
                if clean_state && /* buffer full */ {
                    write_future = paused(chunk);
                    clean_state = false;
                } else {
                    // ...handle the chunk
                }
            }
        }
    }
    chunk = read_future => {
        if !clean_state && /* now the state is clean */ {
            clean_state = true;
        }
        // ...
    }
}
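
A slightly more concrete, still hypothetical Rust shape for the paused variant described above (all type names are stand-ins, not the executor's real types):

use std::pin::Pin;
use tokio::time::Sleep;

// Hypothetical stand-ins for the executor's real message types.
struct Barrier;
struct StreamChunk;

enum UpstreamItem {
    Barrier(Barrier),
    Chunk(StreamChunk),
}

// Sketch of a Paused variant that owns both the timer and the upstream
// message that triggered the pause, so the message can be re-applied
// (yielded again) once the sleep finishes.
enum WriteFutureSketch {
    Paused {
        sleep: Pin<Box<Sleep>>,
        pending: UpstreamItem,
    },
    // ...other variants (receiving from upstream, flushing, etc.)...
    Receiving,
}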

@xxchan requested a review from Copilot on February 27, 2025 05:55

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@kwannoel force-pushed the kwannoel/pass-chunks branch from 857ff65 to c588e6f on February 27, 2025 16:22
@kwannoel (Contributor Author) commented:
I should have fixed the comments; I'll have a second look tomorrow.

I also added another read state, to try to fetch a single chunk, so we know whether we are in a dirty or clean state initially. c588e6f.
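
For illustration, such an initial probe could look roughly like this (the function name, generics, and stream shape are assumptions, not the real read_persisted_log_store API):

use futures::StreamExt;

// Hypothetical sketch: probe the persisted log store stream once on recovery.
// `None` means we can start in the clean in-memory state; `Some(chunk)` means
// we start dirty, and the probed chunk must be kept and replayed, not dropped.
async fn probe_initial_state<T, E, S>(mut persisted: S) -> Result<Option<T>, E>
where
    S: futures::Stream<Item = Result<T, E>> + Unpin,
{
    persisted.next().await.transpose()
}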

@kwannoel (Contributor Author) commented Feb 28, 2025

Seems I encountered some other bug in unaligned-join. Investigating
