[Pull-based Ingestion] Support segment replication for pull-based ingestion #17359

Merged 4 commits on Feb 27, 2025


@varunbharadwaj varunbharadwaj commented Feb 14, 2025

Description

This PR is a follow-up for pull-based ingestion that adds support for segment replication with remote store. The primary shard ingests from the streaming source, and replica shards rely on segment replication.

This PR refactors IngestionEngine to inherit from InternalEngine in order to support replication and recovery and to avoid duplicate code. Supporting segRep and peer recovery requires IngestionEngine to include the required listeners, work with NRTReplicationEngine, track the latest index commits, and prevent snapshotted index deletion, among other changes. This functionality is already available in InternalEngine and can be reused by IngestionEngine after this change.

Integration tests are added to validate end-to-end pull-based ingestion with segment replication, peer recovery, replica promotion, and remote store.
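
The split described above, where only the primary reads from the stream and replicas adopt finished segments, together with reuse by inheritance, can be sketched roughly as follows. This is a toy model and not OpenSearch code: `SegRepSketch`, `BaseEngineSketch`, `PullIngestionEngineSketch`, `ReplicaEngineSketch`, and all of their methods are hypothetical names chosen for illustration, and segments are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only. Every class and method here is hypothetical and is NOT
// the real InternalEngine, IngestionEngine, or NRTReplicationEngine API.
class SegRepSketch {

    /** Stand-in for the existing engine: buffers documents and flushes them into segments. */
    static class BaseEngineSketch {
        final List<String> segments = new ArrayList<>();
        private int pendingDocs = 0;

        void index(String doc) {
            pendingDocs++; // a real engine would write the document; here we only count it
        }

        /** Shared flush logic that a subclass inherits instead of duplicating. */
        String flush() {
            String segment = "segment_" + segments.size() + "(" + pendingDocs + " docs)";
            segments.add(segment);
            pendingDocs = 0;
            return segment;
        }
    }

    /** Primary-side engine: adds only the pull step and reuses the inherited flush. */
    static class PullIngestionEngineSketch extends BaseEngineSketch {
        void poll(List<String> batchFromStream) {
            for (String doc : batchFromStream) {
                index(doc);
            }
        }
    }

    /** Replica-side engine: never reads the stream, only adopts segments copied from the primary. */
    static class ReplicaEngineSketch {
        final List<String> segments = new ArrayList<>();

        void onSegmentsReplicated(List<String> primarySegments) {
            segments.clear();
            segments.addAll(primarySegments);
        }
    }

    public static void main(String[] args) {
        PullIngestionEngineSketch primary = new PullIngestionEngineSketch();
        primary.poll(List.of("doc-1", "doc-2", "doc-3"));
        primary.flush();

        ReplicaEngineSketch replica = new ReplicaEngineSketch();
        replica.onSegmentsReplicated(primary.segments);
        System.out.println(replica.segments);
    }
}
```

In this sketch the pull-based subclass contains no flush or commit logic of its own, which is the kind of duplication the refactor is meant to avoid; the actual engines involve far more state than is modeled here.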

Related Issues

Resolves #16929

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing labels Feb 14, 2025

❌ Gradle check result for 7a682ef: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for bc9716c: FAILURE

@varunbharadwaj varunbharadwaj force-pushed the vb/segrep branch 2 times, most recently from b758603 to a7c7a99 Compare February 21, 2025 04:05
❌ Gradle check result for a7c7a99: FAILURE

❌ Gradle check result for fae7e91: FAILURE

❌ Gradle check result for 22464ed: FAILURE

@andrross andrross left a comment

Looks like there are some build failures here. FYI you should be able to run ./gradlew precommit locally to find these before pushing your commit.


✅ Gradle check result for f56f90f: SUCCESS

codecov bot commented Feb 26, 2025

Codecov Report

Attention: Patch coverage is 83.01887% with 9 lines in your changes missing coverage. Please review.

Project coverage is 72.53%. Comparing base (0ffed5e) to head (c5d7445).
Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...opensearch/index/translog/NoOpTranslogManager.java | 73.33% | 3 Missing and 1 partial ⚠️ |
| ...a/org/opensearch/index/engine/IngestionEngine.java | 86.36% | 1 Missing and 2 partials ⚠️ |
| .../indices/pollingingest/IngestionEngineFactory.java | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17359      +/-   ##
============================================
+ Coverage     72.42%   72.53%   +0.11%     
- Complexity    65611    65675      +64     
============================================
  Files          5304     5304              
  Lines        304743   304464     -279     
  Branches      44189    44145      -44     
============================================
+ Hits         220701   220840     +139     
+ Misses        65888    65495     -393     
+ Partials      18154    18129      -25     


❌ Gradle check result for bb213a7: FAILURE

…ecovery

Signed-off-by: Varun Bharadwaj <varunbharadwaj1995@gmail.com>

✅ Gradle check result for c5d7445: SUCCESS


yupeng9 commented Feb 27, 2025

LGTM

@msfroh msfroh left a comment

This looks pretty reasonable to me as a first step. In the long run, I think I'd like to extract an abstract base class for both InternalEngine and IngestionEngine, but I think that belongs in another PR, since this one's already big enough.

Thanks, @varunbharadwaj!
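
For readers following the thread, the suggested longer-term shape (a shared abstract parent instead of one engine extending the other) might look something like the sketch below. It assumes nothing about the eventual design: `EngineBaseSketch`, `AbstractEngineSketch`, `PushEngineSketch`, and `PullEngineSketch` are hypothetical names, and the commit counter is a stand-in for whatever logic would really be shared.

```java
// Hypothetical sketch; these names are illustrative and not part of OpenSearch.
class EngineBaseSketch {

    /** Shared state and behavior live in one abstract parent. */
    abstract static class AbstractEngineSketch {
        private long commitCount = 0;

        /** Common commit bookkeeping, written once for both engines. */
        final long commit() {
            commitCount++;
            return commitCount;
        }

        /** Each engine only states where its documents come from. */
        abstract String documentSource();
    }

    /** Stand-in for an engine fed by indexing requests. */
    static class PushEngineSketch extends AbstractEngineSketch {
        @Override
        String documentSource() {
            return "indexing requests";
        }
    }

    /** Stand-in for an engine that pulls from a stream. */
    static class PullEngineSketch extends AbstractEngineSketch {
        @Override
        String documentSource() {
            return "streaming source";
        }
    }

    public static void main(String[] args) {
        AbstractEngineSketch[] engines = { new PushEngineSketch(), new PullEngineSketch() };
        for (AbstractEngineSketch engine : engines) {
            System.out.println(engine.documentSource() + " -> commit #" + engine.commit());
        }
    }
}
```

Compared with direct inheritance, neither sibling here inherits behavior it does not need, which is presumably the motivation for deferring this to a separate refactoring PR.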

@varunbharadwaj
Contributor Author

> This looks pretty reasonable to me as a first step. In the long run, I think I'd like to extract an abstract base class for both InternalEngine and IngestionEngine, but I think that belongs in another PR, since this one's already big enough.
>
> Thanks, @varunbharadwaj!

Thanks for reviewing. Yeah, I will create a follow-up PR for the refactoring.

@mch2 mch2 left a comment

minor comment, lgtm!

@mch2 mch2 merged commit 415abb9 into opensearch-project:main Feb 27, 2025
38 of 39 checks passed
Labels
enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] A new IngestionEngine that can pull data from streaming sources.
6 participants