-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete files in multiple ranges at once #3431
Delete files in multiple ranges at once #3431
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good change, thanks :). couple more questions:
- do you mind updating
HISTORY.md
to mention this new public API? - do you have any numbers to share about how long deleting files in ranges took before/after this change?
include/rocksdb/db.h
Outdated
@@ -100,6 +100,14 @@ struct Range { | |||
Range(const Slice& s, const Slice& l) : start(s), limit(l) { } | |||
}; | |||
|
|||
struct RangePtr { | |||
const Slice* start; // Included in the range | |||
const Slice* limit; // Not included in the range |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my understanding of GetCleanInputsWithinInterval
is that it'll include the upper endpoint -- do you mind double checking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake, I just copy it from struct Range
. But we are actually intent on supporting exclusive range end in DeleteFilesInRange/DeleteFilesInRanges
. Do you think it is OK to add that support in this PR, and make the range end exclusive by default in the new API DeleteFilesInRanges
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add support in this PR sounds fine. But I think it'd be more consistent to make it inclusive-inclusive, with an option to make the limit exclusive. This PR (#2863) did a good job of adding the option for DeleteFilesInRange
, but unfortunately we didn't merge it due to an unsolved test failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's delete // Not included in the range
since it's optional
include/rocksdb/convenience.h
Outdated
@@ -330,6 +330,8 @@ void CancelAllBackgroundWork(DB* db, bool wait = false); | |||
// Snapshots before the delete might not see the data in the given range. | |||
Status DeleteFilesInRange(DB* db, ColumnFamilyHandle* column_family, | |||
const Slice* begin, const Slice* end); | |||
Status DeleteFilesInRange(DB* db, ColumnFamilyHandle* column_family, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we call the new API DeleteFilesInRanges
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll share some numbers later :)
include/rocksdb/convenience.h
Outdated
@@ -330,6 +330,8 @@ void CancelAllBackgroundWork(DB* db, bool wait = false); | |||
// Snapshots before the delete might not see the data in the given range. | |||
Status DeleteFilesInRange(DB* db, ColumnFamilyHandle* column_family, | |||
const Slice* begin, const Slice* end); | |||
Status DeleteFilesInRange(DB* db, ColumnFamilyHandle* column_family, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
include/rocksdb/db.h
Outdated
@@ -100,6 +100,14 @@ struct Range { | |||
Range(const Slice& s, const Slice& l) : start(s), limit(l) { } | |||
}; | |||
|
|||
struct RangePtr { | |||
const Slice* start; // Included in the range | |||
const Slice* limit; // Not included in the range |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake, I just copy it from struct Range
. But we are actually intent on supporting exclusive range end in DeleteFilesInRange/DeleteFilesInRanges
. Do you think it is OK to add that support in this PR, and make the range end exclusive by default in the new API DeleteFilesInRanges
?
Using DeleteFilesInRange to delete files in a lot of ranges can be slow, because VersionSet::LogAndApply is expensive. This PR adds a new DeleteFilesInRange function to delete files in multiple ranges at once.
@huachaohuang has updated the pull request. |
@huachaohuang has updated the pull request. View: changes |
Calling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
include/rocksdb/db.h
Outdated
@@ -100,6 +100,14 @@ struct Range { | |||
Range(const Slice& s, const Slice& l) : start(s), limit(l) { } | |||
}; | |||
|
|||
struct RangePtr { | |||
const Slice* start; // Included in the range | |||
const Slice* limit; // Not included in the range |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's delete // Not included in the range
since it's optional
remove comments about range inclusion vs exclusion since it should be specified in the APIs.
@huachaohuang has updated the pull request. View: changes, changes since last import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajkr is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Using `DeleteFilesInRange` to delete files in a lot of ranges can be slow, because `VersionSet::LogAndApply` is expensive. This PR adds a new `DeleteFilesInRange` function to delete files in multiple ranges at once. Close facebook#2951 Closes facebook#3431 Differential Revision: D6849228 Pulled By: ajkr fbshipit-source-id: daeedcabd8def4b1d9ee95a58266dee77b5d68cb
Summary: Using `DeleteFilesInRange` to delete files in a lot of ranges can be slow, because `VersionSet::LogAndApply` is expensive. This PR adds a new `DeleteFilesInRange` function to delete files in multiple ranges at once. Close facebook#2951 Closes facebook#3431 Differential Revision: D6849228 Pulled By: ajkr fbshipit-source-id: daeedcabd8def4b1d9ee95a58266dee77b5d68cb
Using
DeleteFilesInRange
to delete files in a lot of ranges can be slow, becauseVersionSet::LogAndApply
is expensive.This PR adds a new
DeleteFilesInRange
function to delete files in multipleranges at once.
Close #2951