-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
internal/manifest: maintain BlobFileMetadata references #4341
Conversation
Add reference counting on TableMetadata. A TableMetadata with a positive reference count maintains a reference on its FileBacking. This layer of reference counting is in preparation for blob files. A TableMetadata with a positive reference count will also maintain a positive reference on all referenced blob files. An alternative would be to reference and dereference a TableMetadata's blob files directly everywhere we reference its FileBacking today. We expect tables to have many blob references, and this would introduce a large number of atomic increments and decrements during LevelMetadata B-Tree operations. With this change, the TableMetadata provides an interstitial reference count, and we only need to update the blob reference counts when the TableMetadata's reference count falls to zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)
-- commits
line 6 at r1:
[nit] It might be cleaner if you moved the refcount to a BlobReferences
type that contains that ref count and the slice.
TFTR! |
I decided the O(# sstables) additional heap objects weren’t worth it, and it’s still conceptually clean to ref count the tables themselves |
Sure, though the struct I suggested would be a field, not a separate object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 4 files at r1.
Reviewable status: 1 of 8 files reviewed, 2 unresolved discussions (waiting on @jbowens and @RaduBerinde)
internal/manifest/version.go
line 246 at r1 (raw file):
// released. // // The reference count is the number of versions with non-zero reference
Is this really the number of versions, or the number of distinct nodes that hold a reference?
I am trying to understand the choice to add two levels of ref counting, versus continuing to do what we currently do by reaching into FileBacking
in node.decRef
and decrementing that ref count. The prototype in blobt2 had code in Version.unrefFiles
for that. Was the choice in this PR motivated by having cleaner abstraction boundaries in that the btree would simply call TableMetadata.Unref
? Hmm, maybe not, because we could have TableMetadata.Unref
only do unreffing of the backing and blob references and add them to obsoleteFiles
.
Could you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 8 files reviewed, 2 unresolved discussions (waiting on @jbowens and @RaduBerinde)
internal/manifest/version.go
line 246 at r1 (raw file):
Previously, sumeerbhola wrote…
Is this really the number of versions, or the number of distinct nodes that hold a reference?
I am trying to understand the choice to add two levels of ref counting, versus continuing to do what we currently do by reaching into
FileBacking
innode.decRef
and decrementing that ref count. The prototype in blobt2 had code inVersion.unrefFiles
for that. Was the choice in this PR motivated by having cleaner abstraction boundaries in that the btree would simply callTableMetadata.Unref
? Hmm, maybe not, because we could haveTableMetadata.Unref
only do unreffing of the backing and blob references and add them toobsoleteFiles
.Could you elaborate?
Never mind the question about two levels of ref counting. The blobt2 branch preceded virtual ssts, so it was relying on the ref in the FileMetadata
. It seems awkward to do refs of blob references when cloning a node, like we do now with the FileBacking
. This seems cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 6 files at r2.
Reviewable status: 5 of 8 files reviewed, 4 unresolved discussions (waiting on @jbowens and @RaduBerinde)
internal/manifest/version_edit.go
line 925 at r2 (raw file):
// Deleted holds a list of all blob files that became unreferenced by // any sstables, making them obsolete within the resulting version (a // zombie if still referenced by previous versions).
is there an invariant that Added intersection Deleted is empty?
internal/manifest/version_edit.go
line 933 at r2 (raw file):
// to a file introduced in VersionEdit.AddedBlobFiles or a file // referenced by a deleted sstable. DeletedReferences map[base.DiskFileNum]*BlobFileMetadata
Is this so that we can make BlobReference.Metadata
non-nil in BulkVersionEdit.Accumulate
when opening the DB?
I think the prototype does this fixing in BulkVersionEdit.Apply
and it was probably a quick decision without much thought. I definitely didn't think through the fact that we have the information in the VersionEdit.
btw, I didn't read the code in version_edit.go carefully since @RaduBerinde already reviewed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Reviewable status: 5 of 8 files reviewed, 4 unresolved discussions (waiting on @RaduBerinde and @sumeerbhola)
Previously, RaduBerinde wrote…
[nit] It might be cleaner if you moved the refcount to a
BlobReferences
type that contains that ref count and the slice.
I decided that O(sstables) additional objects on the heap wasn’t worth it
internal/manifest/version.go
line 246 at r1 (raw file):
Previously, sumeerbhola wrote…
Never mind the question about two levels of ref counting. The blobt2 branch preceded virtual ssts, so it was relying on the ref in the
FileMetadata
. It seems awkward to do refs of blob references when cloning a node, like we do now with theFileBacking
. This seems cleaner.
Done.
internal/manifest/version_edit.go
line 925 at r2 (raw file):
Previously, sumeerbhola wrote…
is there an invariant that Added intersection Deleted is empty?
Yeah, documented
internal/manifest/version_edit.go
line 933 at r2 (raw file):
Previously, sumeerbhola wrote…
Is this so that we can make
BlobReference.Metadata
non-nil inBulkVersionEdit.Accumulate
when opening the DB?I think the prototype does this fixing in
BulkVersionEdit.Apply
and it was probably a quick decision without much thought. I definitely didn't think through the fact that we have the information in the VersionEdit.btw, I didn't read the code in version_edit.go carefully since @RaduBerinde already reviewed it.
Yeah, that's right
Extend the BlobReference type with a pointer to a shared BlobFileMetadata struct describing the blob file. Apply reference counting to these BlobFileMetadata to track the lifecycle of a blob file. Informs cockroachdb#112.
Extend the BlobReference type with a pointer to a shared BlobFileMetadata
struct describing the blob file. Apply reference counting to these
BlobFileMetadata to track the lifecycle of a blob file.
Informs #112.