Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/manifest: maintain BlobFileMetadata references #4341

Merged
merged 2 commits into from
Feb 26, 2025

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Feb 13, 2025

Extend the BlobReference type with a pointer to a shared BlobFileMetadata
struct describing the blob file. Apply reference counting to these
BlobFileMetadata to track the lifecycle of a blob file.

Informs #112.

Add reference counting on TableMetadata. A TableMetadata with a positive
reference count maintains a reference on its FileBacking. This layer of
reference counting is in preparation for blob files. A TableMetadata with a
positive reference count will also maintain a positive reference on all
referenced blob files.

An alternative would be to reference and dereference a TableMetadata's blob
files directly everywhere we reference its FileBacking today. We expect tables
to have many blob references, and this would introduce a large number of atomic
increments and decrements during LevelMetadata B-Tree operations. With this
change, the TableMetadata provides an interstitial reference count, and we only
need to update the blob reference counts when the TableMetadata's reference
count falls to zero.
@jbowens jbowens requested a review from a team as a code owner February 13, 2025 22:40
@jbowens jbowens requested a review from sumeerbhola February 13, 2025 22:40
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 8 files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)


-- commits line 6 at r1:
[nit] It might be cleaner if you moved the refcount to a BlobReferences type that contains that ref count and the slice.

@jbowens
Copy link
Collaborator Author

jbowens commented Feb 14, 2025

TFTR!

@jbowens
Copy link
Collaborator Author

jbowens commented Feb 14, 2025

I decided the O(# sstables) additional heap objects weren’t worth it, and it’s still conceptually clean to ref count the tables themselves

@RaduBerinde
Copy link
Member

Sure, though the struct I suggested would be a field, not a separate object.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 4 files at r1.
Reviewable status: 1 of 8 files reviewed, 2 unresolved discussions (waiting on @jbowens and @RaduBerinde)


internal/manifest/version.go line 246 at r1 (raw file):

	// released.
	//
	// The reference count is the number of versions with non-zero reference

Is this really the number of versions, or the number of distinct nodes that hold a reference?

I am trying to understand the choice to add two levels of ref counting, versus continuing to do what we currently do by reaching into FileBacking in node.decRef and decrementing that ref count. The prototype in blobt2 had code in Version.unrefFiles for that. Was the choice in this PR motivated by having cleaner abstraction boundaries in that the btree would simply call TableMetadata.Unref? Hmm, maybe not, because we could have TableMetadata.Unref only do unreffing of the backing and blob references and add them to obsoleteFiles.

Could you elaborate?

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 8 files reviewed, 2 unresolved discussions (waiting on @jbowens and @RaduBerinde)


internal/manifest/version.go line 246 at r1 (raw file):

Previously, sumeerbhola wrote…

Is this really the number of versions, or the number of distinct nodes that hold a reference?

I am trying to understand the choice to add two levels of ref counting, versus continuing to do what we currently do by reaching into FileBacking in node.decRef and decrementing that ref count. The prototype in blobt2 had code in Version.unrefFiles for that. Was the choice in this PR motivated by having cleaner abstraction boundaries in that the btree would simply call TableMetadata.Unref? Hmm, maybe not, because we could have TableMetadata.Unref only do unreffing of the backing and blob references and add them to obsoleteFiles.

Could you elaborate?

Never mind the question about two levels of ref counting. The blobt2 branch preceded virtual ssts, so it was relying on the ref in the FileMetadata. It seems awkward to do refs of blob references when cloning a node, like we do now with the FileBacking. This seems cleaner.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 6 files at r2.
Reviewable status: 5 of 8 files reviewed, 4 unresolved discussions (waiting on @jbowens and @RaduBerinde)


internal/manifest/version_edit.go line 925 at r2 (raw file):

		// Deleted holds a list of all blob files that became unreferenced by
		// any sstables, making them obsolete within the resulting version (a
		// zombie if still referenced by previous versions).

is there an invariant that Added intersection Deleted is empty?


internal/manifest/version_edit.go line 933 at r2 (raw file):

		// to a file introduced in VersionEdit.AddedBlobFiles or a file
		// referenced by a deleted sstable.
		DeletedReferences map[base.DiskFileNum]*BlobFileMetadata

Is this so that we can make BlobReference.Metadata non-nil in BulkVersionEdit.Accumulate when opening the DB?

I think the prototype does this fixing in BulkVersionEdit.Apply and it was probably a quick decision without much thought. I definitely didn't think through the fact that we have the information in the VersionEdit.

btw, I didn't read the code in version_edit.go carefully since @RaduBerinde already reviewed it.

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

Reviewable status: 5 of 8 files reviewed, 4 unresolved discussions (waiting on @RaduBerinde and @sumeerbhola)


-- commits line 6 at r1:

Previously, RaduBerinde wrote…

[nit] It might be cleaner if you moved the refcount to a BlobReferences type that contains that ref count and the slice.

I decided that O(sstables) additional objects on the heap wasn’t worth it


internal/manifest/version.go line 246 at r1 (raw file):

Previously, sumeerbhola wrote…

Never mind the question about two levels of ref counting. The blobt2 branch preceded virtual ssts, so it was relying on the ref in the FileMetadata. It seems awkward to do refs of blob references when cloning a node, like we do now with the FileBacking. This seems cleaner.

Done.


internal/manifest/version_edit.go line 925 at r2 (raw file):

Previously, sumeerbhola wrote…

is there an invariant that Added intersection Deleted is empty?

Yeah, documented


internal/manifest/version_edit.go line 933 at r2 (raw file):

Previously, sumeerbhola wrote…

Is this so that we can make BlobReference.Metadata non-nil in BulkVersionEdit.Accumulate when opening the DB?

I think the prototype does this fixing in BulkVersionEdit.Apply and it was probably a quick decision without much thought. I definitely didn't think through the fact that we have the information in the VersionEdit.

btw, I didn't read the code in version_edit.go carefully since @RaduBerinde already reviewed it.

Yeah, that's right

Extend the BlobReference type with a pointer to a shared BlobFileMetadata
struct describing the blob file. Apply reference counting to these
BlobFileMetadata to track the lifecycle of a blob file.

Informs cockroachdb#112.
@jbowens jbowens merged commit c06cd74 into cockroachdb:master Feb 26, 2025
23 checks passed
@jbowens jbowens deleted the blob-meta branch February 26, 2025 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants