-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add time-based cleanup for Maven snapshot versions #33420
base: main
Are you sure you want to change the base?
Conversation
Considered Alternative ApproachI considered another way to handle Maven snapshot cleanup that uses the This method would need a database migration to work, so it's a bit more complicated. I have the code ready for this alternative if we think it's worth the extra steps later on. |
Note to any reviewer: Review this PR carefully. |
A remark: Yes, it's true that I used ChatGPT for help since I'm not an experienced Go developer, but I handled the core logic and testing myself. I'd appreciate any comments or critiques to help us improve this, as we really need the feature :)
|
@@ -101,7 +103,7 @@ func loadPackagesFrom(rootCfg ConfigProvider) (err error) { | |||
Packages.LimitSizeRubyGems = mustBytes(sec, "LIMIT_SIZE_RUBYGEMS") | |||
Packages.LimitSizeSwift = mustBytes(sec, "LIMIT_SIZE_SWIFT") | |||
Packages.LimitSizeVagrant = mustBytes(sec, "LIMIT_SIZE_VAGRANT") | |||
Packages.DefaultRPMSignEnabled = sec.Key("DEFAULT_RPM_SIGN_ENABLED").MustBool(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line seems to have been deleted accidentally
var metadata MavenMetadata | ||
|
||
dec := xml.NewDecoder(r) | ||
dec.CharsetReader = charset.NewReaderLabel // Assuming charset.NewReaderLabel is a function you've set up to handle character encoding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Erm...
I mean, it should probably work if I'm reading the documentation correctly, but the comment seems really out of place
dec.CharsetReader = charset.NewReaderLabel // Assuming charset.NewReaderLabel is a function you've set up to handle character encoding. | |
dec.CharsetReader = charset.NewReaderLabel |
@@ -109,3 +131,20 @@ func ParsePackageMetaData(r io.Reader) (*Metadata, error) { | |||
Dependencies: dependencies, | |||
}, nil | |||
} | |||
|
|||
// ParseMavenMetadata parses the Maven metadata XML to extract the build number. | |||
func ParseMavenMetaData(r io.Reader) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thoroughly confused by this method signature and docs: Nothing indicates to me that this method parses the entire XML, and throws the result except for the build number away.
} | ||
} | ||
|
||
log.Info("Filtered %d files out of %d total files for version ID %d with maxBuildNumber %d", len(filteredFiles), len(files), versionID, maxBuildNumber) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm… I do not like seeing log statements inside models/
.
That should be the job of the services/
layer.
Thus, what about adding an extra return value (skippedFiles
) that contains all skipped files?
} | ||
} | ||
|
||
log.Info("Filtered %d files out of %d total files for version ID %d with maxBuildNumber %d", len(filteredFiles), len(files), versionID, maxBuildNumber) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Info
logs are rare within Gitea.
At most, gather statistics inside a success message after successfully running the cron job, or downgrade it to debug.
} | ||
|
||
func isSnapshotVersion(version string) bool { | ||
return strings.Contains(version, "-SNAPSHOT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it even HasSuffix
?
@@ -226,6 +229,62 @@ func HasFiles(ctx context.Context, opts *PackageFileSearchOptions) (bool, error) | |||
return db.Exist[PackageFile](ctx, opts.toConds()) | |||
} | |||
|
|||
// GetFilesByBuildNumber retrieves all files for a package version with build numbers <= maxBuildNumber. | |||
func GetFilesByBuildNumber(ctx context.Context, versionID int64, maxBuildNumber int) ([]*PackageFile, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that rather GetFilesBelowBuildNumber
instead?
|
||
buildNumber, err := strconv.Atoi(buildNumberStr) | ||
if err != nil { | ||
return 0, fmt.Errorf("failed to convert build number to integer: '%s'", buildNumberStr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return 0, fmt.Errorf("failed to convert build number to integer: '%s'", buildNumberStr) | |
return 0, fmt.Errorf("failed to convert maven package build number to integer: '%s'", buildNumberStr) |
|
||
thresholdBuildNumber := maxBuildNumber - retainBuilds | ||
if thresholdBuildNumber <= 0 { | ||
log.Info("No files to clean up, as the threshold build number is less than or equal to zero for versionID %d", versionID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the error messages and logs in general, I recommend explicitly stating during what sort of action they occurred.
When you actually see these logs, it will be hard to guess that they belong to the Maven snapshot cleanup otherwise, especially when a request happens in the meantime.
return nil | ||
} | ||
|
||
func extractMaxBuildNumberFromMetadata(ctx context.Context, metadataFile *packages.PackageFile) (int, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't you define this method in models
already?
Pull Request: Implement Cleanup Function for Maven Snapshot Versions
Overview
This pull request introduces a cleanup function for Maven snapshot versions in Gitea, enabling more efficient management of package storage. The new feature allows users to specify how many of the most recent Maven snapshot builds to retain, optimizing storage by automatically removing older files.
Features
RETAIN_MAVEN_SNAPSHOT_BUILDS
allows setting the number of Maven snapshot builds to keep.-1
, which keeps all builds.Implementation
The feature extends the existing cleanup job to include files from Maven snapshot versions based on the specified retention policy set in
app.ini
. It checks against the highest build number from the Maven metadata file to determine which files to retain.Impact
This enhancement helps manage disk space more effectively by providing control over how many builds of Maven snapshots are retained, potentially reducing storage requirements for projects using Maven within Gitea.