Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persist updates between server restarts #1903

Merged
merged 23 commits into from
Mar 31, 2025

Conversation

RobinTF
Copy link
Collaborator

@RobinTF RobinTF commented Mar 26, 2025

So far, delta triples due to updates were lost when the server was restarted. This change introduces an option --persist-updates, which has the following effect:

  1. After each update, the complete set of delta triples (inserted and deleted) as well as the associated local vocabulary is written to disk, to a file <name>.update-triples, where <name> is the basename of the index files.

  2. When restarting the server, that file is read and the update triples are processed in such a way that the server ends up in the same state it was in when that file was written.

  3. When restarting the server without the option --persist-updates or if there is no file <name>.update-triples, the server is in the same state it would be in if there are no update triples (or after the command clear-delta-triples).

NOTE: This is a first implementation, which provides the basic functionality. After each update, the complete set of delta triples is written, even if that update inserted or removed only few triples relative to the previous updates. If an error occurs during an update operation, the server state is still undefined and one effectively loses all delta triples. This is work for future PRs.

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick draft.
I think I have found a bug:)

ad_utility::serializeIds(
tempPath, localVocab_,
std::array{toRange(triplesDeleted_), toRange(triplesInserted_)});
std::filesystem::rename(tempPath, storagePath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do something if the renaming fails.

Comment on lines +72 to +73
AD_CONTRACT_CHECK(vocab.numSets() == 1);
const auto& words = vocab.primaryWordSet();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be (at least in a TODO) extended to more than one set.

@hannahbast hannahbast changed the title Allow updates to persist between restarts Persist delta triples between server restarts Mar 26, 2025
@hannahbast hannahbast changed the title Persist delta triples between server restarts Persist updates between server restarts Mar 26, 2025
@hannahbast
Copy link
Member

@RobinTF Thanks a lot for this. I had a look at this together with @joka921 and it worked in principle, great! I wrote a description above, please check. Here are some issues that should be addressed before this can be merged:

  1. When the local vocab is empty, a server restarts triggers an assertion failure, namely in util/Serializer/TripleSerializer.h:89; this should be fixed (maybe simply by removing the assertion?)
  2. The file written to disk should be called <name>.update-triples, where <name> is the basename of the index
  3. There should be an option --persist-updates with the properties described in the description above; in particular, the default behavior should be not to persist updated
  4. When the option is activated and the mentioned file is found, there should be a message in the server log, for example: Option --persist-updatesspecified and found file.update-triples, reading and processing it ...
  5. Tests are still missing

joka921 added 3 commits March 27, 2025 16:36
Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>
Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>
Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>
Copy link

codecov bot commented Mar 27, 2025

Codecov Report

Attention: Patch coverage is 82.08955% with 36 lines in your changes missing coverage. Please review.

Project coverage is 90.56%. Comparing base (2b91b1a) to head (0813cc7).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/index/DeltaTriples.cpp 79.45% 12 Missing and 3 partials ⚠️
src/util/Serializer/TripleSerializer.h 92.23% 6 Missing and 2 partials ⚠️
src/engine/Server.cpp 0.00% 5 Missing ⚠️
src/index/IndexImpl.cpp 63.63% 3 Missing and 1 partial ⚠️
src/util/Views.h 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1903      +/-   ##
==========================================
- Coverage   90.56%   90.56%   -0.01%     
==========================================
  Files         414      415       +1     
  Lines       40100    40284     +184     
  Branches     4536     4554      +18     
==========================================
+ Hits        36318    36484     +166     
- Misses       2427     2442      +15     
- Partials     1355     1358       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

hannahbast pushed a commit to ad-freiburg/qlever-control that referenced this pull request Mar 27, 2025
This is the accompanying change for the `persists-updates` option
introduced by ad-freiburg/qlever#1903
@hannahbast hannahbast marked this pull request as ready for review March 27, 2025 21:02
joka921 added 2 commits March 31, 2025 19:21
Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>
Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>
@sparql-conformance
Copy link

Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it, and it works fine now!

@hannahbast hannahbast merged commit 511657f into ad-freiburg:master Mar 31, 2025
20 of 22 checks passed
@RobinTF RobinTF deleted the persistent-updates branch March 31, 2025 21:05
hannahbast added a commit to ad-freiburg/qlever-control that referenced this pull request Mar 31, 2025
This is the accompanying change for the `persists-updates` option introduced by ad-freiburg/qlever#1903
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants