Skip to content
This repository was archived by the owner on Jan 22, 2025. It is now read-only.

Account path add run parent with old path cleanup #29942

Merged
merged 15 commits into from
Jan 30, 2023

Conversation

xiangzhu70
Copy link
Contributor

@xiangzhu70 xiangzhu70 commented Jan 27, 2023

Problem

PR #29794 was reverted because of OOM. This is the resubmission with the fix.

The new image uses the run/ sub directory for the appendvec files. When transitioning from an old image to this new image, the appendvec files in the old account_path should be cleaned up to avoid taking the space. In the ramfs case, it was taking the memory, causing OOM.

All commits have passed the review in PR 29794 except the last one f8f58f2

Summary of Changes

The fix is to clean up the files in the old path.

Fixes #

@xiangzhu70 xiangzhu70 force-pushed the account_path_add_run_parent branch from 4c9ffb5 to f8f58f2 Compare January 27, 2023 03:15
@xiangzhu70
Copy link
Contributor Author

xiangzhu70 commented Jan 27, 2023

Screen Shot 2023-01-26 at 9 44 17 PM

Now the memory usage looks the same as before the change.

Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future please also mark the PR as Ready For Review. Sometimes it can be ambiguous when you've been requested to review but the PR itself is still a draft.

@xiangzhu70
Copy link
Contributor Author

@behzadnouri Here is the new version which cleans up the appendvecs in the old paths. f8f58f2 is the fix.

@xiangzhu70 xiangzhu70 marked this pull request as ready for review January 27, 2023 18:05
Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work tracking down the issue, glad to see it was a relatively simple resolution.

// If the "run/" or "snapshot" sub directories do not exist, the directory may be from
// an older version for which the appendvec files are at this directory. Clean up
// them first.
// This will be done only once when transitioning from an old image without run directory
// to this new version using run and snapshot directories.
// The run/ content cleanup will be done at a later point. The snapshot/ content persists
// accross the process boot, and will be purged by the account_background_service.
move_and_async_delete_path(account_dir.as_ref());
fs::remove_dir_all(account_dir.as_ref())?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fs::remove_dir_all(account_dir.as_ref())?;
fs::remove_dir_all(account_dir)?;

do we need the .as_ref() for some reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to &account_dir.

.collect();
}).map(
|account_path| {
// For all account_paths, set up the run/ and snapshot/ sub directories.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to update this comment to make it clear this will clean the accounts dir if they don't already exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comment "// If the sub directories do not exist, the account_path will be cleaned because older version put account files there".

@behzadnouri
Copy link
Contributor

I see this code is also modifying ledger-tool;
Have you verified it does not introduce any regressions to ledger-tool either?

@xiangzhu70
Copy link
Contributor Author

Have you verified it does not introduce any regressions to ledger-tool either?

Yes, verified with
./multinode-demo/setup.sh
./multinode-demo/bootstrap-validator.sh
./cargo run --bin solana-ledger-tool -- --ledger config/bootstrap-validator verify

@xiangzhu70 xiangzhu70 merged commit 8565989 into solana-labs:master Jan 30, 2023
@behzadnouri
Copy link
Contributor

Have you verified it does not introduce any regressions to ledger-tool either?

Yes, verified with ./multinode-demo/setup.sh ./multinode-demo/bootstrap-validator.sh ./cargo run --bin solana-ledger-tool -- --ledger config/bootstrap-validator verify

what does this verify?! this does not create any ledger of comparable size to mainnet to verify or notice regressions.

@xiangzhu70
Copy link
Contributor Author

what does this verify?! this does not create any ledger of comparable size to mainnet to verify or notice regressions.

The ledger tool verify command reads the snapshot generated by the validator. I checked and see the accounts files are in run/ sub directory. What other tests should I run?

@behzadnouri
Copy link
Contributor

what does this verify?! this does not create any ledger of comparable size to mainnet to verify or notice regressions.

The ledger tool verify command reads the snapshot generated by the validator. I checked and see the accounts files are in run/ sub directory. What other tests should I run?

Just like running a node mainnet. Obtain a ledger snapshot from mainnet, and run leger-tool on that

@xiangzhu70
Copy link
Contributor Author

Just like running a node mainnet. Obtain a ledger snapshot from mainnet, and run leger-tool on that

I ran "./cargo run --bin solana-ledger-tool -- --ledger ~/ledger verify" which gets a snapshot from the validator running on mainnet. Is this good?

@@ -74,7 +75,8 @@ struct SnapshotTestConfig {
incremental_snapshot_archives_dir: TempDir,
full_snapshot_archives_dir: TempDir,
bank_snapshots_dir: TempDir,
accounts_dir: TempDir,
accounts_dir: PathBuf,
_accounts_tmp_dir: TempDir,
Copy link
Contributor

@ryoqun ryoqun Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here xref: #31319 (comment)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants