Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: influxd upgrade command #19591

Closed
wants to merge 2 commits into from

Conversation

vlastahajek
Copy link
Contributor

@vlastahajek vlastahajek commented Sep 18, 2020

Closes #19308

InfluxDB server command to upgrade 1.x config, database and users to 2.0.

Features:

  • Upgrades config file
  • Upgrades database files
  • Creates script for re-creating users and their permissions

TODOs:

  • E2E test

Minimal command line includes parameters for creating admin: influxd upgrade -b my-bucket -o my-org -u my-user -p my-password

Complete help:

    Upgrades a 1.x version of InfluxDB by performing following actions:
      1. Reads 1.x config file and creates 2.x config file with matching options. Unsupported 1.x options are reported.
      2. Copies and upgrades 1.x database files.
      3. Creates a script for creating 1.x users and their permissions. This scripts needs to be revised and run manually after starting 2.x.

    If config file is not available, 1.x db folder (--v1-dir options) is taken as an input.
    Target 2.x database dir is specified by the --engine-path option. If changed, bolt path should to be changed as well.

Usage:
  influxd upgrade [flags]
  influxd upgrade [command]

Available Commands:
  v1-dump-meta Dump InfluxDB 1.x meta.db
  v2-dump-meta Dump InfluxDB 2.x influxd.bolt

Flags:
  -m, --bolt-path string         path for boltdb database (default "/home/ubuntu/.influxdbv2/influxd.bolt")
  -b, --bucket string            primary bucket name
      --config-file string       optional: Custom InfluxDB 1.x config file path, else the default config file
  -e, --engine-path string       path for persistent engine files (default "/home/ubuntu/.influxdbv2/engine")
  -h, --help                     help for upgrade
      --log-path string          optional: custom log file path (default "/home/ubuntu/upgrade.log")
  -o, --org string               primary organization name
  -p, --password string          password for username
  -r, --retention string         optional: duration bucket will retain data. 0 is infinite. Default is 0.
      --security-script string   optional: generated security upgrade script path (default "/home/ubuntu/influxd-upgrade-security.sh")
  -t, --token string             optional: token for username, else auto-generated
  -u, --username string          primary username
      --v1-dir string            path to source 1.x db directory containing meta,data and wal sub-folders (default "/home/ubuntu/.influxdb")
  -v, --verbose                  verbose output (default true)

@vlastahajek vlastahajek marked this pull request as draft September 18, 2020 19:07
@vlastahajek vlastahajek changed the title feat: influxd upgrade command Feat: influxd upgrade command Sep 18, 2020
@alespour alespour changed the title Feat: influxd upgrade command feat: influxd upgrade command Sep 21, 2020
@e-dard
Copy link
Contributor

e-dard commented Sep 22, 2020

A couple of early comments @vlastahajek:

  • please run go mod tidy to fix the lint CI failure;
  • please ensure all commits are following the semantic commit conventions.

@vlastahajek
Copy link
Contributor Author

@e-dard , sure, we will clean commits before marking this as ready. I will leave go mod tidy to @alespour

Copy link
Contributor

@e-dard e-dard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @vlastahajek.

Whilst I'm not averse to having some integration style tests in the repo against a small test set of real-world files (index or series files etc) I think that the contents that's been added here via the data and wal directories is too significant.

I can see two ways to improve this:

  1. you could programatically create the files during tests;
  2. you could just zip everything up into a single archive that is uncompressed into a temporary location (and then also cleaned up) during tests.

Alternatively I seem to recall there possibly being an alternative repo setup for test fixtures but I'm not sure if that impacts the influxdb repo or not.

@@ -0,0 +1,4 @@
# Ignore everything in this directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove all of these .gitignore files. Perhaps we can help you achieve what you were intending to do with these file a different way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These files were added to allow committing the wal folder with empty dirs. As stated below, this will be cleaned.

@vlastahajek
Copy link
Contributor Author

@e-dard, I prefer to have a pre-created dataset to rest a real upgrade scenario. I will zip the test data in the repo..

Copy link
Contributor

@rogpeppe rogpeppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a bunch of comments, but I'm afraid this PR is rather to big to review in its entirety. Perhaps it could be split up into a bunch of independent parts working from the bottom up?

In general, the approach seems good though, thanks!

// read each database / retention policy from v1.meta and create bucket db-name/rp-name
// create database in v2.meta
// copy shard info from v1.meta
if len(v1.meta.Databases()) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about inverting the condition and returning early when there are no databases to save a bunch of indented code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

}
size += size2
v2dir := filepath.Dir(options.target.boltPath)
diskInfo, err := fs2.DiskUsage(v2dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than import this very OS-dependent code in order to do a fairly inaccurate check (sum of file sizes doesn't equate to used disk space), it might be best just to let the user decide for themselves and provide a "dry run" mode that just says how much new disk space will be needed or something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose user databases have size in hundreads of MB and probably more.. Do you think that it will be such a big difference that these will give false result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actual error can be just because filesystem entries size and block size rounding, but comparing to the datafile sizes this shouldn't be much difference.
Even available disk space can be different during copying data, due to the other processes work.
This just a sanity check. If disk available size is very close to the requires size (condition when the check inaccuracy matters) is even bad for a system.

On the other hand, dry run mode is a new feature and can be added later..

if err != nil {
return nil, err
}
if canOnboard {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invert the condition and return early?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

orgName string
bucket string
token string
retention string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
retention string
retention time.Duration

Then you won't have to go to the trouble of parsing it yourself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the one bellow is done based on the setup command. So I suppose there is a good reason to have this way..

return dir
}

func rawDurationToTimeDuration(raw string) (time.Duration, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why this is needed when we already have time.ParseDuration and the flag package already directly supports time.Duration-typed flags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Actually, this is copied from the setup command, but it would be best to reuse this. Could you, please, suggest a good place to put it?

@vlastahajek vlastahajek marked this pull request as ready for review September 23, 2020 10:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

In-place OSS 1.x to 2.x upgrade
5 participants