Prevent cluster restart after offline upgrade #178
Merged
This fixes a bug that caused a second cluster restart after a successful offline upgrade. The cached pod facts weren't being invalidated, so after the upgrade we would proceed to the restart reconciler and, using the stale facts, conclude that a cluster restart was needed. Adding an invalidate to the restart reconciler fixes the problem.
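To illustrate the stale-cache problem, here is a minimal sketch in Go. It is not the code from this PR: `PodFact`, `PodFacts`, `Invalidate`, `Collect`, and `restartNeeded` are hypothetical names, and the collector is a stand-in for whatever actually queries the pods. It only shows why a reconciler that reads cached facts must invalidate them first.

```go
package main

import "fmt"

// PodFact is a hypothetical snapshot of a single pod's state.
type PodFact struct {
	Name string
	Up   bool
}

// PodFacts caches pod state so that several reconcilers can share it.
type PodFacts struct {
	collect   func() map[string]PodFact // hypothetical collector that queries the pods
	detail    map[string]PodFact
	needFetch bool
}

// NewPodFacts returns a cache that will fetch on the first Collect call.
func NewPodFacts(collect func() map[string]PodFact) *PodFacts {
	return &PodFacts{collect: collect, needFetch: true}
}

// Invalidate marks the cached facts as stale.
func (p *PodFacts) Invalidate() { p.needFetch = true }

// Collect refreshes the facts only when they have been invalidated.
func (p *PodFacts) Collect() {
	if !p.needFetch {
		return
	}
	p.detail = p.collect()
	p.needFetch = false
}

// DownCount returns how many pods the cached facts report as down.
func (p *PodFacts) DownCount() int {
	n := 0
	for _, f := range p.detail {
		if !f.Up {
			n++
		}
	}
	return n
}

// restartNeeded mimics a restart reconciler's decision. With
// invalidateFirst set, it drops the cache before reading it.
func restartNeeded(p *PodFacts, invalidateFirst bool) bool {
	if invalidateFirst {
		p.Invalidate()
	}
	p.Collect()
	return p.DownCount() > 0
}

func main() {
	clusterUp := false
	facts := NewPodFacts(func() map[string]PodFact {
		return map[string]PodFact{"pod-0": {Name: "pod-0", Up: clusterUp}}
	})

	facts.Collect()  // facts gathered while the cluster is down for the upgrade
	clusterUp = true // the upgrade brings the cluster back up

	fmt.Println(restartNeeded(facts, false)) // true: stale facts still say the pod is down
	fmt.Println(restartNeeded(facts, true))  // false: fresh facts show the pod is up
}
```

Without the invalidate, the decision is made on facts collected before the upgrade restarted the cluster, which is what triggers the unnecessary second restart.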
This also fixes a minor issue with online upgrade. We could try to restart the primaries with 'AT -t restart_node', which fails because the cluster is read-only. Added a wait mechanism so that we only proceed with the primary restart once the server has moved the secondaries into read-only mode. Since this issue was minor, I didn't add a changie entry for it.
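A wait of this kind could look roughly like the sketch below. It is only an illustration under assumptions: `SecondaryState`, `waitForReadOnly`, and the `fetch` callback are hypothetical names, and the real change may requeue the reconcile iteration instead of blocking in a polling loop.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// SecondaryState is a hypothetical view of one secondary pod.
type SecondaryState struct {
	Name     string
	ReadOnly bool
}

var errTimeout = errors.New("timed out waiting for secondaries to become read-only")

// allReadOnly reports whether every secondary is in read-only mode.
// An empty slice counts as read-only because there is nothing to wait for.
func allReadOnly(secondaries []SecondaryState) bool {
	for _, s := range secondaries {
		if !s.ReadOnly {
			return false
		}
	}
	return true
}

// waitForReadOnly polls fetch until every secondary reports read-only
// mode, or returns errTimeout once the timeout has elapsed.
func waitForReadOnly(fetch func() []SecondaryState, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if allReadOnly(fetch()) {
			return nil
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	polls := 0
	// Simulated server: the secondary becomes read-only on the third poll.
	fetch := func() []SecondaryState {
		polls++
		return []SecondaryState{{Name: "sec-0", ReadOnly: polls >= 3}}
	}

	if err := waitForReadOnly(fetch, 10*time.Millisecond, time.Second); err != nil {
		fmt.Println("not ready, try again later:", err)
		return
	}
	fmt.Println("secondaries are read-only; proceeding with primary restart")
}
```

The primary restart is gated on the observed state of the secondaries rather than on a fixed delay, so the restart path is chosen only after the server has finished the transition.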