-
Notifications
You must be signed in to change notification settings - Fork 9.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release-3.5] Fix the learner promotion changes not being persisted into v3store (bbolt) #19563
Conversation
af23911
to
6ee8abe
Compare
ba0b6ac
to
87f5cb2
Compare
746bb27
to
2ca9b2f
Compare
cc @fuweid @ivanvc @serathius @jmhbnz @siyuanfoundation PTAL This PR is the first item of the proposed actions mentioned in #19557 (comment) Once this gets merged, I will work on the second item "auto sync v3store based on v2store on bootstrap" in release-3.5 instead of release-3.6. |
2ca9b2f
to
ab38710
Compare
/retest |
ab38710
to
49f26c3
Compare
This PR seems to fix the bug in learner promotion, but doesn't patches store for clusters that already encountered the bug. Meaning that they will still need to run the EDIT: Fixed "leader promotion" --> "learner promotion". |
"leader promotion" --> "learner promotion". Yes, exactly.
It will be covered in a followup PR, also mentioned this multiple times. Please see the second item as mentioned in #19557 (comment)
Already clarified this several times, please see the proposed action and also #19557 (comment) |
@@ -220,8 +220,23 @@ func unsafeSaveMemberToStore(lg *zap.Logger, s v2store.Store, m *Member) error { | |||
lg.Panic("failed to marshal raftAttributes", zap.Error(err)) | |||
} | |||
p := path.Join(MemberStoreKey(m.ID), raftAttributesSuffix) | |||
_, err = s.Create(p, false, string(b), false, v2store.TTLOptionSet{ExpireTime: v2store.Permanent}) | |||
return err | |||
if _, err = s.Create(p, false, string(b), false, v2store.TTLOptionSet{ExpireTime: v2store.Permanent}); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a separate issue. Thanks for adding validation! However I'm worried about trying to fix two issues in one PR thus not putting enough attention into understanding each of them and introducing proper testing.
Can we separate PR for this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a separate issue.
Yes, it's a separate (but related) minor issue. When adding a new member,
- v3store persists the whole member,
- but v2store only persists the raftAttributes (peerURL, isLearner); so it causes a temprary inconsistency between v2 and v3store, and it depends on the following publish logic to update the attributes (name, clientURL), so it converge soon.
If we don't fix it in this PR, then the verification will fail, otherwise I will have to update the verification to ignore the attribute and only verify the raftAttributes(including peerURLs, isLearner).
Since from users perspective, this isn't an issue. We can also leave it as it's and only verify raftAttribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't fix it in this PR, then the verification will fail, otherwise I will have to update the verification to ignore the attribute and only verify the raftAttributes(including peerURLs, isLearner).
That's why I'm suggesting to send PR to just fix this so we can merge it fast. This could also be backported to v3.4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this isn't an issue from user perspective, so suggest to keep it as it's. This issue only affects version 3.5 (and possibly 3.4, though unverified). Since it is a minor issue and not a concern from the user's perspective, let's just let it fade into history.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this isn't an issue from user perspective, so suggest to keep it as it's
I mean that I will revert the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted the change.
49f26c3
to
1fadfb1
Compare
…v2store and v3store Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
When promoting a learner, the member is definitely already exist. Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
1fadfb1
to
e3cd57a
Compare
1609443
to
31b7495
Compare
31b7495
to
cd24fe8
Compare
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
cd24fe8
to
a1cb13b
Compare
@@ -1049,10 +1049,9 @@ func TestAddMemberSyncsBackendAndStoreV2(t *testing.T) { | |||
backendMembers: []*Member{alice}, | |||
}, | |||
{ | |||
name: "Adding member should fail if it exists in both", | |||
name: "Adding member should success if it exists in both", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case was meant to prevent same member being added twice, while we are fixing the issue of v3store rejecting learner being promoted. Should the test be updated to separate those two cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For v3store (bbolt), when we add a member or promote a learner, etcd just marshals the member and write the data into bbolt, if it already exists, just overwrites it. This is the behaviour in both release-3.4, release-3.6 and main. But In release-3.5, it returned an error if it already exists, it's a bug and now we fix it in this PR.
Also FYI.
- When we add a member again, the
ValidateConfigurationChange
will fail. - We also have separate unit test case to verify the v2store will return error when adding a member again. (see line 1057 in cluster_test.go).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the test be updated to separate those two cases?
No need. Please see my response above.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahrtr, fuweid, serathius, siyuanfoundation The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Link to #19557
Also added a generic verification on ensure the membership data in v3store is always in sync with v2store.
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.