
🌱 Preserve finalizers during MS/Machine reconciliation #10694

Merged · 1 commit · Jun 11, 2024
17 changes: 2 additions & 15 deletions internal/controllers/machinedeployment/machinedeployment_sync.go
@@ -29,7 +29,6 @@ import (
"k8s.io/apimachinery/pkg/types"
kerrors "k8s.io/apimachinery/pkg/util/errors"
apirand "k8s.io/apimachinery/pkg/util/rand"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/klog/v2"
"k8s.io/utils/ptr"
@@ -230,11 +229,6 @@ func (r *Reconciler) computeDesiredMachineSet(ctx context.Context, deployment *c
name, randomSuffix = computeNewMachineSetName(deployment.Name + "-")
uniqueIdentifierLabelValue = fmt.Sprintf("%d-%s", templateHash, randomSuffix)

// Add foregroundDeletion finalizer to MachineSet if the MachineDeployment has it.
if sets.New[string](deployment.Finalizers...).Has(metav1.FinalizerDeleteDependents) {
finalizers = []string{metav1.FinalizerDeleteDependents}
}

replicas, err = mdutil.NewMSNewReplicas(deployment, oldMSs, 0)
if err != nil {
return nil, errors.Wrap(err, "failed to compute desired MachineSet")
@@ -257,15 +251,8 @@ func (r *Reconciler) computeDesiredMachineSet(ctx context.Context, deployment *c
name = existingMS.Name
@sbueringer (Member, Author) commented on May 28, 2024:

// Add foregroundDeletion finalizer to MachineSet if the MachineDeployment has it.
if sets.New[string](deployment.Finalizers...).Has(metav1.FinalizerDeleteDependents) {
    finalizers = []string{metav1.FinalizerDeleteDependents}
}

in l.234

I think this code is basically unreachable, because we never create a MachineSet after the deletionTimestamp is already set, and the finalizer and the deletionTimestamp seem to be set by the kube-apiserver through the same delete call.

Technically someone could set the foregroundDeletion finalizer earlier, but I think in any case the finalizer would be propagated down (see the e2e test coverage).
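As an editorial aside, here is a minimal sketch (not part of the PR) of what a foreground delete looks like with a controller-runtime client, matching the behavior described above; the client c, the context ctx, and the already-fetched MachineDeployment md are assumptions:

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// deleteMDForeground issues a foreground delete of a MachineDeployment.
// As part of this single delete call, the kube-apiserver is expected to set both
// the deletionTimestamp and the foregroundDeletion finalizer
// (metav1.FinalizerDeleteDependents) on the MachineDeployment.
func deleteMDForeground(ctx context.Context, c client.Client, md *clusterv1.MachineDeployment) error {
	foreground := metav1.DeletePropagationForeground
	return c.Delete(ctx, md, &client.DeleteOptions{PropagationPolicy: &foreground})
}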

@sbueringer (Member, Author):

Done

@enxebre (Member) commented on May 29, 2024:

Wouldn't this bit of code ensure that, when you are removing an MD with foreground deletion (via the client), foreground is also honoured between the owned MS deletion and its child Machines?

@sbueringer (Member, Author) commented on May 29, 2024:

I think it doesn't change that behavior. I did the following test with this PR:

  • Create Cluster & MD
  • Add additional "test" finalizer to MS & Machine
  • Delete MD with foreground
  • Check that:
    • MD gets the foregroundDeletion finalizer + deletionTimestamp
    • MS gets the foregroundDeletion finalizer + deletionTimestamp
    • Machine gets the deletionTimestamp

Then I:

  • Removed test finalizer from Machine => Machine went away
  • Removed test finalizer from MachineSet => MachineSet and then MachineDeployment went away

Please note that computeDesiredMachineSet is no longer executed after the deletionTimestamp has been set on the MachineDeployment (because we run reconcileDelete instead).

So I think this is:

  • unreachable code (except when someone manually sets the foregroundDeletion finalizer on the MD before deleting it)
  • and in any case something the kube-controller-manager already handles
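The manual verification described in the list above could be scripted along the following lines; this is only an editorial sketch using Gomega (dot-imported from github.com/onsi/gomega, as in the test framework files in this PR), and c (a controller-runtime client), ctx, and the msKey/machineKey object keys are assumptions:

Eventually(func(g Gomega) {
	// After the foreground delete of the MachineDeployment, the owned MachineSet
	// should have picked up the foregroundDeletion finalizer and a deletionTimestamp.
	ms := &clusterv1.MachineSet{}
	g.Expect(c.Get(ctx, msKey, ms)).To(Succeed())
	g.Expect(ms.Finalizers).To(ContainElement(metav1.FinalizerDeleteDependents))
	g.Expect(ms.DeletionTimestamp.IsZero()).To(BeFalse())

	// Per the manual test above, the Machine gets a deletionTimestamp.
	m := &clusterv1.Machine{}
	g.Expect(c.Get(ctx, machineKey, m)).To(Succeed())
	g.Expect(m.DeletionTimestamp.IsZero()).To(BeFalse())
}, "2m", "5s").Should(Succeed())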

uid = existingMS.UID

// Keep foregroundDeletion finalizer if the existingMS has it.
// Note: This case is a little different from the create case. In the update case we preserve
// the finalizer on the MachineSet if it already exists. Because of SSA we should not build
// the finalizer information from the MachineDeployment when updating a MachineSet because that could lead
// to dropping the finalizer from the MachineSet if it is dropped from the MachineDeployment.
// We should not drop the finalizer on the MachineSet if the finalizer is dropped from the MachineDeployment.
if sets.New[string](existingMS.Finalizers...).Has(metav1.FinalizerDeleteDependents) {
finalizers = []string{metav1.FinalizerDeleteDependents}
}
// Preserve all existing finalizers (including foregroundDeletion finalizer).
finalizers = existingMS.Finalizers

replicas = *existingMS.Spec.Replicas

@@ -642,6 +642,8 @@ func TestComputeDesiredMachineSet(t *testing.T) {
"ms-label-1": "ms-value-1",
}
existingMS.Annotations = nil
// Pre-existing finalizer should be preserved.
existingMS.Finalizers = []string{"pre-existing-finalizer"}
existingMS.Spec.Template.Labels = map[string]string{
clusterv1.MachineDeploymentUniqueLabel: uniqueID,
"ms-label-2": "ms-value-2",
@@ -657,6 +659,9 @@ func TestComputeDesiredMachineSet(t *testing.T) {
expectedMS.UID = existingMSUID
expectedMS.Name = deployment.Name + "-" + uniqueID
expectedMS.Labels[clusterv1.MachineDeploymentUniqueLabel] = uniqueID
// Pre-existing finalizer should be preserved.
expectedMS.Finalizers = []string{"pre-existing-finalizer"}

expectedMS.Spec.Template.Labels[clusterv1.MachineDeploymentUniqueLabel] = uniqueID

g := NewWithT(t)
@@ -676,6 +681,8 @@ func TestComputeDesiredMachineSet(t *testing.T) {
"ms-label-1": "ms-value-1",
}
existingMS.Annotations = nil
// Pre-existing finalizer should be preserved.
existingMS.Finalizers = []string{"pre-existing-finalizer"}
existingMS.Spec.Template.Labels = map[string]string{
clusterv1.MachineDeploymentUniqueLabel: uniqueID,
"ms-label-2": "ms-value-2",
Expand All @@ -698,6 +705,8 @@ func TestComputeDesiredMachineSet(t *testing.T) {
expectedMS.UID = existingMSUID
expectedMS.Name = deployment.Name + "-" + uniqueID
expectedMS.Labels[clusterv1.MachineDeploymentUniqueLabel] = uniqueID
// Pre-existing finalizer should be preserved.
expectedMS.Finalizers = []string{"pre-existing-finalizer"}
expectedMS.Spec.Template.Labels[clusterv1.MachineDeploymentUniqueLabel] = uniqueID

g := NewWithT(t)
@@ -765,6 +774,9 @@ func assertMachineSet(g *WithT, actualMS *clusterv1.MachineSet, expectedMS *clus
// Check Namespace
g.Expect(actualMS.Namespace).Should(Equal(expectedMS.Namespace))

// Check finalizers
g.Expect(actualMS.Finalizers).Should(Equal(expectedMS.Finalizers))

// Check Replicas
g.Expect(actualMS.Spec.Replicas).ShouldNot(BeNil())
g.Expect(actualMS.Spec.Replicas).Should(HaveValue(Equal(*expectedMS.Spec.Replicas)))
11 changes: 10 additions & 1 deletion internal/controllers/machineset/machineset_controller.go
@@ -29,6 +29,7 @@ import (
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/labels"
kerrors "k8s.io/apimachinery/pkg/util/errors"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/apiserver/pkg/storage/names"
"k8s.io/client-go/tools/record"
@@ -637,10 +638,18 @@ func (r *Reconciler) computeDesiredMachine(machineSet *clusterv1.MachineSet, exi
if existingMachine != nil {
desiredMachine.SetName(existingMachine.Name)
desiredMachine.SetUID(existingMachine.UID)

// Preserve all existing finalizers (including foregroundDeletion finalizer).
finalizers := existingMachine.Finalizers
// Ensure MachineFinalizer is set.
if !sets.New[string](finalizers...).Has(clusterv1.MachineFinalizer) {
finalizers = append(finalizers, clusterv1.MachineFinalizer)
}
desiredMachine.Finalizers = finalizers

desiredMachine.Spec.Bootstrap.ConfigRef = existingMachine.Spec.Bootstrap.ConfigRef
desiredMachine.Spec.InfrastructureRef = existingMachine.Spec.InfrastructureRef
}

// Set the in-place mutable fields.
// When we create a new Machine we will just create the Machine with those fields.
// When we update an existing Machine will we update the fields on the existing Machine (in-place mutate).
4 changes: 4 additions & 0 deletions internal/controllers/machineset/machineset_controller_test.go
@@ -1620,6 +1620,8 @@ func TestComputeDesiredMachine(t *testing.T) {
existingMachine.UID = "abc-123-existing-machine-1"
existingMachine.Labels = nil
existingMachine.Annotations = nil
// Pre-existing finalizer should be preserved.
existingMachine.Finalizers = []string{"pre-existing-finalizer"}
existingMachine.Spec.InfrastructureRef = corev1.ObjectReference{
Kind: "GenericInfrastructureMachine",
Name: "infra-machine-1",
Expand All @@ -1637,6 +1639,8 @@ func TestComputeDesiredMachine(t *testing.T) {
expectedUpdatedMachine := skeletonMachine.DeepCopy()
expectedUpdatedMachine.Name = existingMachine.Name
expectedUpdatedMachine.UID = existingMachine.UID
// Pre-existing finalizer should be preserved.
expectedUpdatedMachine.Finalizers = []string{"pre-existing-finalizer", clusterv1.MachineFinalizer}
expectedUpdatedMachine.Spec.InfrastructureRef = *existingMachine.Spec.InfrastructureRef.DeepCopy()
expectedUpdatedMachine.Spec.Bootstrap.ConfigRef = existingMachine.Spec.Bootstrap.ConfigRef.DeepCopy()

12 changes: 12 additions & 0 deletions test/e2e/md_scale.go
@@ -25,6 +25,7 @@ import (
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/utils/ptr"

"sigs.k8s.io/cluster-api/test/framework"
@@ -123,6 +124,17 @@ func MachineDeploymentScaleSpec(ctx context.Context, inputGetter func() MachineD
Replicas: 1,
WaitForMachineDeployments: input.E2EConfig.GetIntervals(specName, "wait-worker-nodes"),
})

By("Deleting the MachineDeployment with foreground deletion")
@sbueringer (Member, Author) commented on May 28, 2024:

Just added this test to verify that foreground deletion works at the moment.

Long-term we would like to have the same behavior as with the Cluster (i.e. the MD is deleted "in the foreground" independent of whether foreground or background deletion is used).

A reviewer (Member) commented:

Do we have an issue to discuss "Implement forced foreground deletion"?

@sbueringer (Member, Author):

I would create one.

@sbueringer (Member, Author):

xref: #10710

foreground := metav1.DeletePropagationForeground
framework.DeleteAndWaitMachineDeployment(ctx, framework.DeleteAndWaitMachineDeploymentInput{
ClusterProxy: input.BootstrapClusterProxy,
Cluster: clusterResources.Cluster,
MachineDeployment: clusterResources.MachineDeployments[0],
WaitForMachineDeployments: input.E2EConfig.GetIntervals(specName, "wait-worker-nodes"),
DeletePropagationPolicy: &foreground,
})

By("PASSED!")
})

41 changes: 41 additions & 0 deletions test/framework/machinedeployment_helpers.go
@@ -25,6 +25,7 @@ import (
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"github.com/pkg/errors"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/util/sets"
@@ -448,6 +449,46 @@ func machineDeploymentOptions(deployment clusterv1.MachineDeployment) []client.L
}
}

// DeleteAndWaitMachineDeploymentInput is the input for DeleteAndWaitMachineDeployment.
type DeleteAndWaitMachineDeploymentInput struct {
ClusterProxy ClusterProxy
Cluster *clusterv1.Cluster
MachineDeployment *clusterv1.MachineDeployment
WaitForMachineDeployments []interface{}
DeletePropagationPolicy *metav1.DeletionPropagation
}

// DeleteAndWaitMachineDeployment deletes MachineDeployment and waits until it is gone.
func DeleteAndWaitMachineDeployment(ctx context.Context, input DeleteAndWaitMachineDeploymentInput) {
Expect(ctx).NotTo(BeNil(), "ctx is required for DeleteAndWaitMachineDeployment")
Expect(input.ClusterProxy).ToNot(BeNil(), "Invalid argument. input.ClusterProxy can't be nil when calling DeleteAndWaitMachineDeployment")
Expect(input.Cluster).ToNot(BeNil(), "Invalid argument. input.Cluster can't be nil when calling DeleteAndWaitMachineDeployment")
Expect(input.MachineDeployment).ToNot(BeNil(), "Invalid argument. input.MachineDeployment can't be nil when calling DeleteAndWaitMachineDeployment")

log.Logf("Deleting MachineDeployment %s", klog.KObj(input.MachineDeployment))
Expect(input.ClusterProxy.GetClient().Delete(ctx, input.MachineDeployment, &client.DeleteOptions{PropagationPolicy: input.DeletePropagationPolicy})).To(Succeed())

log.Logf("Waiting for MD to be deleted")
Eventually(func(g Gomega) {
A reviewer (Member) commented:

q: should we also check that all the machines are gone? (so we check one layer more of the hierarchy)

@sbueringer (Member, Author):

Done

selectorMap, err := metav1.LabelSelectorAsMap(&input.MachineDeployment.Spec.Selector)
g.Expect(err).ToNot(HaveOccurred())

machines := &clusterv1.MachineList{}
err = input.ClusterProxy.GetClient().List(ctx, machines, client.InNamespace(input.Cluster.Namespace), client.MatchingLabels(selectorMap))
g.Expect(err).ToNot(HaveOccurred())
g.Expect(machines.Items).To(BeEmpty())

ms := &clusterv1.MachineSetList{}
err = input.ClusterProxy.GetClient().List(ctx, ms, client.InNamespace(input.Cluster.Namespace), client.MatchingLabels(selectorMap))
g.Expect(err).ToNot(HaveOccurred())
g.Expect(ms.Items).To(BeEmpty())

md := &clusterv1.MachineDeployment{}
err = input.ClusterProxy.GetClient().Get(ctx, client.ObjectKeyFromObject(input.MachineDeployment), md)
g.Expect(apierrors.IsNotFound(err)).To(BeTrue())
}, input.WaitForMachineDeployments...).Should(Succeed(), "Timed out waiting for Machine Deployment deletion")
}

// ScaleAndWaitMachineDeploymentInput is the input for ScaleAndWaitMachineDeployment.
type ScaleAndWaitMachineDeploymentInput struct {
ClusterProxy ClusterProxy
Expand Down