Medusa Restore fails to startup the Cassandra pods #1495

Open
vigneshkumarak opened this issue Feb 25, 2025 · 0 comments
Labels
bug Something isn't working

vigneshkumarak commented Feb 25, 2025

What happened?

The MedusaRestoreJob fails to bring the Cassandra pods back up.

Did you expect to see something different?

The restore job should bring the Cassandra pods back up with the restored data.

How to reproduce it (as minimally and precisely as possible):

Apply the MedusaRestoreJob YAML below. It deletes the existing pods and then fails to bring them back because of a reconcile error.

apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaRestoreJob
metadata:
  name: medusa-restore-11
  namespace: cass-test
spec:
  backup: medusa-backup-schedule-1740470400
  cassandraDatacenter: dc1

Events after applying the manifest:

5m13s       Normal    StoppingDatacenter       cassandradatacenter/dc1            Stopping datacenter
4m53s       Normal    Killing                  pod/main-dc1-default-sts-2         Stopping container medusa
4m53s       Normal    Killing                  pod/main-dc1-default-sts-1         Stopping container cassandra
4m53s       Normal    Killing                  pod/main-dc1-default-sts-0         Stopping container medusa
4m53s       Normal    Killing                  pod/main-dc1-default-sts-2         Stopping container cassandra
4m53s       Normal    Killing                  pod/main-dc1-default-sts-0         Stopping container server-system-logger
4m53s       Normal    Killing                  pod/main-dc1-default-sts-0         Stopping container cassandra
4m53s       Normal    Killing                  pod/main-dc1-default-sts-1         Stopping container server-system-logger
4m53s       Warning   ReconcileFailed          cassandradatacenter/dc1            Operation cannot be fulfilled on statefulsets.apps "main-dc1-default-sts": the object has been modified; please apply your changes to the latest version and try again
4m53s       Normal    SuccessfulDelete         statefulset/main-dc1-default-sts   delete Pod main-dc1-default-sts-2 in StatefulSet main-dc1-default-sts successful
4m53s       Normal    SuccessfulDelete         statefulset/main-dc1-default-sts   delete Pod main-dc1-default-sts-1 in StatefulSet main-dc1-default-sts successful
4m53s       Normal    SuccessfulDelete         statefulset/main-dc1-default-sts   delete Pod main-dc1-default-sts-0 in StatefulSet main-dc1-default-sts successful
2m52s       Normal    NoPods                   poddisruptionbudget/dc1-pdb        No matching pods found
2m52s       Normal    UpdatingRack             cassandradatacenter/dc1            Updating rack default
2m32s       Warning   Reconcile Error          k8ssandracluster/main              Operation cannot be fulfilled on k8ssandraclusters.k8ssandra.io "main": the object has been modified; please apply your changes to the latest version and try again

Environment

  • K8ssandra Operator version:

    cr.k8ssandra.io/k8ssandra/k8ssandra-operator:v1.21.1

  • Kubernetes version information:

    Client Version: v1.31.1
    Kustomize Version: v5.4.2
    Server Version: v1.30.9-eks-8cce635

  • Kubernetes cluster kind:

    EKS Cluster

  • Manifests:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: main
  namespace: cass-test
spec:
  medusa:
    cassandraUserSecretRef:
      name: main-superuser
    storageProperties:
      credentialsType: role-based
      storageProvider: s3
      bucketName: casaandra-backups
      prefix: demo
      region: us-east-1
      secure: false
  cassandra:
    serverVersion: 5.0.1
    datacenters:
      - metadata:
          name: dc1
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: gp2
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
        config:
          jvmOptions:
            heapSize: 512M

  • K8ssandra Operator Logs:

2025-02-25T12:49:22.546Z	DEBUG	events	Operation cannot be fulfilled on k8ssandraclusters.k8ssandra.io "main": the object has been modified; please apply your changes to the latest version and try again	{"type": "Warning", "object": {"kind":"K8ssandraCluster","namespace":"cass-test","name":"main","uid":"a60f9c9c-f88c-4c95-b3f7-a92fd7e53b97","apiVersion":"k8ssandra.io/v1alpha1","resourceVersion":"144351913"}, "reason": "Reconcile Error"}
2025-02-25T12:49:22.561Z	INFO	updated k8ssandracluster status	{"controller": "k8ssandracluster", "controllerGroup": "k8ssandra.io", "controllerKind": "K8ssandraCluster", "K8ssandraCluster": {"name":"main","namespace":"cass-test"}, "namespace": "cass-test", "name": "main", "reconcileID": "40373aeb-0ddf-4ef4-9d2f-6bdc9d2ab43a", "K8ssandraCluster": {"name":"main","namespace":"cass-test"}}
2025-02-25T12:49:22.561Z	ERROR	Reconciler error	{"controller": "k8ssandracluster", "controllerGroup": "k8ssandra.io", "controllerKind": "K8ssandraCluster", "K8ssandraCluster": {"name":"main","namespace":"cass-test"}, "namespace": "cass-test", "name": "main", "reconcileID": "40373aeb-0ddf-4ef4-9d2f-6bdc9d2ab43a", "error": "Operation cannot be fulfilled on k8ssandraclusters.k8ssandra.io \"main\": the object has been modified; please apply your changes to the latest version and try again"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:227

Anything else we need to know?:

Logs from cass-operator

2025-02-25T12:47:01.246Z	INFO	Updating statefulset pod specs	{"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"dc1","namespace":"cass-test"}, "namespace": "cass-test", "name": "dc1", "reconcileID": "ce14c2df-9c49-4c54-b0f4-4a5930cc07ea", "namespace": "cass-test", "datacenterName": "dc1", "clusterName": "main", "statefulSet": {"namespace": "cass-test", "name": "main-dc1-default-sts"}}
2025-02-25T12:47:01.275Z	ERROR	controllers.CassandraDatacenter	calculateReconciliationActions returned an error	{"cassandradatacenter": {"name":"dc1","namespace":"cass-test"}, "requestNamespace": "cass-test", "requestName": "dc1", "loopID": "cd869389-6aa4-4197-b3a6-1ea83f332fc2", "error": "Operation cannot be fulfilled on statefulsets.apps \"main-dc1-default-sts\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/k8ssandra/cass-operator/internal/controllers/cassandra.(*CassandraDatacenterReconciler).Reconcile
	/workspace/internal/controllers/cassandra/cassandradatacenter_controller.go:149
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.4/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.4/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.4/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.4/pkg/internal/controller/controller.go:227
2025-02-25T12:47:01.275Z	INFO	Operation cannot be fulfilled on statefulsets.apps "main-dc1-default-sts": the object has been modified; please apply your changes to the latest version and try again	{"controller": "cassandradatacenter_controller", "controllerGroup": "cassandra.datastax.com", "controllerKind": "CassandraDatacenter", "CassandraDatacenter": {"name":"dc1","namespace":"cass-test"}, "namespace": "cass-test", "name": "dc1", "reconcileID": "ce14c2df-9c49-4c54-b0f4-4a5930cc07ea", "reason": "ReconcileFailed", "eventType": "Warning"}

When we check the status, the two resources disagree:
K8ssandraCluster Stopped : false
cassandradatacenter Stopped : true

Manually patching the cassandradatacenter to set spec.stopped to false brings back the pods with the restored data:

kubectl patch cassandradatacenter dc1 -n cass-test --type merge --patch '{"spec":{"stopped":false}}'
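
The same workaround can be wrapped in a guarded helper so the patch is applied only when the datacenter is actually left stopped. This is a minimal sketch, not part of k8ssandra-operator or cass-operator: the function name `unstop_datacenter` is hypothetical, and it assumes `kubectl` is on the PATH and that the stuck state shows up as `spec.stopped: true` on the CassandraDatacenter, as observed above.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): set spec.stopped back to false
# on a CassandraDatacenter, but only if it is currently true.
unstop_datacenter() {
  dc="$1"   # CassandraDatacenter name, e.g. dc1
  ns="$2"   # namespace, e.g. cass-test

  # Read the current value of spec.stopped; empty output means the field is unset.
  stopped="$(kubectl get cassandradatacenter "$dc" -n "$ns" \
    -o jsonpath='{.spec.stopped}')" || return 1

  if [ "$stopped" = "true" ]; then
    echo "spec.stopped is true on $dc; patching to false"
    # Same merge patch as the manual workaround in this issue.
    kubectl patch cassandradatacenter "$dc" -n "$ns" \
      --type merge --patch '{"spec":{"stopped":false}}'
  else
    echo "spec.stopped is not true on $dc; nothing to do"
  fi
}
```

With the names from this issue, the call would be `unstop_datacenter dc1 cass-test`. It only works around the symptom; the reconcile conflict that leaves the datacenter stopped is still the bug reported here.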

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: K8OP-316

@vigneshkumarak vigneshkumarak added the bug Something isn't working label Feb 25, 2025