Kubernetes upgrade of worker_groups and worker_groups_launch_template #986

Closed
1 of 4 tasks
hpio opened this issue Aug 24, 2020 · 4 comments

Comments

hpio commented Aug 24, 2020

I have issues

I'm submitting a...

  • bug report
  • feature request
  • support request - read the FAQ first!
  • kudos, thank you, warm fuzzy

What is the current behavior?

I have only recently started using this module with worker_groups and worker_groups_launch_template, but I cannot find (apologies if I missed it somehow) a procedure for Kubernetes upgrades. Setting cluster_version upgrades the Kubernetes control plane, but all nodes in my cluster remain on the previous version.

My cluster is created with the block below:

module "eks" {
  source          = "../../modules/terraform-aws-eks"
  cluster_name    = local.cluster_name
  subnets         = module.example.private_subnets
  vpc_id          = module.example.vpc_id
  cluster_version = "1.17"

  cluster_endpoint_private_access      = true
  cluster_endpoint_public_access_cidrs = ["REDACTED"]

  enable_irsa = true

  map_roles = local.map_roles

  worker_groups = [
    {
      name                = "on-demand-1"
      instance_type       = "m5.large"
      asg_max_size        = 10
      kubelet_extra_args  = "--node-labels=spot=false"
      suspended_processes = ["AZRebalance"]
      tags = [
        {
          "key"                 = "k8s.io/cluster-autoscaler/enabled"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
        {
          "key"                 = "k8s.io/cluster-autoscaler/${local.cluster_name}"
          "propagate_at_launch" = "false"
          "value"               = "true"
        }
      ]
    }
  ]
  worker_groups_launch_template = [
    {
      name                    = "spot-1"
      override_instance_types = ["m5.large", "m5a.large", "m5d.large", "m5ad.large"]
      asg_desired_capacity    = 2
      asg_max_size            = 10
      kubelet_extra_args      = "--node-labels=node.kubernetes.io/lifecycle=spot"
      tags = [
        {
          "key"                 = "k8s.io/cluster-autoscaler/enabled"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
        {
          "key"                 = "k8s.io/cluster-autoscaler/${local.cluster_name}"
          "propagate_at_launch" = "false"
          "value"               = "true"
        }
      ]
    },
  ]
}

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

All nodes in the cluster are on the same version as the master

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version: v12.2.0
  • OS:
  • Terraform version:
❯ terraform version
Terraform v0.12.29
+ provider.aws v2.70.0

Any other relevant info

@wolstena

I believe this will work:

worker_groups = [{
  version = "1.16"
}]
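
Applied to the configuration above, that might look roughly like this (a sketch only; it assumes the module reads a per-group version key to pin the EKS-optimized AMI, as suggested above, and that the workers should match the 1.17 control plane):

worker_groups = [
  {
    name                = "on-demand-1"
    instance_type       = "m5.large"
    asg_max_size        = 10
    kubelet_extra_args  = "--node-labels=spot=false"
    suspended_processes = ["AZRebalance"]
    version             = "1.17" # assumed per-group key; pins the worker AMI version
    # ...tags as before...
  }
]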

@dpiddockcmp (Contributor)

The module, by default, will not recreate the worker nodes when modifying any of their attributes, like cluster version. Kubernetes is a complicated beast and does not take well to all the nodes being deleted at once. This will likely cause interruptions to your workloads.

Suggested processes for safely recreating the nodes can be found in the FAQ

#937 contains an example of how to automate the draining of nodes.
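
As a rough sketch of that kind of rolling replacement (not the FAQ's exact procedure; the node name and instance ID below are placeholders):

# After bumping the worker group version, apply so newly launched instances use the new AMI
terraform apply

# Cordon and drain one old node at a time so workloads reschedule gracefully
kubectl cordon ip-10-0-1-23.eu-west-1.compute.internal
kubectl drain ip-10-0-1-23.eu-west-1.compute.internal --ignore-daemonsets --delete-local-data

# Terminate the drained instance; the ASG replaces it with a node on the new version
aws autoscaling terminate-instance-in-auto-scaling-group \
  --instance-id i-0123456789abcdef0 \
  --no-should-decrement-desired-capacity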


hpio commented Sep 3, 2020

Thanks @dpiddockcmp, exactly what I was after! Coming from GKE, I thought the process would be handled automatically by AWS, hence my confusion.

@hpio hpio closed this as completed Sep 3, 2020
@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 25, 2022