
Cannot set --max-pods in the eks configuration #2551

Closed · insider89 opened this issue Apr 5, 2023 · 12 comments

@insider89 (Contributor)

Description

Cannot override max-pods with the latest 19.12 module. I have a cluster provisioned with m2.large instances, which sets 17 pods per node by default. I've set ENABLE_PREFIX_DELEGATION = "true" and WARM_PREFIX_TARGET = "1" for the vpc-cni addon, but it doesn't help; I still get 17 pods per node. In the launch template I see the following:

/etc/eks/bootstrap.sh dev --kubelet-extra-args '--node-labels=node_group=infra,eks.amazonaws.com/nodegroup-image=ami-04dc8cdc2e948f054,eks.amazonaws.com/capacityType=ON_DEMAND,eks.amazonaws.com/nodegroup=infra-20230316203627944100000001 --register-with-taints=infra=true:NoSchedule --max-pods=17' --b64-cluster-ca $B64_CLUSTER_CA --apiserver-endpoint $API_SERVER_URL --dns-cluster-ip $K8S_CLUSTER_DNS_IP --use-max-pods false

I tried adding the following to my managed node group configuration, but the module just ignores it:

      enable_bootstrap_user_data = true
      bootstrap_extra_args       = "--kubelet-extra-args '--max-pods=50'"

      pre_bootstrap_user_data = <<-EOT
        export USE_MAX_PODS=false
      EOT

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (only if state is stored remotely, which is hopefully the best practice you are following): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists
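
Run from the project root, those checks amount to:

    # Only remove .terraform/ if your state is stored remotely.
    rm -rf .terraform/
    terraform init
    terraform plan   # or terraform apply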

Versions

  • Module version [Required]: 19.12

  • Terraform version: Terraform v1.4.2

  • Provider version(s):

Terraform v1.4.2
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v4.61.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.3.2
+ provider registry.terraform.io/hashicorp/helm v2.9.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.19.0
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4

Reproduction Code [Required]

# https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2009
data "aws_eks_cluster" "default" {
  name = local.name
  depends_on = [
    module.eks.eks_managed_node_groups,
  ]
}

data "aws_eks_cluster_auth" "default" {
  name = local.name
  depends_on = [
    module.eks.eks_managed_node_groups,
  ]
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.default.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.default.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.default.token
}

data "aws_ami" "eks_default" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amazon-eks-node-${local.cluster_version}-v*"]
  }
}

data "aws_iam_roles" "sso_admins" {
  name_regex  = "AWSReservedSSO_AdministratorAccess_.*"
  path_prefix = "/aws-reserved/sso.amazonaws.com/eu-west-1/"
}

data "aws_iam_roles" "sso_developers" {
  name_regex  = "AWSReservedSSO_DeveloperAccess_.*"
  path_prefix = "/aws-reserved/sso.amazonaws.com/eu-west-1/"
}

locals {
  name            = "dev"
  cluster_version = "1.25"
  region          = "eu-west-1"

  vpc_cidr = data.terraform_remote_state.vpc.outputs.vpc_cidr_block
  azs      = slice(data.aws_availability_zones.available.names, 0, 3)

  tags = {
    Environment = "dev"
    Team        = "DevOps"
    Terraform   = "true"
  }
}

data "aws_availability_zones" "available" {}
data "aws_caller_identity" "current" {}


################################################################################
# EKS Module
################################################################################

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.12"

  cluster_name                   = local.name
  cluster_version                = local.cluster_version
  cluster_endpoint_public_access = false

  cluster_addons = {
    coredns = {
      addon_version = "v1.9.3-eksbuild.2"

      timeouts = {
        create = "25m"
        delete = "10m"
      }
    }
    kube-proxy = {
      addon_version = "v1.25.6-eksbuild.2"
    }
    vpc-cni = {
      addon_version  = "v1.12.6-eksbuild.1"
      before_compute = true
      configuration_values = jsonencode({
        env = {
          # Reference docs https://docs.aws.amazon.com/eks/latest/userguide/cni-increase-ip-addresses.html
          ENABLE_PREFIX_DELEGATION = "true"
          WARM_PREFIX_TARGET       = "1"
        }
      })
    }
    aws-ebs-csi-driver = {
      addon_version            = "v1.17.0-eksbuild.1"
      service_account_role_arn = module.ebs_csi_irsa_role.iam_role_arn
    }
  }

  vpc_id                   = data.terraform_remote_state.vpc.outputs.vpc_id
  subnet_ids               = data.terraform_remote_state.vpc.outputs.private_subnets
  control_plane_subnet_ids = data.terraform_remote_state.vpc.outputs.intra_subnets

  # https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2009#issuecomment-1262099428
  cluster_security_group_additional_rules = {
    ingress = {
      description                = "EKS Cluster allows 443 port to get API call"
      type                       = "ingress"
      from_port                  = 443
      to_port                    = 443
      protocol                   = "TCP"
      cidr_blocks                = ["10.1.0.0/16"]
      source_node_security_group = false
    }
  }

  node_security_group_additional_rules = {
    node_to_node = {
      from_port = 0
      to_port   = 0
      protocol  = -1
      self      = true
      type      = "ingress"
    }
  }

  # EKS Managed Node Group(s)
  eks_managed_node_group_defaults = {
    attach_cluster_primary_security_group = true

    ami_type = "AL2_x86_64"

    instance_types = [
      "m5.large",
      "m5.xlarge",
      "m4.large",
      "m4.xlarge",
      "c3.large",
      "c3.xlarge",
      "t2.large",
      "t2.medium",
      "t2.xlarge",
      "t3.medium",
      "t3.large",
      "t3.xlarge"
    ]
    iam_role_additional_policies = {
      AmazonEC2ContainerRegistryReadOnly = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
    }
  }

  eks_managed_node_groups = {
    default = {
      description = "Default EKS managed node group"

      use_custom_launch_template = false

      remote_access = {
        ec2_ssh_key = data.terraform_remote_state.ssh_key.outputs.aws_key_pair_id
      }

      ami_id                     = data.aws_ami.eks_default.image_id
      enable_bootstrap_user_data = true
      bootstrap_extra_args       = "--kubelet-extra-args '--max-pods=50'"

      pre_bootstrap_user_data = <<-EOT
        export USE_MAX_PODS=false
      EOT

      min_size     = 1
      max_size     = 10
      desired_size = 1
      disk_size    = 20

      update_config = {
        max_unavailable_percentage = 33 # or set `max_unavailable`
      }

      labels = {
        node_group = "default"
      }
    }

    infra = {
      description                = "EKS managed node group for infra workloads"
      use_custom_launch_template = false

      remote_access = {
        ec2_ssh_key = data.terraform_remote_state.ssh_key.outputs.aws_key_pair_id
      }

      min_size     = 1
      max_size     = 10
      desired_size = 1
      disk_size    = 20

      update_config = {
        max_unavailable_percentage = 33 # or set `max_unavailable`
      }

      labels = {
        node_group = "infra"
      }

      taints = {
        dedicated = {
          key    = "infra"
          value  = "true"
          effect = "NO_SCHEDULE"
        }
      }
    }
  }

  # aws-auth configmap
  manage_aws_auth_configmap = true

  aws_auth_roles = [
    {
      rolearn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${one(data.aws_iam_roles.sso_admins.names)}"
      username = "sso-admin:{{SessionName}}"
      groups   = ["system:masters"]
    },
    {
      rolearn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${one(data.aws_iam_roles.sso_developers.names)}"
      username = "sso-developer:{{SessionName}}"
      groups   = ["system:masters"]
    },
  ]

  tags = local.tags
}

Expected behavior

Have 50 pods per node

Actual behavior

Have 17 pods per node

Additional context

I went through different issues but didn't find how to change max-pods. This suggestion doesn't work.

@insider89 (Contributor Author)

When I removed a few instance types and kept only those with a higher pod limit, I got a limit of 29 pods per node, but I still couldn't reach the goal of 110 pods.
When I left only the m2.large instance type, I got 110 pods per node, but that wasn't because of bootstrap_extra_args or any other configuration; it was set automatically, and I don't know why.

So the question still stands: how do I set max-pods to 110?

@Pionerd commented May 1, 2023

It looks like I have the same issue over here. Any updates from your side, @insider89?

@insider89 (Contributor Author)

@Pionerd I didn't find a way to set --max-pods in the EKS Terraform module. I figured out that if I provide different instance types in instance_types, it sets --max-pods to the lowest number among those instance types. So, first of all, I kept only instance types with the same amount of CPU and memory in the node group (as the cluster autoscaler cannot scale mixed instance types), and removed the instance types with the lowest max pods according to this list.
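
To check which instance type is dragging the limit down, the max-pods-calculator.sh script from awslabs/amazon-eks-ami reports the recommended value per instance type; roughly (download path and flags as documented in the AWS prefix delegation guide, CNI version assumed from the config above):

    curl -O https://raw.githubusercontent.com/awslabs/amazon-eks-ami/master/files/max-pods-calculator.sh
    chmod +x max-pods-calculator.sh

    # Recommended max pods per instance type with prefix delegation enabled
    ./max-pods-calculator.sh --instance-type m4.large --cni-version 1.12.6-eksbuild.1 --cni-prefix-delegation-enabled
    ./max-pods-calculator.sh --instance-type m5.large --cni-version 1.12.6-eksbuild.1 --cni-prefix-delegation-enabled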

@Pionerd commented May 1, 2023

I hate to say this, but I recreated my environment from scratch and now my max_pods are 110...
I suspect it has to do with configuring the VPC CNI before creation of the node pools.

The following is sufficient; no need for bootstrap_extra_args:

  cluster_addons = {
    vpc-cni = {
      most_recent          = true
      before_compute       = true
      configuration_values = jsonencode({
        env = {
          # Reference docs https://docs.aws.amazon.com/eks/latest/userguide/cni-increase-ip-addresses.html
          ENABLE_PREFIX_DELEGATION = "true"
          WARM_PREFIX_TARGET       = "1"
        }
      })
    }
  }
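
A quick way to confirm the resulting limit once the nodes are up (assumes kubectl access to the cluster):

    kubectl get nodes -o custom-columns='NAME:.metadata.name,MAXPODS:.status.capacity.pods'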

@insider89 (Contributor Author)

@Pionerd I have this flag enabled as well for the CNI plugin, but the max pods per node still depends on the instance types I provide in the instance_types variable; in my case it's 20 pods per node (because I have m4.large among the instance types). Here is my full configuration:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.13"

  cluster_name                   = local.name
  cluster_version                = local.cluster_version
  cluster_endpoint_public_access = false

  cluster_addons = {
    coredns = {
      addon_version = "v1.9.3-eksbuild.2"

      timeouts = {
        create = "25m"
        delete = "10m"
      }
    }
    kube-proxy = {
      addon_version = "v1.26.2-eksbuild.1"
    }
    vpc-cni = {
      addon_version  = "v1.12.6-eksbuild.1"
      before_compute = true
      configuration_values = jsonencode({
        env = {
          # Reference docs https://docs.aws.amazon.com/eks/latest/userguide/cni-increase-ip-addresses.html
          ENABLE_PREFIX_DELEGATION = "true"
          WARM_PREFIX_TARGET       = "1"
        }
      })
    }
    aws-ebs-csi-driver = {
      addon_version            = "v1.17.0-eksbuild.1"
      service_account_role_arn = module.ebs_csi_irsa_role.iam_role_arn
    }
  }

  vpc_id                   = data.terraform_remote_state.vpc.outputs.vpc_id
  subnet_ids               = data.terraform_remote_state.vpc.outputs.private_subnets
  control_plane_subnet_ids = data.terraform_remote_state.vpc.outputs.intra_subnets

  # https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2009#issuecomment-1262099428
  cluster_security_group_additional_rules = {
    ingress = {
      description                = "EKS Cluster allows 443 port to get API call"
      type                       = "ingress"
      from_port                  = 443
      to_port                    = 443
      protocol                   = "TCP"
      cidr_blocks                = ["10.1.0.0/16"]
      source_node_security_group = false
    }
  }

  node_security_group_additional_rules = {
    node_to_node = {
      from_port = 0
      to_port   = 0
      protocol  = -1
      self      = true
      type      = "ingress"
    }
  }

  # EKS Managed Node Group(s)
  eks_managed_node_group_defaults = {
    attach_cluster_primary_security_group = true

    iam_role_additional_policies = {
      AmazonEC2ContainerRegistryReadOnly = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
    }
  }

  eks_managed_node_groups = {
    default = {
      description = "Default EKS managed node group"

      use_custom_launch_template = false

      remote_access = {
        ec2_ssh_key = data.terraform_remote_state.ssh_key.outputs.aws_key_pair_id
      }

      instance_types = [
        "m5.large",
        "t2.large",
        "t3.large",
        "m5d.large",
        "m5a.large",
        "m5ad.large",
        "m5n.large",
        "m5dn.large",
        "m4.large",
      ]

      min_size     = 1
      max_size     = 15
      desired_size = 1
      disk_size    = 20

      update_config = {
        max_unavailable_percentage = 33 # or set `max_unavailable`
      }

      labels = {
        node_group = "default"
      }
    }

    infra = {
      description                = "EKS managed node group for infra workloads"
      use_custom_launch_template = false

      instance_types = [
        "m5.large",
        "t2.large",
        "t3.large",
        "m5d.large",
        "m5a.large",
        "m5ad.large",
        "m5n.large",
        "m5dn.large",
        "m4.large"
      ]

      remote_access = {
        ec2_ssh_key = data.terraform_remote_state.ssh_key.outputs.aws_key_pair_id
      }

      min_size     = 1
      max_size     = 15
      desired_size = 1
      disk_size    = 20

      update_config = {
        max_unavailable_percentage = 33 # or set `max_unavailable`
      }

      labels = {
        node_group = "infra"
      }

      taints = {
        dedicated = {
          key    = "infra"
          value  = "true"
          effect = "NO_SCHEDULE"
        }
      }
    }
  }

  # aws-auth configmap
  manage_aws_auth_configmap = true

  aws_auth_roles = [
    {
      rolearn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${one(data.aws_iam_roles.sso_admins.names)}"
      username = "sso-admin:{{SessionName}}"
      groups   = ["system:masters"]
    },
    {
      rolearn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${one(data.aws_iam_roles.sso_developers.names)}"
      username = "sso-developer:{{SessionName}}"
      groups   = ["system:masters"]
    },
  ]
}

@Pionerd commented May 1, 2023

Hi @insider89

Just ran into the issue again with exactly the same code as before. It looks like some kind of timing issue still. What worked for me (this time, no guarantees) is leaving the cluster intact, removing only the existing node group, and recreating it.
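
In Terraform terms, that amounts to forcing replacement of just the node group; something along these lines should work (the resource address is an assumption based on the module's internal naming, so check terraform state list first):

    # Find the exact resource address first
    terraform state list | grep aws_eks_node_group

    # Then force replacement of only that node group (address below is assumed)
    terraform apply -replace='module.eks.module.eks_managed_node_group["default"].aws_eks_node_group.this[0]'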

@SrDayne commented May 4, 2023

Hello guys.

For me it looks like the problem is not in Terraform itself but in AWS; it seems Amazon's bootstrap script overrides the provided values.
I use the following workaround:

eks_managed_node_groups = {
    dev = {
      name = "k8s-dev"

      instance_types = ["t3.medium"]

      enable_bootstrap_user_data = false
      
      pre_bootstrap_user_data = <<-EOT
        #!/bin/bash
        LINE_NUMBER=$(grep -n "KUBELET_EXTRA_ARGS=\$2" /etc/eks/bootstrap.sh | cut -f1 -d:)
        REPLACEMENT="\ \ \ \ \ \ KUBELET_EXTRA_ARGS=\$(echo \$2 | sed -s -E 's/--max-pods=[0-9]+/--max-pods=30/g')"
        sed -i '/KUBELET_EXTRA_ARGS=\$2/d' /etc/eks/bootstrap.sh
        sed -i "$${LINE_NUMBER}i $${REPLACEMENT}" /etc/eks/bootstrap.sh
      EOT

      min_size = 1
      max_size = 3
      desired_size = 2

      #taints = [
      #  {
      #    key = "node.cilium.io/agent-not-ready"
      #    value = "true"
      #    effect = "NoExecute"
      #  }
      #]
    }
  }

It is not an elegant solution, but it works. It replaces, on the fly, the line in the bootstrap script that is responsible for --kubelet-extra-args. Note that if you use a custom ami_id the setup could be a little different, but it should still work.
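
To make the substitution concrete, here is the same sed expression run against a sample argument string (input adapted from the launch template line quoted earlier in this issue):

    #!/bin/bash
    # Toy run of the substitution the workaround applies inside bootstrap.sh
    args="--node-labels=node_group=infra --register-with-taints=infra=true:NoSchedule --max-pods=17"
    echo "$args" | sed -E 's/--max-pods=[0-9]+/--max-pods=30/g'
    # -> --node-labels=node_group=infra --register-with-taints=infra=true:NoSchedule --max-pods=30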

As a result:
kubectl describe node ip-10-1-0-102.eu-south-1.compute.internal

Capacity:
  attachable-volumes-aws-ebs:  25
  cpu:                         2
  ephemeral-storage:           20959212Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      3943372Ki
  pods:                        30
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         1930m
  ephemeral-storage:           18242267924
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      3388364Ki
  pods:                        30

Edit: tried node autoscaling and recreated environments multiple times; the script works.

@bryantbiggs (Member)

  1. Yes - managed nodegroups own the bootstrap script in the user data, which leads to hacky work-arounds (see "How to enable containerd when using EKS managed node group" awslabs/amazon-eks-ami#844).
  2. The proper way to enable max pods is to set the intended values via the VPC CNI custom configuration. If the VPC CNI is configured before nodegroups are created and nodes are launched, EKS managed nodegroups will infer the proper value for max pods from the VPC CNI configuration. There is a flag that should be enabled to ensure the VPC CNI is created before the associated nodegroups; it has a default timeout of 30s that can be increased if necessary:
    # This sleep resource is used to provide a timed gap between the cluster creation and the downstream dependencies
    # that consume the outputs from here. Any of the values that are used as triggers can be used in dependencies
    # to ensure that the downstream resources are created after both the cluster is ready and the sleep time has passed.
    # This was primarily added to give addons that need to be configured BEFORE data plane compute resources
    # enough time to create and configure themselves before the data plane compute resources are created.
    resource "time_sleep" "this" {
      count = var.create ? 1 : 0

      create_duration = var.dataplane_wait_duration

      triggers = {
        cluster_name                       = aws_eks_cluster.this[0].name
        cluster_endpoint                   = aws_eks_cluster.this[0].endpoint
        cluster_version                    = aws_eks_cluster.this[0].version
        cluster_certificate_authority_data = aws_eks_cluster.this[0].certificate_authority[0].data
      }
    }

For now though, closing this out since there are no further actions (that I am aware of) that the module can take to improve upon this area.
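
For reference, the wait mentioned above surfaces as a module input; a minimal consumer-side sketch (assuming the variable from the snippet is exposed as dataplane_wait_duration, default 30s):

    module "eks" {
      source  = "terraform-aws-modules/eks/aws"
      version = "19.12"

      # ... cluster and node group configuration as above ...

      # Give addons created with `before_compute = true` (e.g. vpc-cni with
      # ENABLE_PREFIX_DELEGATION) extra time to configure themselves before
      # the managed node groups are created.
      dataplane_wait_duration = "120s"
    }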

@CostinaDamir

pre_bootstrap_user_data = <<-EOT
#!/bin/bash
LINE_NUMBER=$(grep -n "KUBELET_EXTRA_ARGS=$2" /etc/eks/bootstrap.sh | cut -f1 -d:)
REPLACEMENT="\ \ \ \ \ \ KUBELET_EXTRA_ARGS=$(echo $2 | sed -s -E 's/--max-pods=[0-9]+/--max-pods=30/g')"
sed -i '/KUBELET_EXTRA_ARGS=$2/d' /etc/eks/bootstrap.sh
sed -i "$${LINE_NUMBER}i $${REPLACEMENT}" /etc/eks/bootstrap.sh
EOT

I tried your workaround, but I get a tf error:

│ Error: Variables not allowed
│
│   on <value for var.eks_managed_node_groups> line 1:
│   (source code not available)

Any idea?

@ophintor

I think you need to escape all the $?

@SrDayne commented Jun 15, 2023

@CostinaDamir
As @ophintor said, you need to escape the $ characters. Copy the piece of code without any changes:

      pre_bootstrap_user_data = <<-EOT
        #!/bin/bash
        LINE_NUMBER=$(grep -n "KUBELET_EXTRA_ARGS=\$2" /etc/eks/bootstrap.sh | cut -f1 -d:)
        REPLACEMENT="\ \ \ \ \ \ KUBELET_EXTRA_ARGS=\$(echo \$2 | sed -s -E 's/--max-pods=[0-9]+/--max-pods=30/g')"
        sed -i '/KUBELET_EXTRA_ARGS=\$2/d' /etc/eks/bootstrap.sh
        sed -i "$${LINE_NUMBER}i $${REPLACEMENT}" /etc/eks/bootstrap.sh
      EOT

Also, you can replace --max-pods=30 with --max-pods=${var.cluster_max_pods} and set the number of pods with a variable.
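
For example (the variable name here is just an assumption; inside the heredoc, ${var.cluster_max_pods} is interpolated by Terraform, while the $$ and \$ sequences stay escaped so bash still receives them literally):

    variable "cluster_max_pods" {
      description = "Value substituted into --max-pods by the bootstrap.sh workaround"
      type        = number
      default     = 30
    }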

@github-actions (bot)

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators on Jul 16, 2023