Commit 0ec500b

feat(terraform): add support for using existing GKE clusters (#20)
* feat(terraform): add support for using existing GKE clusters
  Introduce a new variable `use_existing_cluster` to toggle between creating a new GKE cluster or using an existing one. Update resources and outputs to conditionally reference existing cluster data when enabled. Add a migration script and upgrade guide for transitioning from version 0.3.x to 0.4.x, ensuring a smooth upgrade process. Enhance variable descriptions for clarity and improve documentation.
* terraform-docs: automated action
* feat(gke): add conditional logic for existing cluster usage
  Introduce `use_existing_cluster` variable to conditionally create resources, allowing flexibility in using existing clusters. Update firewall rules and node pool configurations to respect this setting. Enhance variable descriptions with defaults for clarity and improve resource tagging and filtering capabilities.
* terraform-docs: automated action
* fix(gke.tf): remove conditional cluster assignment to ensure consistent cluster ID usage
  fix(gke.tf): adjust firewall rule count logic to account for shared VPC with existing cluster
* docs(variables.tf): update section header for clarity in cluster configuration
* docs(upgrade-guide): update guide for 0.4.x migration and remove obsolete script
  Remove placeholder migration script for 0.4.x as it is no longer needed. Update upgrade guide to include new `use_existing_cluster` variable and instructions for using `terraform state mv` for resource migration.
* chore(scripts): remove unused migration-script-4.sh file to clean up the codebase
* docs(upgrade-guide.md): correct terminology from 'diff' to 'drift' for clarity
  refactor(variables.tf): reorder variables for better logical grouping and readability
* terraform-docs: automated action

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent 7b19f3d commit 0ec500b

6 files changed

+122
-53
lines changed


README.md

+7-5
@@ -29,16 +29,17 @@ No modules.
 | [google_compute_firewall.fix_webhooks](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
 | [google_container_node_pool.control_plane_pool](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_node_pool) | resource |
 | [google_container_node_pool.generic](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_node_pool) | resource |
+| [google_container_cluster.existing_cluster](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/container_cluster) | data source |
 
 ## Inputs
 
 | Name | Description | Type | Default | Required |
 |------|-------------|------|---------|:--------:|
 | <a name="input_allowed_ip_ranges"></a> [allowed\_ip\_ranges](#input\_allowed\_ip\_ranges) | Allowed IP ranges to connect to master | `list(string)` | <pre>[<br/> "0.0.0.0/0"<br/>]</pre> | no |
-| <a name="input_cluster_generic_node_config"></a> [cluster\_generic\_node\_config](#input\_cluster\_generic\_node\_config) | Cluster Generic Node configuration | <pre>object({<br/> disk_size_gb = optional(string, "100")<br/> disk_type = optional(string, "pd-balanced")<br/> machine_type = optional(string, "e2-medium")<br/> enable_secure_boot = optional(bool, true)<br/> enable_integrity_monitoring = optional(bool, true)<br/> auto_repair = optional(bool, true)<br/> auto_upgrade = optional(bool, true)<br/> node_count = optional(number, 1)<br/> workload_metadata_config_mode = optional(string, "GKE_METADATA")<br/> service_account = optional(string, "default")<br/> preemptible = optional(bool, false)<br/> spot = optional(bool, true)<br/> })</pre> | `{}` | no |
+| <a name="input_cluster_generic_node_config"></a> [cluster\_generic\_node\_config](#input\_cluster\_generic\_node\_config) | Configuration for the generic node pool. This includes:<br/>- disk\_size\_gb: Size of the disk attached to each node (default: "100")<br/>- disk\_type: Type of disk attached to each node (pd-standard, pd-balanced, pd-ssd) (default: "pd-balanced")<br/>- machine\_type: The name of a Google Compute Engine machine type (default: "e2-medium")<br/>- enable\_secure\_boot: Secure Boot helps ensure that the system only runs authentic software (default: true)<br/>- enable\_integrity\_monitoring: Enables monitoring and attestation of the boot integrity (default: true)<br/>- auto\_repair: Flag to enable auto repair for the nodes (default: true)<br/>- auto\_upgrade: Flag to enable auto upgrade for the nodes (default: true)<br/>- node\_count: The number of nodes per instance group (default: 1)<br/>- workload\_metadata\_config\_mode: How to expose metadata to workloads running on the node (default: "GKE\_METADATA")<br/>- service\_account: The Google Cloud Platform Service Account (default: "default")<br/>- preemptible: Flag to enable preemptible nodes (default: false)<br/>- spot: Flag to enable spot instances (default: true) | <pre>object({<br/> disk_size_gb = optional(string, "100")<br/> disk_type = optional(string, "pd-balanced")<br/> machine_type = optional(string, "e2-medium")<br/> enable_secure_boot = optional(bool, true)<br/> enable_integrity_monitoring = optional(bool, true)<br/> auto_repair = optional(bool, true)<br/> auto_upgrade = optional(bool, true)<br/> node_count = optional(number, 1)<br/> workload_metadata_config_mode = optional(string, "GKE_METADATA")<br/> service_account = optional(string, "default")<br/> preemptible = optional(bool, false)<br/> spot = optional(bool, true)<br/> })</pre> | `{}` | no |
 | <a name="input_cluster_master_ipv4_cidr_block"></a> [cluster\_master\_ipv4\_cidr\_block](#input\_cluster\_master\_ipv4\_cidr\_block) | Master nodes ipv4 cidr | `string` | n/a | yes |
-| <a name="input_cluster_name"></a> [cluster\_name](#input\_cluster\_name) | Name of the cluster | `string` | n/a | yes |
-| <a name="input_cluster_nap_node_config"></a> [cluster\_nap\_node\_config](#input\_cluster\_nap\_node\_config) | Cluster NAP Node configuration | <pre>object({<br/> disk_size_gb = optional(string, "300")<br/> disk_type = optional(string, "pd-balanced")<br/> enable_secure_boot = optional(bool, true)<br/> enable_integrity_monitoring = optional(bool, true)<br/> autoscaling_profile = optional(string, "OPTIMIZE_UTILIZATION")<br/> max_cpu = optional(number, 1024)<br/> max_memory = optional(number, 8172)<br/> auto_repair = optional(bool, true)<br/> auto_upgrade = optional(bool, true)<br/> max_surge = optional(number, 1)<br/> max_unavailable = optional(number, 0)<br/> })</pre> | `{}` | no |
+| <a name="input_cluster_name"></a> [cluster\_name](#input\_cluster\_name) | Name of the cluster. If use\_existing\_cluster is enabled cluster\_name is used to fetch details of existing cluster | `string` | n/a | yes |
+| <a name="input_cluster_nap_node_config"></a> [cluster\_nap\_node\_config](#input\_cluster\_nap\_node\_config) | Configuration for the NAP node pool. This includes:<br/>- disk\_size\_gb: Size of the disk attached to each node (default: "300")<br/>- disk\_type: Type of disk attached to each node (pd-standard, pd-balanced, pd-ssd) (default: "pd-balanced")<br/>- enable\_secure\_boot: Secure Boot helps ensure that the system only runs authentic software (default: true)<br/>- enable\_integrity\_monitoring: Enables monitoring and attestation of the boot integrity (default: true)<br/>- autoscaling\_profile: Profile for autoscaling optimization (default: "OPTIMIZE\_UTILIZATION")<br/>- max\_cpu: Maximum CPU cores allowed per node (default: 1024)<br/>- max\_memory: Maximum memory in MB allowed per node (default: 8172)<br/>- auto\_repair: Flag to enable auto repair for the nodes (default: true)<br/>- auto\_upgrade: Flag to enable auto upgrade for the nodes (default: true)<br/>- max\_surge: Maximum number of nodes that can be created beyond the current size during updates (default: 1)<br/>- max\_unavailable: Maximum number of nodes that can be unavailable during updates (default: 0) | <pre>object({<br/> disk_size_gb = optional(string, "300")<br/> disk_type = optional(string, "pd-balanced")<br/> enable_secure_boot = optional(bool, true)<br/> enable_integrity_monitoring = optional(bool, true)<br/> autoscaling_profile = optional(string, "OPTIMIZE_UTILIZATION")<br/> max_cpu = optional(number, 1024)<br/> max_memory = optional(number, 8172)<br/> auto_repair = optional(bool, true)<br/> auto_upgrade = optional(bool, true)<br/> max_surge = optional(number, 1)<br/> max_unavailable = optional(number, 0)<br/> })</pre> | `{}` | no |
 | <a name="input_cluster_network_id"></a> [cluster\_network\_id](#input\_cluster\_network\_id) | Network ID for the cluster | `string` | n/a | yes |
 | <a name="input_cluster_networking_mode"></a> [cluster\_networking\_mode](#input\_cluster\_networking\_mode) | Networking mode for the cluster. Values can be VPC\_NATIVE (recommended) or ROUTES. VPC\_NATIVE is default after google-beta 5.0.0 | `string` | `"VPC_NATIVE"` | no |
 | <a name="input_cluster_node_locations"></a> [cluster\_node\_locations](#input\_cluster\_node\_locations) | AZ for nodes - this should match the region | `list(string)` | n/a | yes |
@@ -49,14 +50,15 @@ No modules.
 | <a name="input_deletion_protection"></a> [deletion\_protection](#input\_deletion\_protection) | Deletion protection enabled/disabled | `bool` | `false` | no |
 | <a name="input_enable_container_image_streaming"></a> [enable\_container\_image\_streaming](#input\_enable\_container\_image\_streaming) | Enable/disable container image streaming | `bool` | `true` | no |
 | <a name="input_kubernetes_version"></a> [kubernetes\_version](#input\_kubernetes\_version) | Version of GKE | `string` | `"1.28"` | no |
-| <a name="input_max_pods_per_node"></a> [max\_pods\_per\_node](#input\_max\_pods\_per\_node) | Maximum pods per node | `string` | `"32"` | no |
+| <a name="input_max_pods_per_node"></a> [max\_pods\_per\_node](#input\_max\_pods\_per\_node) | Maximum number of pods per node in this cluster. | `string` | `"32"` | no |
 | <a name="input_network_tags"></a> [network\_tags](#input\_network\_tags) | A list of network tags to add to all instances | `list(string)` | `[]` | no |
 | <a name="input_oauth_scopes"></a> [oauth\_scopes](#input\_oauth\_scopes) | Oauth Scopes to attach to the cluster | `list(string)` | <pre>[<br/> "https://www.googleapis.com/auth/cloud-platform",<br/> "https://www.googleapis.com/auth/devstorage.read_only",<br/> "https://www.googleapis.com/auth/logging.write",<br/> "https://www.googleapis.com/auth/monitoring",<br/> "https://www.googleapis.com/auth/service.management.readonly",<br/> "https://www.googleapis.com/auth/servicecontrol",<br/> "https://www.googleapis.com/auth/trace.append"<br/>]</pre> | no |
 | <a name="input_project"></a> [project](#input\_project) | GCP Project | `string` | n/a | yes |
 | <a name="input_region"></a> [region](#input\_region) | region | `string` | n/a | yes |
 | <a name="input_services_secondary_range_name"></a> [services\_secondary\_range\_name](#input\_services\_secondary\_range\_name) | VPC Secondary range name for services | `string` | `""` | no |
 | <a name="input_shared_vpc"></a> [shared\_vpc](#input\_shared\_vpc) | Flag to enable shared VPC | `bool` | `false` | no |
-| <a name="input_tags"></a> [tags](#input\_tags) | A map of tags to add to all resources | `map(string)` | `{}` | no |
+| <a name="input_tags"></a> [tags](#input\_tags) | A map of tags to add to all resources. Tags are key-value pairs used for grouping and filtering | `map(string)` | `{}` | no |
+| <a name="input_use_existing_cluster"></a> [use\_existing\_cluster](#input\_use\_existing\_cluster) | Flag to enable the use of an existing GKE cluster or create a new one | `bool` | `false` | no |
 
 ## Outputs
 

data.tf

+6
@@ -0,0 +1,6 @@
+data "google_container_cluster" "existing_cluster" {
+  count = var.use_existing_cluster ? 1 : 0
+  name = var.cluster_name
+  location = var.region
+  project = var.project
+}
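
For illustration, a caller that wants the module to attach to a cluster it did not create might look roughly like the sketch below. The module label, the `source` path, and every value are hypothetical placeholders; only the input names come from this commit's README table and gke.tf, and inputs hidden between the README hunks may also be required.

```hcl
# Hypothetical caller: module label, source path, and all values are placeholders.
module "gke" {
  source = "./modules/gke-cluster" # assumed path, not taken from the commit

  use_existing_cluster = true
  cluster_name         = "my-existing-cluster" # name the data source looks up
  region               = "us-central1"         # used as the data source location
  project              = "my-gcp-project"

  # Still listed as required in the README inputs table.
  cluster_master_ipv4_cidr_block = "172.16.0.0/28"
  cluster_network_id             = "projects/my-gcp-project/global/networks/my-vpc"
  cluster_node_locations         = ["us-central1-a"]

  # In gke.tf as of this commit, the webhook firewall rule and the control-plane
  # node pool still index google_container_cluster.cluster[0], so this sketch
  # keeps both switched off.
  shared_vpc            = true
  control_plane_enabled = false
}
```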

gke.tf

+7-6
@@ -1,5 +1,6 @@
 # See: https://registry.terraform.io/providers/hashicorp/google-beta/latest/docs/resources/container_cluster
 resource "google_container_cluster" "cluster" {
+  count = var.use_existing_cluster ? 0 : 1
   provider = google-beta
   project = var.project
   name = var.cluster_name
@@ -167,8 +168,9 @@ resource "google_container_cluster" "cluster" {
 ## Generic node pool
 ##########################################################################################
 resource "google_container_node_pool" "generic" {
+  count = var.use_existing_cluster ? 0 : 1
   name = "generic"
-  cluster = google_container_cluster.cluster.id
+  cluster = google_container_cluster.cluster[0].id
   location = var.region
   project = var.project
   node_locations = var.cluster_node_locations
@@ -209,7 +211,7 @@ resource "google_container_node_pool" "generic" {
 resource "google_container_node_pool" "control_plane_pool" {
   count = var.control_plane_enabled ? 1 : 0
   name = "control-plane"
-  cluster = google_container_cluster.cluster.id
+  cluster = google_container_cluster.cluster[0].id
   project = var.project
   location = var.region
   node_locations = var.cluster_node_locations
@@ -260,21 +262,20 @@ resource "google_container_node_pool" "control_plane_pool" {
 # *****************************************/
 resource "google_compute_firewall" "fix_webhooks" {
   # count = var.add_cluster_firewall_rules || var.add_master_webhook_firewall_rules ? 1 : 0
-  count = var.shared_vpc ? 0 : 1
+  count = var.use_existing_cluster && var.shared_vpc ? 0 : 1
   name = "${var.cluster_name}-webhook"
   description = "Allow Nodes access to Control Plane"
   project = var.project
   network = var.cluster_network_id
   priority = 1000
   direction = "INGRESS"
-
   source_ranges = [
-    "${google_container_cluster.cluster.endpoint}/32",
+    "${google_container_cluster.cluster[0].endpoint}/32",
     var.cluster_master_ipv4_cidr_block
   ]
 
   allow {
     protocol = "tcp"
     ports = ["443", "8443", "9443", "15017"]
   }
-}
+}
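
In this diff, `control_plane_pool` and `fix_webhooks` still index `google_container_cluster.cluster[0]`, which has no instances when `use_existing_cluster` is true. The new firewall `count` also evaluates to 1 whenever the two flags are not both true, so a plan with `use_existing_cluster = true` and `shared_vpc = false` would likely fail on that index. One possible way to make such references work in both modes is to resolve the cluster attributes once in a `locals` block, using the same conditional that output.tf uses. This is a sketch only, not part of the commit, and the local names are hypothetical.

```hcl
# Sketch only: the local names below are hypothetical and not part of this commit.
locals {
  # Pick the cluster ID from whichever object exists for the current setting.
  cluster_id = var.use_existing_cluster ? data.google_container_cluster.existing_cluster[0].id : google_container_cluster.cluster[0].id

  # Same selection for the API server endpoint used in the firewall source range.
  cluster_endpoint = var.use_existing_cluster ? data.google_container_cluster.existing_cluster[0].endpoint : google_container_cluster.cluster[0].endpoint
}

# Dependent resources could then reference the locals instead of cluster[0]:
#   cluster       = local.cluster_id
#   source_ranges = ["${local.cluster_endpoint}/32", var.cluster_master_ipv4_cidr_block]
```

Whether the node pool and firewall rule should be created at all for an existing cluster is a separate design decision that this sketch does not settle.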

output.tf

+5-5
@@ -5,22 +5,22 @@
 
 output "cluster_endpoint" {
   description = "Endpoint for your Kubernetes API server"
-  value = google_container_cluster.cluster.endpoint
+  value = var.use_existing_cluster ? data.google_container_cluster.existing_cluster[0].endpoint : google_container_cluster.cluster[0].endpoint
 }
 
 output "cluster_id" {
   description = "The id of the GKE cluster"
-  value = google_container_cluster.cluster.id
+  value = var.use_existing_cluster ? data.google_container_cluster.existing_cluster[0].id : google_container_cluster.cluster[0].id
 }
 
 output "cluster_name" {
   description = "The name of the GKE cluster"
-  value = element(split("/", google_container_cluster.cluster.id), length(split("/", google_container_cluster.cluster.id)) - 1)
+  value = var.use_existing_cluster ? data.google_container_cluster.existing_cluster[0].name : google_container_cluster.cluster[0].name
 }
 
 output "cluster_master_version" {
   description = "Master version for the cluster"
-  value = google_container_cluster.cluster.master_version
+  value = var.use_existing_cluster ? data.google_container_cluster.existing_cluster[0].master_version : google_container_cluster.cluster[0].master_version
 }
 
 output "cluster_secondary_range_name" {
@@ -31,4 +31,4 @@ output "cluster_secondary_range_name" {
 output "services_secondary_range_name" {
   description = "Cluster secondary range name for service IPs"
   value = var.services_secondary_range_name
-}
+}

upgrade-guide.md

+25
@@ -0,0 +1,25 @@
+# terraform-google-truefoundry-cluster-autopilot
+
+This guide helps you migrate your Terraform code across versions. Keeping your Terraform state on the latest version is always recommended.
+
+## Upgrade guide from 0.3.x to 0.4.x
+
+Changes: we introduced the `use_existing_cluster` variable, which allows you to use an existing cluster.
+A few resources now use `count` to support this feature.
+
+1. Ensure that you are running the latest version of 0.3.x
+2. Move to `0.4.0` and run the following commands
+
+```bash
+terraform init -upgrade
+
+terraform state mv 'google_container_cluster.cluster' 'google_container_cluster.cluster[0]'
+terraform state mv 'google_container_node_pool.generic' 'google_container_node_pool.generic[0]'
+terraform state mv 'google_container_node_pool.control_plane_pool' 'google_container_node_pool.control_plane_pool[0]' # If control plane is enabled, else skip this step
+```
+
+3. Run `terraform plan` to check if there is any drift
+
+```bash
+terraform plan
+```
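
As an alternative to running `terraform state mv` by hand, Terraform 1.1 and later support `moved` blocks, which record the same address changes in configuration so that they are applied during a normal plan and apply. The sketch below is not part of this commit and assumes the blocks would live inside the module next to gke.tf. It covers only the two resources that gain `count` in this diff; `control_plane_pool` already appears with `count` in the unchanged context lines of gke.tf.

```hcl
# Sketch only: not part of this commit. Requires Terraform >= 1.1.
moved {
  from = google_container_cluster.cluster
  to   = google_container_cluster.cluster[0]
}

moved {
  from = google_container_node_pool.generic
  to   = google_container_node_pool.generic[0]
}
```

With blocks like these in place, `terraform plan` would be expected to report the two objects as moved rather than as a destroy and a create.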
