
Commit 3fdf4a5

test(multicloud): Add EKS module, demo stack and tests (#1390)
# Description

* Create EKS module
* Create EKS example
* Create EKS unit and integration test with retina
* Create live/retina-eks to demo multi-cloud
* Update docs
* Update diagrams
* Update Makefile for this sub-project test/multicloud

## Related Issue

#1267

## Checklist

- [x] I have read the [contributing documentation](https://retina.sh/docs/Contributing/overview).
- [x] I signed and signed-off the commits (`git commit -S -s ...`). See [this documentation](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification) on signing commits.
- [x] I have correctly attributed the author(s) of the code.
- [x] I have tested the changes locally.
- [x] I have followed the project's style guidelines.
- [x] I have updated the documentation, if necessary.
- [x] I have added tests, if applicable.

## Screenshots (if applicable) or Testing Completed

Grafana Hubble DNS dashboard for EKS cluster

![Screenshot_26-2-2025_141028_srodi grafana net](https://github.com/user-attachments/assets/d5e43699-83f9-429f-b7df-127a6e238859)

EKS cluster showing AWS nodes and retina logs

![Screenshot 2025-02-26 131742](https://github.com/user-attachments/assets/2bb9ec2c-7b13-40af-b10e-607e02467ffa)

## Additional Notes

Add any additional notes or context about the pull request here.

---

Please refer to the [CONTRIBUTING.md](../CONTRIBUTING.md) file for more information on how to contribute to this project.
1 parent 6883d41 commit 3fdf4a5

34 files changed: +2991 -555 lines changed

test/multicloud/Makefile

+12
@@ -11,7 +11,15 @@ apply:
 	cd live/$(STACK_NAME) && \
 	tofu apply --auto-approve

+check-env-vars:
+	@if [ -z "$(GRAFANA_AUTH)" ]; then echo "GRAFANA_AUTH is not set"; exit 1; fi
+	@if [ -z "$(STACK_NAME)" ]; then echo "STACK_NAME is not set"; exit 1; fi
+	@if [ "$(STACK_NAME)" = "retina-gke" ] && [ -z "$(GOOGLE_APPLICATION_CREDENTIALS)" ]; then echo "GOOGLE_APPLICATION_CREDENTIALS is not set"; exit 1; fi
+	@if [ "$(STACK_NAME)" = "retina-eks" ] && [ -z "$(AWS_SECRET_ACCESS_KEY)" ]; then echo "AWS_SECRET_ACCESS_KEY is not set"; exit 1; fi
+	@if [ "$(STACK_NAME)" = "retina-eks" ] && [ -z "$(AWS_ACCESS_KEY_ID)" ]; then echo "AWS_ACCESS_KEY_ID is not set"; exit 1; fi
+
 quick:
+	@make check-env-vars
 	@make plan
 	@make apply

@@ -23,6 +31,10 @@ aks: export STACK_NAME=$(PREFIX)-aks
 aks:
 	@make quick

+eks: export STACK_NAME=$(PREFIX)-eks
+eks:
+	@make quick
+
 kind: export STACK_NAME=$(PREFIX)-kind
 kind:
 	@make quick
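
The new `check-env-vars` target stops `quick` before `plan` and `apply` run if a required variable is missing. As a rough illustration of the same checks outside of make, the sketch below mirrors the target's order and messages in plain shell. The function name `check_env_vars` is hypothetical and is not part of this commit.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit) mirroring the Makefile's
# check-env-vars target: print the first missing variable and return 1,
# or return 0 when everything the selected stack needs is set.
check_env_vars() {
    if [ -z "${GRAFANA_AUTH:-}" ]; then
        echo "GRAFANA_AUTH is not set"; return 1
    fi
    if [ -z "${STACK_NAME:-}" ]; then
        echo "STACK_NAME is not set"; return 1
    fi
    case "$STACK_NAME" in
        retina-gke)
            if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
                echo "GOOGLE_APPLICATION_CREDENTIALS is not set"; return 1
            fi
            ;;
        retina-eks)
            # Same order as the Makefile: secret key first, then key ID.
            if [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
                echo "AWS_SECRET_ACCESS_KEY is not set"; return 1
            fi
            if [ -z "${AWS_ACCESS_KEY_ID:-}" ]; then
                echo "AWS_ACCESS_KEY_ID is not set"; return 1
            fi
            ;;
    esac
    return 0
}
```

Note that, as in the Makefile, the stack-specific checks compare against the literal names `retina-gke` and `retina-eks`, so a different `PREFIX` would skip them.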

test/multicloud/README.md

+30-3
@@ -10,6 +10,7 @@ An example Hubble UI visualization on GKE dataplane v1 (no Cilium). [See GKE net

 * [aks](./modules/aks/): Deploy Azure Kubernetes Service cluster.
 * [gke](./modules/gke/): Deploy Google Kubernetes Engine cluster.
+* [eks](./modules/eks/): Deploy Elastic Kubernetes Service cluster.
 * [kind](./modules/kind/): Deploy KIND cluster.
 * [helm-release](./modules/helm-release/): Deploy a Helm Chart, used to deploy Retina and Prometheus.
 * [kubernetes-lb](./modules/kubernetes-lb/): Create a Kubernetes Service of type Load Balancer, used to expose Prometheus.
@@ -48,6 +49,19 @@ An example Hubble UI visualization on GKE dataplane v1 (no Cilium). [See GKE net
 export GOOGLE_APPLICATION_CREDENTIALS=/Users/srodi/src/retina/test/multicloud/live/retina-gke/service-key.json
 ```

+* EKS:
+1. Create an AWS account
+2. Create a user and assign required policies to create VPC, Subnets, Security Groups, IAM roles, EKS and workers
+3. [Install AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
+4. Create required `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` for the new user
+
+To deploy an EKS cluster export `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` as env variables.
+
+```sh
+export AWS_ACCESS_KEY_ID="..."
+export AWS_SECRET_ACCESS_KEY="..."
+```
+
 * Grafana

 1. Set up a [Grafana Cloud free account](https://grafana.com/pricing/) and start an instance.
@@ -85,6 +99,12 @@ Format code, initialize OpenTofu, plan and apply the stack to create infra and d
 make gke
 ```

+* EKS:
+
+```sh
+make eks
+```
+
 * Kind:

 ```sh
@@ -93,13 +113,13 @@ Format code, initialize OpenTofu, plan and apply the stack to create infra and d

 ### Clean up

-To destroy the cluster specify the `STACK_NAME` and run `make clean`.
+To destroy the cluster specify the `STACK_NAME` and run `make destroy`.

 ```sh
 # destroy AKS and cleanup local state files
 # set a different stack as needed (i.e. retina-gke, retina-kind)
 export STACK_NAME=retina-aks
-make clean
+make destroy
 ```

 ### Test
@@ -116,6 +136,7 @@ Resources documentation:

 * [GKE](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster)
 * [AKS](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster)
+* [EKS](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_cluster)
 * [Kind](https://registry.terraform.io/providers/tehcyx/kind/latest/docs/resources/cluster)
 * [Helm Release](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release)
 * [Kubernetes LB Service](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/service)
@@ -132,12 +153,18 @@ Here is an example on how to import resources for `modules/gke`:
 # i.e. examples/gke
 tofu import module.gke.google_container_cluster.gke europe-west2/test-gke-cluster
 tofu import module.gke.google_service_account.default projects/mc-retina/serviceAccounts/test-gke-service-account@mc-retina.iam.gserviceaccount.com
+
+# i.e. examples/eks
+tofu import module.eks.aws_eks_node_group.node_group mc-test-aks:mc-test-node-group
+tofu import module.eks.aws_iam_role.eks_node_group_role mc-test-eks-node-group-role
+tofu import module.eks.aws_iam_role_policy_attachment.eks_node_group_AmazonEKS_CNI_Policy "mc-test-eks-node-group-role/arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
+tofu import module.eks.aws_iam_role_policy_attachment.eks_node_group_AmazonEKSWorkerNodePolicy "mc-test-eks-node-group-role/arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
 ```

 >Note: each resource documentation contains a section on how to import resources into the State. [Example for google_container_cluster resource](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster#import).

 ## Multi-Cloud

-The [live/](./live/) directory contains the multi-cloud / multi-cluster stacks to deploy clusters, install Retina, install Prometheus, expose all Prometheus using load blanaces, and configure a Grafana Cloud instance to consume prometheus data sources to visualize multiple cluster in a single Grafana dashboard.
+The [live/](./live/) directory contains multi-cloud / multi-cluster stacks to deploy cloud infrastructure, install Retina, install Prometheus, expose Prometheus instance using a load balancer, and configure a Grafana Cloud instance to consume Prometheus data sources to visualize Retina metrics from multiple clusters in a single Grafana dashboard.

 ![Architecture Diagram](./diagrams/diagram-mc.svg)
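
The EKS import commands added to the README rely on two import ID shapes from the AWS provider: `cluster_name:node_group_name` for `aws_eks_node_group`, and `role-name/policy-arn` for `aws_iam_role_policy_attachment`. The sketch below shows how those IDs are assembled by printing the same four commands for given names. The function `eks_import_commands` is hypothetical and only prints text; it does not run `tofu`.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): print the tofu import commands
# from the README's EKS example for a given cluster, node group and role.
# Usage: eks_import_commands <cluster> <node-group> <node-group-role>
eks_import_commands() {
    cluster="$1"
    node_group="$2"
    role="$3"
    # aws_eks_node_group import ID: cluster_name:node_group_name
    printf 'tofu import module.eks.aws_eks_node_group.node_group %s:%s\n' \
        "$cluster" "$node_group"
    # aws_iam_role import ID: the role name
    printf 'tofu import module.eks.aws_iam_role.eks_node_group_role %s\n' "$role"
    # aws_iam_role_policy_attachment import ID: role-name/policy-arn
    for policy in AmazonEKS_CNI_Policy AmazonEKSWorkerNodePolicy; do
        printf 'tofu import module.eks.aws_iam_role_policy_attachment.eks_node_group_%s "%s/arn:aws:iam::aws:policy/%s"\n' \
            "$policy" "$role" "$policy"
    done
}
```

Calling `eks_import_commands mc-test-aks mc-test-node-group mc-test-eks-node-group-role` reproduces the README lines. The README example passes `mc-test-aks` as the cluster name; it is not clear from the commit whether that is the intended name of the EKS cluster.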
