
Commit 876d277

Merge pull request #86 from truefoundry/docs_website
Added config yml for docs website
2 parents bdbe1bd + 0d00d1c, commit 876d277

11 files changed: +129 −166 lines

_config.yml (+1)

@@ -0,0 +1 @@
+theme: jekyll-theme-midnight

docs/architecture.md (+26, −109)

@@ -1,131 +1,48 @@
----
-title: Elasti Architecture
----
+# Elasti Architecture
 
-<!-- START doctoc generated TOC please keep comment here to allow auto update -->
-<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
-**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
 
-- [Elasti Project Documentation](#elasti-project-documentation)
-- [1. Introduction](#1-introduction)
-- [Overview](#overview)
-- [Key Components](#key-components)
-- [2. Architecture](#2-architecture)
-- [Flow Description](#flow-description)
-- [3. Controller](#3-controller)
-- [4. Resolver](#4-resolver)
-- [5. Helm Values](#5-helm-values)
+Elasti comprises two main components: the operator (controller) and the resolver.
 
-<!-- END doctoc generated TOC please keep comment here to allow auto update -->
+- **Controller**: A Kubernetes controller built using kubebuilder. It monitors ElastiService resources and scales the target workload to 0 or 1 as needed.
+- **Resolver**: A service that intercepts incoming requests for scaled-down services, queues them, and notifies the elasti-controller to scale up the target service.
 
-# Elasti Project Documentation
+## Flow Description
 
-## 1. Introduction
+When we enable Elasti on a service, the service operates in one of three modes:
 
-### Overview
-The Elasti project is designed to enable serverless capability for Kubernetes services by dynamically scaling services based on incoming requests. It comprises two main components: operator and resolver. The elasti-operator manages the scaling of target services, while the resolver intercepts and queues requests when the target service is scaled down to zero replicas.
-
-### Key Components
-- **Operator**: A Kubernetes controller built using kubebuilder. It monitors ElastiService resources and scales target services as needed.
-- **Resolver**: A service that intercepts incoming requests for scaled-down services, queues them, and notifies the elasti-operator to scale up the target service.
+1. **Steady State**: The service is receiving traffic and doesn't need to be scaled down to 0.
+2. **Scale Down to 0**: The service hasn't received any traffic for the configured duration and can be scaled down to 0.
+3. **Scale up from 0**: The service receives traffic again and is scaled up to the configured minTargetReplicas.
 
 <div align="center">
-<img src="./assets/components.png" width="500px">
+<img src="./assets/architecture/flow.png" width="1000px">
 </div>
 
+### Steady state flow of requests to service
+
+In this mode, all requests are handled directly by the service pods; the Elasti resolver doesn't come into the picture. The Elasti controller keeps polling Prometheus with the configured query and compares the result against the threshold to decide whether the service can be scaled down (see the trigger sketch below).
 
-## 2. Architecture
 <div align="center">
-<img src="./assets/hld.png" width="1000px">
+<img src="./assets/architecture/1.png" width="1000px">
 </div>
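
To make this concrete, the check is driven by a Prometheus trigger configured on the ElastiService (see [Configure ElastiService](./configure-elastiservice.md)); a minimal sketch, reusing the example query and threshold from that page, looks like:

```yaml
# Sketch of a trigger the controller evaluates (roughly every 30 seconds).
# If the query result stays below the threshold, the service becomes a scale-down candidate.
triggers:
  - type: prometheus
    metadata:
      query: sum(rate(nginx_ingress_controller_nginx_process_requests_total[1m])) or vector(0)  # example query
      serverAddress: http://kube-prometheus-stack-prometheus.monitoring.svc.cluster.local:9090  # example Prometheus address
      threshold: 0.5
```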
 
-### Flow Description
-
-- **[CRD Created]** The Operator fetches details from the CRD.
-  1. Adds a finalizer to the CRD, ensuring it is only deleted by the Operator for proper cleanup.
-  2. Fetches the `ScaleTargetRef` and initiates a watch on it.
-  3. Adds the CRD details to a `crdDirectory`, caching the details of all CRDs.
-- **[ScaleTargetRef Watch]** When a watch is added to the `ScaleTargetRef`:
-  1. Identifies the kind of target and checks the available ready pods.
-  2. If `replicas == 0` -> Switches to **Proxy Mode**.
-  3. If `replicas > 0` -> Switches to **Serve Mode**.
-  4. Currently, it supports only `deployments` and `rollouts`.
-
-- **When pods scale to 0**
-
-  - **[Switch to Proxy Mode]**
-    1. Creates a Private Service for the target service. This allows the resolver to reach the target pod, even when the public service has been modified, as described in the following steps.
-    2. Creates a watch on the public service to monitor changes in ports or selectors.
-    3. Creates a new `EndpointSlice` for the public service to redirect any traffic to the resolver.
-    4. Creates a watch on the resolver to monitor the addition of new pods.
-
-  - **[In Proxy Mode]**
-    1. Traffic reaching the target service, which has no pods, is sent to the resolver, capable of handling requests on all endpoints.
-    2. [**In Resolver**]
-       1. Once traffic hits the resolver, it reaches the `handleAnyRequest` handler.
-       2. The host is extracted from the request. If it's a known host, the cache is retrieved from `hostManager`. If not, the service name is extracted from the host and saved in `hostManager`.
-       3. The service name is used to identify the private service.
-       4. Using `operatorRPC`, the controller is informed about the incoming request.
-       5. The request is sent to the `throttler`, which queues the requests. It checks if the pods for the private service are up.
-          1. If yes, a proxy request is made, and the response is sent back.
-          2. If no, the request is re-enqueued, and the check is retried after a configurable time interval (set in the Helm values file).
-       6. If the request is successful, traffic for this host is disabled temporarily (configurable). This prevents new incoming requests to the resolver, as the target is now verified to be up.
-    3. [**In Controller/Operator**]
-       1. ElastiServer processes requests from the resolver, containing the service experiencing traffic.
-       2. Matches the service with the `crdDirectory` entry to retrieve the `ScaleTargetRef`, which is then used to scale the target.
-       3. Evaluates triggers defined in the ElastiService:
-          - If **any** trigger indicates that the service should be scaled up -> Scales to minTargetReplicas
-       4. Once scaled up, switches to **Serve Mode**
-
-- **When pods scale to 1**
-
-  - **[Switch to Serve Mode]**
-    1. The Operator stops the informer/watch on the resolver.
-    2. The Operator deletes the `EndpointSlice` pointing to the resolver.
-    3. The system switches to **Serve Mode**.
-  - **[In Serve Mode]**
-    1. Traffic hits the gateway, is routed to the target service, then to the target pod, and resolves the request.
-    2. The Operator periodically evaluates triggers defined in the ElastiService.
-    3. If **all** triggers indicate that the service is to be scaled down and cooldownPeriod has elapsed since last scale-up:
-       - Scales down the target service to zero replicas
-       - Switches to **Proxy Mode**
-
-
-## 3. Controller
+### Scale down to 0 when there are no requests
+
+If the query from Prometheus returns a value less than the threshold, Elasti will scale the service down to 0. Before scaling to 0, it redirects incoming requests to the Elasti resolver and then modifies the Rollout/Deployment to have 0 replicas. It also pauses KEDA (if KEDA is being used) to prevent it from scaling the service back up, since KEDA is configured with minReplicas of 1 (see the sketch below).
 
 <div align="center">
-<img src="./assets/lld-operator.png" width="1000px">
+<img src="./assets/architecture/2.png" width="1000px">
 </div>
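
KEDA supports pausing autoscaling via an annotation on the ScaledObject; a minimal sketch of a paused ScaledObject is shown below. Whether Elasti applies exactly this annotation is an assumption here, and all names are illustrative.

```yaml
# Sketch: a KEDA ScaledObject held at 0 replicas via KEDA's pause annotation.
# Assumption: this annotation-based pause is one way the "pause Keda" step could look;
# the names below (httpbin-scaled-object, elasti-demo, httpbin) are illustrative only.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: httpbin-scaled-object
  namespace: elasti-demo
  annotations:
    autoscaling.keda.sh/paused-replicas: "0"   # KEDA keeps the target at 0 while this annotation is set
spec:
  scaleTargetRef:
    name: httpbin                              # the Deployment/Rollout that KEDA scales
  minReplicaCount: 1                           # why pausing is needed: KEDA would otherwise restore 1 replica
  maxReplicaCount: 5
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"
```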
 
-## 4. Resolver
+### Scale up from 0 when the first request arrives
 
+Since the service is scaled down to 0, all requests hit the Elasti resolver. When the first request arrives, Elasti scales the service up to the configured minTargetReplicas. It then resumes KEDA so autoscaling can continue in case there is a sudden burst of requests, and points the service back to the actual service pods once a pod is up. Requests that reached the Elasti resolver are retried for up to 5 minutes and the response is sent back to the client; if the pod takes more than 5 minutes to come up, the request is dropped.
 
 <div align="center">
-<img src="./assets/lld-resolver.png" width="800px">
+<img src="./assets/architecture/3.png" width="1000px">
 </div>
 
-## 5. Helm Values
-
-Values you can pass to elastiResolver env.
-```yaml
-
-# HeaderForHost is the header to look for to get the host. X-Envoy-Decorator-Operation is the key for istio
-headerForHost: X-Envoy-Decorator-Operation
-# InitialCapacity is the initial capacity of the semaphore
-initialCapacity: "500"
-maxIdleProxyConns: "100"
-maxIdleProxyConnsPerHost: "500"
-# MaxQueueConcurrency is the maximum number of concurrent requests
-maxQueueConcurrency: "100"
-# OperatorRetryDuration is the duration for which we don't inform the operator
-# about the traffic on the same host
-operatorRetryDuration: "10"
-# QueueRetryDuration is the duration after we retry the requests in queue
-queueRetryDuration: "3"
-# QueueSize is the size of the queue
-queueSize: "50000"
-# ReqTimeout is the timeout for each request
-reqTimeout: "120"
-# TrafficReEnableDuration is the duration for which the traffic is disabled for a host
-# This is also duration for which we don't recheck readiness of the service
-trafficReEnableDuration: "5"
-```
+
+<div align="center">
+<img src="./assets/architecture/4.png" width="1000px">
+</div>

docs/assets/architecture/1.png (647 KB)

docs/assets/architecture/2.png (536 KB)

docs/assets/architecture/3.png (653 KB)

docs/assets/architecture/4.png (622 KB)

docs/assets/architecture/flow.png (373 KB)

docs/configure-elastiservice.md (+81)

@@ -0,0 +1,81 @@
+# Configure ElastiService
+
+To enable scale to 0 on any deployment, we will need to create an ElastiService custom resource for that deployment.
+
+An ElastiService custom resource has the following structure:
+
+```yaml
+apiVersion: elasti.truefoundry.com/v1alpha1
+kind: ElastiService
+metadata:
+  name: <service-name>
+  namespace: <service-namespace>
+spec:
+  minTargetReplicas: <min-target-replicas>
+  service: <service-name>
+  cooldownPeriod: <cooldown-period>
+  scaleTargetRef:
+    apiVersion: <apiVersion>
+    kind: <kind>
+    name: <deployment-or-rollout-name>
+  triggers:
+    - type: <trigger-type>
+      metadata:
+        <trigger-metadata>
+  autoscaler:
+    name: <autoscaler-object-name>
+    type: <autoscaler-type>
+```
+
+The key fields to be specified in the spec are:
+
+- `<service-name>`: Replace with the service you want managed by Elasti.
+- `<service-namespace>`: Replace with the namespace of the service.
+- `<min-target-replicas>`: Minimum replicas to bring up when the first request arrives.
+- `scaleTargetRef`: Reference to the scale target, similar to the one used in HorizontalPodAutoscaler.
+  - `<kind>`: Replace with `rollouts` or `deployments`.
+  - `<apiVersion>`: Replace with `argoproj.io/v1alpha1` or `apps/v1`.
+  - `<deployment-or-rollout-name>`: Replace with the name of the rollout or deployment backing the service. This will be scaled up to `minTargetReplicas` when the first request comes.
+- `cooldownPeriod`: Minimum time (in seconds) to wait after scaling up before considering scale down.
+- `triggers`: List of conditions that determine when to scale down (currently supports only Prometheus metrics).
+- `autoscaler`: **Optional** integration with an external autoscaler (HPA/KEDA) if needed.
+  - `<autoscaler-type>`: keda
+  - `<autoscaler-object-name>`: Name of the KEDA ScaledObject.
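
Putting it together, a filled-in manifest might look like the sketch below; the names and values are illustrative, borrowed from the httpbin demo in the getting-started guide and the example trigger later on this page.

```yaml
# Illustrative example only: names/values assume the httpbin demo from the getting-started guide.
apiVersion: elasti.truefoundry.com/v1alpha1
kind: ElastiService
metadata:
  name: httpbin-elasti
  namespace: elasti-demo
spec:
  minTargetReplicas: 1
  service: httpbin
  cooldownPeriod: 300
  scaleTargetRef:
    apiVersion: apps/v1
    kind: deployments
    name: httpbin
  triggers:
    - type: prometheus
      metadata:
        query: sum(rate(nginx_ingress_controller_nginx_process_requests_total[1m])) or vector(0)
        serverAddress: http://kube-prometheus-stack-prometheus.monitoring.svc.cluster.local:9090
        threshold: 0.5
  autoscaler:                      # optional; only if KEDA also scales this workload
    name: httpbin-scaled-object    # hypothetical ScaledObject name
    type: keda
```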
+
+
+## Configuration Explanation
+
+The section below explains how the different configuration options are used in Elasti.
+
+### Which service to apply Elasti on
+
+This is defined using the `scaleTargetRef` field in the spec.
+
+- `scaleTargetRef.kind`: should be either `deployments` or `rollouts` (in case you are using Argo Rollouts).
+- `scaleTargetRef.apiVersion`: will be `apps/v1` if you are using deployments, or `argoproj.io/v1alpha1` in case you are using Argo Rollouts.
+- `scaleTargetRef.name`: should exactly match the name of the deployment or rollout.
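
For example, for a service deployed with Argo Rollouts, the reference might look like this (the rollout name is illustrative):

```yaml
scaleTargetRef:
  apiVersion: argoproj.io/v1alpha1
  kind: rollouts
  name: httpbin   # illustrative rollout name; must match the Rollout exactly
```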
+
+### When to scale down the service to 0
+
+This is defined using the `triggers` field in the spec. Currently, Elasti supports only one trigger type - `prometheus`. The `metadata` field of the trigger defines the trigger data: the `query` field is the Prometheus query to use for the trigger, the `serverAddress` field is the address of the Prometheus server, and the `threshold` field is the threshold value to compare against. For example, we can define a query that measures the number of requests per second and set a threshold near 0. Elasti checks this metric every 30 seconds, and if the value is less than the `threshold`, it scales the service down to 0.
+
+An example trigger is as follows:
+
+```yaml
+triggers:
+  - type: prometheus
+    metadata:
+      query: sum(rate(nginx_ingress_controller_nginx_process_requests_total[1m])) or vector(0)
+      serverAddress: http://kube-prometheus-stack-prometheus.monitoring.svc.cluster.local:9090
+      threshold: 0.5
+```
+
+Once the service is scaled down to 0, we also need to pause the current autoscaler to make sure it doesn't scale up the service again. While this is not a problem with HPA, KEDA will scale the service back up since its minimum replicas is 1. Hence Elasti needs to know about the KEDA ScaledObject so that it can pause it. This information is provided in the `autoscaler` field of the ElastiService. The only autoscaler type supported as of now is `keda`.
+
+- `autoscaler.name`: Name of the KEDA ScaledObject
+- `autoscaler.type`: keda
+
+### When to scale up the service to 1
+
+As soon as the service is scaled down to 0, the Elasti resolver will start accepting requests for that service. On receiving the first request, it will scale up the service to `minTargetReplicas`. Once a pod is up, new requests are handled by the service pods and do not pass through the elasti-resolver. The requests that arrived before the pod scaled up are held in memory by the elasti-resolver and are processed once the pod is up.
+
+We can configure the `cooldownPeriod` to specify the minimum time (in seconds) to wait after scaling up before considering scale down.

docs/getting-started.md (+5, −41)

@@ -1,6 +1,6 @@
 # Getting Started
 
-With Elasti, you can easily manage and scale your Kubernetes services by using a proxy mechanism that queues and holds requests for scaled-down services, bringing them up only when needed. Get started by following below steps:
+Get started by following the steps below:
 
 ## Prerequisites
 
@@ -12,12 +12,13 @@ With Elasti, you can easily manage and scale your Kubernetes services by using a
 
 ### 1. Install Elasti using helm
 
-Use Helm to install elasti into your Kubernetes cluster. Replace `<release-name>` with your desired release name and `<namespace>` with the Kubernetes namespace you want to use:
+Use Helm to install elasti into your Kubernetes cluster.
 
 ```bash
 helm install elasti oci://tfy.jfrog.io/tfy-helm/elasti --namespace elasti --create-namespace
 ```
-Check out [values.yaml](./charts/elasti/values.yaml) to see config in the helm value file.
+
+Check out [values.yaml](https://github.com/truefoundry/elasti/blob/main/charts/elasti/values.yaml) to see the config in the Helm values file.
 
 ### 2. Verify the Installation
 
@@ -78,43 +79,6 @@ This will deploy a httpbin service in the `elasti-demo` namespace.
 ### 6. Define an ElastiService
 
 To configure a service to handle its traffic via elasti, you'll need to create and apply an `ElastiService` custom resource:
-```yaml
-apiVersion: elasti.truefoundry.com/v1alpha1
-kind: ElastiService
-metadata:
-  name: <service-name>-elasti
-  namespace: <service-namespace>
-spec:
-  minTargetReplicas: 1
-  service: <service-name>
-  cooldownPeriod: 300
-  scaleTargetRef:
-    apiVersion: <api-version>
-    kind: <kind>
-    name: <deployment-or-rollout-name>
-  triggers:
-    - type: <trigger-type>
-      metadata:
-        query: <prometheus-query>
-        serverAddress: <prometheus-server-address>
-        threshold: <threshold>
-  autoscaler:
-    type: <autoscaler-type>
-    name: <autoscaler-object-name>
-```
-
-- `<service-name>`: Replace it with the service you want managed by elasti.
-- `<min-target-replicas>`: Min replicas to bring up when first request arrives.
-- `<service-namespace>`: Replace by namespace of the service.
-- `<scaleTargetRef>`: Reference to the scale target similar to the one used in HorizontalPodAutoscaler.
-- `<kind>`: Replace by `rollouts` or `deployments`
-- `<apiVersion>`: Replace with `argoproj.io/v1alpha1` or `apps/v1`
-- `<deployment-or-rollout-name>`: Replace with name of the rollout or the deployment for the service. This will be scaled up to min-target-replicas when first request comes
-- `cooldownPeriod`: Minimum time (in seconds) to wait after scaling up before considering scale down
-- `triggers`: List of conditions that determine when to scale down (currently supports only Prometheus metrics)
-- `autoscaler`: **Optional** integration with an external autoscaler (HPA/KEDA) if needed
-- `<autoscaler-type>`: hpa/keda
-- `<autoscaler-object-name>`: name of the KEDA ScaledObject or HPA HorizontalPodAutoscaler object
 
 Create a file named `httpbin-elasti.yaml` and apply the configuration.
 ```yaml
@@ -170,7 +134,7 @@ curl -v http://localhost:8080/httpbin
 ```
 
 You should see the pods being created and scaled up to 1 replica. A response from the httpbin service should be visible for the curl command.
-The service should be scaled down to 0 replicas if there is no traffic for `cooldownPeriod` seconds.
+The service should be scaled down to 0 replicas if there is no traffic for 5 seconds (the `cooldownPeriod` set in the ElastiService).
 
 ## Uninstall
 

docs/index.md (+9, −6)

@@ -1,14 +1,17 @@
----
-layout: default
----
-
 # Elasti
+Enable Scale to 0 on Kubernetes while using HPA or Keda.
 
 
-- [Introduction to Elasti](introduction.md)
+- [Introduction](introduction.md)
 - [Getting Started](getting-started.md)
-- [Monitoring Elasti](monitoring.md)
+- [Configure ElastiService](configure-elastiservice.md)
 - [Architecture](architecture.md)
+- [Monitoring Elasti](monitoring.md)
 - [Integrations](integrations.md)
+  - [HPA](integrations.md#hpa)
+  - [Keda](integrations.md#keda)
 - [Comparisons](comparisons.md)
+  - [Knative](./comparisons.md#)
+  - [OpenFaas](./comparisons.md#openfaas)
+  - [Keda Http Add-on](./comparisons.md#keda-http-add-on)
 - [Development](../DEVELOPMENT.md)
