|
1 |
| ---- |
2 |
| -title: Elasti Architecture |
3 |
| ---- |
| 1 | +# Elasti Architecture |
4 | 2 |
|
5 |
| -<!-- START doctoc generated TOC please keep comment here to allow auto update --> |
6 |
| -<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --> |
7 |
| -**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* |
8 | 3 |
|
9 |
| -- [Elasti Project Documentation](#elasti-project-documentation) |
10 |
| - - [1. Introduction](#1-introduction) |
11 |
| - - [Overview](#overview) |
12 |
| - - [Key Components](#key-components) |
13 |
| - - [2. Architecture](#2-architecture) |
14 |
| - - [Flow Description](#flow-description) |
15 |
| - - [3. Controller](#3-controller) |
16 |
| - - [4. Resolver](#4-resolver) |
17 |
| - - [5. Helm Values](#5-helm-values) |
| 4 | +Elasti comprises two main components: controller and resolver.
18 | 5 |
|
19 |
| -<!-- END doctoc generated TOC please keep comment here to allow auto update --> |
| 6 | +- **Controller**: A Kubernetes controller built using kubebuilder. It monitors ElastiService resources and scales their target services to 0 or 1 as needed.
| 7 | +- **Resolver**: A service that intercepts incoming requests for scaled-down services, queues them, and notifies the elasti-controller to scale up the target service. |
20 | 8 |
|
21 |
| -# Elasti Project Documentation |
| 9 | +## Flow Description |
22 | 10 |
|
23 |
| -## 1. Introduction |
| 11 | +When we enable Elasti on a service, the service operates in one of three modes:
24 | 12 |
|
25 |
| -### Overview |
26 |
| -The Elasti project is designed to enable serverless capability for Kubernetes services by dynamically scaling services based on incoming requests. It comprises two main components: operator and resolver. The elasti-operator manages the scaling of target services, while the resolver intercepts and queues requests when the target service is scaled down to zero replicas. |
27 |
| - |
28 |
| -### Key Components |
29 |
| -- **Operator**: A Kubernetes controller built using kubebuilder. It monitors ElastiService resources and scales target services as needed. |
30 |
| -- **Resolver**: A service that intercepts incoming requests for scaled-down services, queues them, and notifies the elasti-operator to scale up the target service. |
| 13 | +1. **Steady State**: The service is receiving traffic and doesn't need to be scaled down to 0. |
| 14 | +2. **Scale Down to 0**: The service hasn't received any traffic for the configured duration and can be scaled down to 0. |
| 15 | +3. **Scale Up from 0**: The service receives traffic again and can be scaled up to the configured minTargetReplicas.
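
One way to read the three modes above is as a small state machine. The sketch below is illustrative only: `Mode`, `next_mode`, and the three boolean inputs are hypothetical names chosen for this example and are not taken from the Elasti codebase.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical labels mirroring the three modes listed above.
    STEADY = "steady state"
    SCALED_TO_ZERO = "scaled down to 0"
    SCALING_UP = "scaling up from 0"


def next_mode(mode: Mode, traffic_seen: bool, idle_long_enough: bool, pods_ready: bool) -> Mode:
    """Return the mode that follows `mode` given the current observations."""
    if mode is Mode.STEADY and idle_long_enough:
        # No traffic for the configured duration: the service can go to 0.
        return Mode.SCALED_TO_ZERO
    if mode is Mode.SCALED_TO_ZERO and traffic_seen:
        # Traffic arrived while at 0: start scaling up.
        return Mode.SCALING_UP
    if mode is Mode.SCALING_UP and pods_ready:
        # Pods are up again: requests go straight to the service.
        return Mode.STEADY
    # Otherwise nothing changes.
    return mode
```

For example, `next_mode(Mode.SCALED_TO_ZERO, traffic_seen=True, idle_long_enough=False, pods_ready=False)` moves the service into the scale-up mode.
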
31 | 16 |
|
32 | 17 | <div align="center">
|
33 |
| -<img src="./assets/components.png" width="500px"> |
| 18 | +<img src="./assets/architecture/flow.png" width="1000px"> |
34 | 19 | </div>
|
35 | 20 |
|
| 21 | +### Steady state flow of requests to service |
| 22 | + |
| 23 | +In this mode, all requests are handled directly by the service pods; the Elasti resolver is not involved. The Elasti controller keeps polling Prometheus with the configured query and compares the result against the threshold value to decide whether the service can be scaled down.
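
The polling check described above can be sketched roughly as follows. This is a minimal illustration, not the controller's actual code: `should_scale_down` and `run_query` are hypothetical names, and the query function is injected so that no particular Prometheus client API is assumed.

```python
from typing import Callable


def should_scale_down(run_query: Callable[[str], float], query: str, threshold: float) -> bool:
    """Return True when the query result is below the threshold.

    `run_query` is a hypothetical callable that executes the configured
    query against Prometheus and returns a single numeric value.
    """
    value = run_query(query)
    # A value below the threshold means the service is a candidate for scaling to 0.
    return value < threshold
```

In a real controller this check would run on a timer, and a `True` result would trigger the scale-down flow described in the next section.
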
36 | 24 |
|
37 |
| -## 2. Architecture |
38 | 25 | <div align="center">
|
39 |
| -<img src="./assets/hld.png" width="1000px"> |
| 26 | +<img src="./assets/architecture/1.png" width="1000px"> |
40 | 27 | </div>
|
41 | 28 |
|
42 |
| -### Flow Description |
43 |
| - |
44 |
| -- **[CRD Created]** The Operator fetches details from the CRD. |
45 |
| - 1. Adds a finalizer to the CRD, ensuring it is only deleted by the Operator for proper cleanup. |
46 |
| - 2. Fetches the `ScaleTargetRef` and initiates a watch on it. |
47 |
| - 3. Adds the CRD details to a `crdDirectory`, caching the details of all CRDs. |
48 |
| -- **[ScaleTargetRef Watch]** When a watch is added to the `ScaleTargetRef`: |
49 |
| - 1. Identifies the kind of target and checks the available ready pods. |
50 |
| - 2. If `replicas == 0` -> Switches to **Proxy Mode**. |
51 |
| - 3. If `replicas > 0` -> Switches to **Serve Mode**. |
52 |
| - 4. Currently, it supports only `deployments` and `rollouts`. |
53 |
| - |
54 |
| -- **When pods scale to 0** |
55 |
| - |
56 |
| -- **[Switch to Proxy Mode]** |
57 |
| - 1. Creates a Private Service for the target service. This allows the resolver to reach the target pod, even when the public service has been modified, as described in the following steps. |
58 |
| - 2. Creates a watch on the public service to monitor changes in ports or selectors. |
59 |
| - 3. Creates a new `EndpointSlice` for the public service to redirect any traffic to the resolver. |
60 |
| - 4. Creates a watch on the resolver to monitor the addition of new pods. |
61 |
| - |
62 |
| -- **[In Proxy Mode]** |
63 |
| - 1. Traffic reaching the target service, which has no pods, is sent to the resolver, capable of handling requests on all endpoints. |
64 |
| - 2. [**In Resolver**] |
65 |
| - 1. Once traffic hits the resolver, it reaches the `handleAnyRequest` handler. |
66 |
| - 2. The host is extracted from the request. If it's a known host, the cache is retrieved from `hostManager`. If not, the service name is extracted from the host and saved in `hostManager`. |
67 |
| - 3. The service name is used to identify the private service. |
68 |
| - 4. Using `operatorRPC`, the controller is informed about the incoming request. |
69 |
| - 5. The request is sent to the `throttler`, which queues the requests. It checks if the pods for the private service are up. |
70 |
| - 1. If yes, a proxy request is made, and the response is sent back. |
71 |
| - 2. If no, the request is re-enqueued, and the check is retried after a configurable time interval (set in the Helm values file). |
72 |
| - 6. If the request is successful, traffic for this host is disabled temporarily (configurable). This prevents new incoming requests to the resolver, as the target is now verified to be up. |
73 |
| - 3. [**In Controller/Operator**] |
74 |
| - 1. ElastiServer processes requests from the resolver, containing the service experiencing traffic. |
75 |
| - 2. Matches the service with the `crdDirectory` entry to retrieve the `ScaleTargetRef`, which is then used to scale the target. |
76 |
| - 3. Evaluates triggers defined in the ElastiService: |
77 |
| - - If **any** trigger indicates that the service should be scaled up -> Scales to minTargetReplicas |
78 |
| - 4. Once scaled up, switches to **Serve Mode** |
79 |
| - |
80 |
| -- **When pods scale to 1** |
81 |
| - |
82 |
| -- **[Switch to Serve Mode]** |
83 |
| - 1. The Operator stops the informer/watch on the resolver. |
84 |
| - 2. The Operator deletes the `EndpointSlice` pointing to the resolver. |
85 |
| - 3. The system switches to **Serve Mode**. |
86 |
| -- **[In Serve Mode]** |
87 |
| - 1. Traffic hits the gateway, is routed to the target service, then to the target pod, and resolves the request. |
88 |
| - 2. The Operator periodically evaluates triggers defined in the ElastiService. |
89 |
| - 3. If **all** triggers indicate that the service is to be scaled down and cooldownPeriod has elapsed since last scale-up: |
90 |
| - - Scales down the target service to zero replicas |
91 |
| - - Switches to **Proxy Mode** |
92 |
| - |
93 |
| - |
94 |
| -## 3. Controller |
| 29 | +### Scale down to 0 when there are no requests |
| 30 | + |
| 31 | +If the Prometheus query returns a value less than the threshold, Elasti scales the service down to 0. Before doing so, it redirects incoming requests to the Elasti resolver and then modifies the Rollout/Deployment to have 0 replicas. It also pauses KEDA (if KEDA is in use) to prevent it from scaling the service back up, since KEDA is configured with minReplicas set to 1.
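
The ordering of the scale-down steps matters: traffic is redirected first so that no request reaches a service with no pods. A minimal sketch of that ordering follows, using a hypothetical `FakeCluster` recorder in place of real Kubernetes and KEDA calls; none of these names come from Elasti itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeCluster:
    """Hypothetical stand-in for the cluster; it only records the actions taken."""
    uses_keda: bool = False
    actions: List[str] = field(default_factory=list)

    def route_to_resolver(self, service: str) -> None:
        self.actions.append(f"route {service} to resolver")

    def set_replicas(self, service: str, replicas: int) -> None:
        self.actions.append(f"set {service} replicas={replicas}")

    def pause_keda(self, service: str) -> None:
        self.actions.append(f"pause keda for {service}")


def scale_to_zero(cluster: FakeCluster, service: str) -> None:
    # 1. Redirect requests to the resolver before any pod is removed.
    cluster.route_to_resolver(service)
    # 2. Set the Rollout/Deployment to 0 replicas.
    cluster.set_replicas(service, 0)
    # 3. Pause KEDA, if used, so it does not scale the service back up.
    if cluster.uses_keda:
        cluster.pause_keda(service)
```
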
95 | 32 |
|
96 | 33 | <div align="center">
|
97 |
| -<img src="./assets/lld-operator.png" width="1000px"> |
| 34 | +<img src="./assets/architecture/2.png" width="1000px"> |
98 | 35 | </div>
|
99 | 36 |
|
100 |
| -## 4. Resolver |
| 37 | +### Scale up from 0 when the first request arrives
| 38 | + |
| 39 | +Since the service is scaled down to 0, all requests hit the Elasti resolver. When the first request arrives, Elasti scales the service up to the configured minTargetReplicas. It then resumes KEDA so that autoscaling continues in case of a sudden burst of requests. Once a pod is up, it also switches the service back to point at the actual service pods. Requests that reached the Elasti resolver are retried for up to 5 minutes, and the response is sent back to the client. If the pod takes more than 5 minutes to come up, the request is dropped.
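
The retry-until-deadline behaviour for a queued request can be sketched as below. This is an illustration under assumptions: `forward_when_ready`, `is_ready`, and `forward` are hypothetical names, the 3-second retry interval is an assumed default, and only the 5-minute limit comes from the description above. The clock and sleep functions are injectable so the logic can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def forward_when_ready(
    is_ready: Callable[[], bool],
    forward: Callable[[], str],
    timeout_s: float = 300.0,        # 5 minutes, as described above
    retry_interval_s: float = 3.0,   # assumed value, for illustration only
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Forward the request once the target is ready, or return None on timeout."""
    deadline = now() + timeout_s
    while True:
        if is_ready():
            # Target pods are up: proxy the request and return the response.
            return forward()
        if now() >= deadline:
            # The pod did not come up in time: the request is dropped.
            return None
        sleep(retry_interval_s)
```

A `None` result corresponds to the dropped request; a real resolver would translate it into an error response for the client.
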
101 | 40 |
|
102 | 41 | <div align="center">
|
103 |
| -<img src="./assets/lld-resolver.png" width="800px"> |
| 42 | +<img src="./assets/architecture/3.png" width="1000px"> |
104 | 43 | </div>
|
105 | 44 |
|
106 |
| -## 5. Helm Values |
107 |
| - |
108 |
| -Values you can pass to elastiResolver env. |
109 |
| -```yaml |
110 |
| - |
111 |
| -# HeaderForHost is the header to look for to get the host. X-Envoy-Decorator-Operation is the key for istio |
112 |
| -headerForHost: X-Envoy-Decorator-Operation |
113 |
| -# InitialCapacity is the initial capacity of the semaphore |
114 |
| -initialCapacity: "500" |
115 |
| -maxIdleProxyConns: "100" |
116 |
| -maxIdleProxyConnsPerHost: "500" |
117 |
| -# MaxQueueConcurrency is the maximum number of concurrent requests |
118 |
| -maxQueueConcurrency: "100" |
119 |
| -# OperatorRetryDuration is the duration for which we don't inform the operator |
120 |
| -# about the traffic on the same host |
121 |
| -operatorRetryDuration: "10" |
122 |
| -# QueueRetryDuration is the duration after we retry the requests in queue |
123 |
| -queueRetryDuration: "3" |
124 |
| -# QueueSize is the size of the queue |
125 |
| -queueSize: "50000" |
126 |
| -# ReqTimeout is the timeout for each request |
127 |
| -reqTimeout: "120" |
128 |
| -# TrafficReEnableDuration is the duration for which the traffic is disabled for a host |
129 |
| -# This is also duration for which we don't recheck readiness of the service |
130 |
| -trafficReEnableDuration: "5" |
131 |
| -``` |
| 45 | + |
| 46 | +<div align="center"> |
| 47 | +<img src="./assets/architecture/4.png" width="1000px"> |
| 48 | +</div> |