Allow scheduling rules for operator pod (#328)
This adds standard scheduling rules for the operator pod. The following
options are now available through the helm chart to control where the
operator pod is scheduled:
- nodeSelector
- affinity
- tolerations
- priorityClassName

Deployments made through OLM will not be able to use any of these options.
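
For example, a helm install along these lines (the release name, chart reference, and all label/taint values are hypothetical) would pin the operator pod to labeled nodes, give it a priority class, and let it tolerate a tainted node pool:

# Sketch only: release name, chart reference, and values are illustrative.
helm install vdb-op vertica-charts/verticadb-operator \
  --set nodeSelector.region=us-east \
  --set priorityClassName=high-priority \
  --set 'tolerations[0].key=dedicated' \
  --set 'tolerations[0].operator=Exists' \
  --set 'tolerations[0].effect=NoSchedule'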
spilchen authored Jan 30, 2023
1 parent 72cb9c5 commit 3d01530
Showing 21 changed files with 465 additions and 13 deletions.
6 changes: 5 additions & 1 deletion Makefile
@@ -159,6 +159,10 @@ E2E_ADDITIONAL_ARGS?=
# already. When deploying with random, it will randomly pick between olm and helm.
DEPLOY_WITH?=helm
export DEPLOY_WITH
# Clear this variable if you don't want to wait for the helm deployment to
# finish before returning control. This exists to allow tests to attempt a
# deployment that is expected to fail.
DEPLOY_WAIT?=--wait
# Name of the test OLM catalog that we will create and deploy with in e2e tests
OLM_TEST_CATALOG_SOURCE=e2e-test-catalog

@@ -457,7 +461,7 @@ uninstall: manifests kustomize ## Uninstall CRDs from the K8s cluster specified
# If this secret does not exist then it is simply ignored.
deploy-operator: manifests kustomize ## Using helm or olm, deploy the operator in the K8s cluster
ifeq ($(DEPLOY_WITH), helm)
	helm install --wait -n $(NAMESPACE) $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set logging.dev=${DEV_MODE} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred $(HELM_OVERRIDES)
	helm install $(DEPLOY_WAIT) -n $(NAMESPACE) $(HELM_RELEASE_NAME) $(OPERATOR_CHART) --set image.repo=null --set image.name=${OPERATOR_IMG} --set logging.dev=${DEV_MODE} --set image.pullPolicy=$(HELM_IMAGE_PULL_POLICY) --set imagePullSecrets[0].name=priv-reg-cred $(HELM_OVERRIDES)
	scripts/wait-for-webhook.sh -n $(NAMESPACE) -t 60
else ifeq ($(DEPLOY_WITH), olm)
	scripts/deploy-olm.sh -n $(NAMESPACE) $(OLM_TEST_CATALOG_SOURCE)
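
As a sketch of how a test might use the new DEPLOY_WAIT variable (target and variable names as in the Makefile above), clearing it lets helm return immediately so a later step can assert the expected failure:

# Deploy without blocking on readiness; NAMESPACE is assumed to be set as usual.
make deploy-operator DEPLOY_WITH=helm DEPLOY_WAIT=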
5 changes: 5 additions & 0 deletions changes/unreleased/Added-20230130-094333.yaml
@@ -0,0 +1,5 @@
kind: Added
body: Allow scheduling rules for operator pod
time: 2023-01-30T09:43:33.307685455-04:00
custom:
  Issue: "328"
28 changes: 16 additions & 12 deletions helm-charts/verticadb-operator/README.md
@@ -2,27 +2,31 @@ This helm chart will install the operator and an admission controller webhook.

| Parameter Name | Description | Default Value |
|----------------|-------------|---------------|
| affinity | The [affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity) parameter allows you to constrain the operator pod only to specific nodes. If this parameter is not set, then no affinity setting is used with the operator pod. | Not set |
| image.name | The name of image that runs the operator. | vertica/verticadb-operator:1.10.0 |
| image.repo | Repo server hosting image.name | docker.io |
| image.pullPolicy | The pull policy for the image that runs the operator | IfNotPresent |
| imagePullSecrets | List of Secret names containing login credentials for above repos | null (pull images anonymously) |
| logging.filePath | The path to the log file. If omitted, all logging will be written to stdout. | |
| logging.maxFileSize | The maximum size, in MB, of the logging file before log rotation occurs. This is only applicable if logging to a file. | 500 |
| logging.maxFileAge | The maximum number of days to retain old log files based on the timestamp encoded in the file. This is only applicable if logging to a file. | |
| logging.maxFileRotation | The maximum number of files that are kept in rotation before the old ones are removed. This is only applicable if logging to a file. | 3 |
| logging.level | The minimum logging level. Valid values are: debug, info, warn, and error | info |
| logging.dev | Enables development mode if true and production mode otherwise. | false |
| nameOverride | Setting this allows you to control the prefix of all of the objects created by the helm chart. If this is left blank, we use the name of the chart as the prefix | |
| nodeSelector | The [node selector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector) provides control over which nodes are used to schedule a pod. If this parameter is not set, the node selector is omitted from the pod that is created by the operator's Deployment object. To set this parameter, provide a map of key/value pairs. | Not set |
| priorityClassName | The [priority class name](https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass) that is assigned to the operator pod. This affects where the pod gets scheduled. | Not set |
| prometheus.createProxyRBAC | Set this to false if you want to avoid creating the rbac rules for accessing the metrics endpoint when it is protected by the rbac auth proxy. By default, we will create those RBAC rules. | true |
| prometheus.createServiceMonitor | Set this to true if you want to create a ServiceMonitor. This object is a CR provided by the prometheus operator to allow for easy service discovery. If set to true, the prometheus operator must be installed before installing this chart.<br> See: https://github.com/prometheus-operator/prometheus-operator<br><br>*This parameter is deprecated and will be removed in a future release.* | false |
| prometheus.expose | Controls exposing of the prometheus metrics endpoint. Valid options are:<br><br>- **EnableWithAuthProxy**: A new service object will be created that exposes the metrics endpoint. Access to the metrics are controlled by rbac rules using the proxy (see https://github.com/brancz/kube-rbac-proxy). The metrics endpoint will use the https scheme.<br><br>- **EnableWithoutAuth**: Like EnableWithAuthProxy, this will create a service object to expose the metrics endpoint. However, there is no authority checking when using the endpoint. Anyone who has network access to the endpoint (i.e. any pod in k8s) will be able to read the metrics. The metrics endpoint will use the http scheme.<br><br>- **Disable**: Prometheus metrics are not exposed at all. | EnableWithAuthProxy |
| prometheus.tlsSecret | Use this if you want to provide your own certs for the prometheus metrics endpoint. It refers to a secret in the same namespace that the helm chart is deployed in. The secret must have the following keys set:<br><br>- **tls.key** – private key<br>- **tls.crt** – cert for the private key<br>- **ca.crt** – CA certificate<br><br>prometheus.expose must be set to EnableWithAuthProxy for the operator to use the certs provided. If this field is omitted, the RBAC proxy sidecar will generate its own self-signed cert. | "" |
| rbac_proxy_image.name | Image name of Kubernetes RBAC proxy. | kubebuilder/kube-rbac-proxy:v0.13.0 |
| rbac_proxy_image.repo | Repo server hosting rbac_proxy_image.name | gcr.io |
| resources.\* | The resource requirements for the operator pod. | <pre>limits:<br> cpu: 100m<br> memory: 750Mi<br>requests:<br> cpu: 100m<br> memory: 20Mi</pre> |
| serviceAccountNameOverride | If set, this will be the name of an existing service account that will be used to run any of the pods related to this operator. This includes the pod for the operator itself, as well as any pods created for our custom resource. If unset, we will use the default service account name. | |
| skipRoleAndRoleBindingCreation | Set this to true to force the helm chart to skip creation of any Roles and RoleBindings. This can only be used when the ServiceAccount already exists, so it expects serviceAccountNameOverride to have been used. <br><br> Use this option if you are installing the helm chart with k8s privileges that prevent you from creating Roles/RoleBindings. We provide the Roles and RoleBindings that the operator needs as an artifact of the GitHub release (see https://github.com/vertica/vertica-kubernetes/releases). | false |
| tolerations | Any [tolerations and taints](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) used to influence where a pod is scheduled. This parameter is provided as a list. | Not set |
| webhook.caBundle | A PEM encoded CA bundle that will be used to validate the webhook's server certificate. This option is deprecated in favour of providing the CA bundle in the webhook.tlsSecret with the ca.crt key. This option will be removed in a future release.| |
| webhook.certSource | The webhook requires a TLS certificate to work. This parameter defines how the cert is supplied. Valid values are:<br><br>- **internal**: The certs are generated internally by the operator prior to starting the managing controller. The generated cert is self-signed. When it expires, the operator pod will need to be restarted in order to generate a new certificate. This is the default.<br><br>- **cert-manager**: The certs are generated using the cert-manager operator. This operator needs to be deployed before deploying the operator. Deployment of this chart will create a self-signed cert through cert-manager. The advantage of this over 'internal' is that cert-manager will automatically handle private key rotation when the certificate is about to expire.<br><br>- **secret**: The certs are created prior to installation of this chart and are provided to the operator through a secret. This option gives you the most flexibility as it is entirely up to you how the cert is created. This option requires the webhook.tlsSecret option to be set. For backwards compatibility, if webhook.tlsSecret is set, it is implicit that this mode is selected. | internal |
| webhook.tlsSecret | The webhook requires a TLS certificate to work. By default we create a cert internally. If you want full control over the cert that is created you can use this parameter to provide it. When set, it is the name of a secret in the same namespace the chart is being installed in. The secret must have the keys: tls.key and tls.crt. It can also include the key ca.crt. When that key is included, the operator will patch it into the CA bundle in the webhook configuration. | |
| webhook.enable | If true, the webhook will be enabled and its configuration is set up by the helm chart. Setting this to false will disable the webhook. The webhook setup needs privileges to add validatingwebhookconfiguration and mutatingwebhookconfiguration, both of which are cluster scoped. If you do not have the necessary privileges to add these configurations, then this option can be used to skip that and still deploy the operator. | true |
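
Putting the new scheduling parameters together, a values file like the sketch below (the label key, class name, taint key, and zone are all hypothetical) can be passed to helm with -f at install time:

# Sketch: write a hypothetical overrides file, then install the chart with it.
cat > operator-scheduling.yaml << 'YAML'
nodeSelector:
  node-role/infra: "true"
priorityClassName: infra-critical
tolerations:
  - key: "infra-only"
    operator: "Exists"
    effect: "NoSchedule"
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - us-east-1a
YAML
helm install vdb-op vertica-charts/verticadb-operator -f operator-scheduling.yaml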
61 changes: 61 additions & 0 deletions helm-charts/verticadb-operator/tests/pod-schedule_test.yaml
@@ -0,0 +1,61 @@
suite: tests that we can control where the operator is scheduled
templates:
  - verticadb-operator-controller-manager-deployment.yaml
tests:
  - it: we can specify a node selector
    set:
      nodeSelector:
        region: us-east
        usage: operator
    asserts:
      - equal:
          path: spec.template.spec.nodeSelector
          value:
            region: us-east
            usage: operator
  - it: we can specify affinity and anti-affinity rules
    set:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/name
                    operator: In
                    values:
                      - vertica
              topologyKey: "kubernetes.io/hostname"
    asserts:
      - equal:
          path: spec.template.spec.affinity
          value:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: app.kubernetes.io/name
                        operator: In
                        values:
                          - vertica
                  topologyKey: "kubernetes.io/hostname"
  - it: we can specify a priorityClassName
    set:
      priorityClassName: pri1
    asserts:
      - equal:
          path: spec.template.spec.priorityClassName
          value: pri1
  - it: we can specify a toleration
    set:
      tolerations:
        - key: "example-key"
          operator: "Exists"
          effect: "NoSchedule"
    asserts:
      - equal:
          path: spec.template.spec.tolerations[0]
          value:
            key: "example-key"
            operator: "Exists"
            effect: "NoSchedule"

27 changes: 27 additions & 0 deletions helm-charts/verticadb-operator/values.yaml
@@ -129,6 +129,33 @@ serviceAccountNameOverride: ""
# https://github.com/vertica/vertica-kubernetes/releases).
skipRoleAndRoleBindingCreation: false

# Add node selector labels to control where the operator pod is scheduled.
# If left blank then no selectors are added.
# See: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
# key: value
nodeSelector: {}

# Add any affinity or anti-affinity to the pod to control where it gets scheduled.
# See: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#inter-pod-affinity-and-anti-affinity
# podAffinity:
#   requiredDuringSchedulingIgnoredDuringExecution:
#   - labelSelector:
#       matchExpressions:
#       - key: security
#         operator: In
#         values:
#         - S1
#     topologyKey: topology.kubernetes.io/zone
affinity: {}

# PriorityClassName given to the operator pod.
# See: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
priorityClassName: ""

# Taints and tolerations.
# See: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
tolerations: []

prometheus:
# Controls exposing of the prometheus metrics endpoint. Valid options are:
#
19 changes: 19 additions & 0 deletions scripts/template-helm-chart.sh
@@ -183,3 +183,22 @@ do
perl -i -0777 -pe 's/(volumes:)/$1\n{{- if not (empty .Values.prometheus.tlsSecret) }}\n - name: auth-cert\n secret:\n secretName: {{ .Values.prometheus.tlsSecret }}\n{{- end }}/g' $f
perl -i -0777 -pe 's/(name: kube-rbac-proxy)/$1\n{{- if not (empty .Values.prometheus.tlsSecret) }}\n volumeMounts:\n - mountPath: \/cert\n name: auth-cert\n{{- end }}/g' $f
done

# 18. Add pod scheduling options
cat << EOF >> $TEMPLATE_DIR/verticadb-operator-controller-manager-deployment.yaml
{{- if .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml .Values.nodeSelector | nindent 8 }}
{{- end }}
{{- if .Values.affinity }}
      affinity:
        {{- toYaml .Values.affinity | nindent 8 }}
{{- end }}
{{- if .Values.priorityClassName }}
      priorityClassName: {{ .Values.priorityClassName }}
{{- end }}
{{- if .Values.tolerations }}
      tolerations:
        {{- toYaml .Values.tolerations | nindent 8 }}
{{- end }}
EOF
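
A quick way to verify what the appended block renders to (the release name and override value are hypothetical) is to template the chart locally and inspect the scheduling fields:

# Render the chart and show the nodeSelector that the new template emits.
helm template vdb-op helm-charts/verticadb-operator \
  --set nodeSelector.region=us-east | grep -A 2 'nodeSelector:'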
18 changes: 18 additions & 0 deletions tests/e2e-leg-3/operator-pod-scheduling/05-assert.yaml
@@ -0,0 +1,18 @@
# (c) Copyright [2021-2022] Micro Focus or one of its affiliates.
# Licensed under the Apache License, Version 2.0 (the "License");
# You may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: pri-operator-pod-scheduling
value: 1000000
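
Presumably the surrounding test steps deploy the operator with this class through the new chart parameter, along the lines of:

# Sketch: tie the operator pod to the PriorityClass created above.
helm install vdb-op helm-charts/verticadb-operator \
  --set priorityClassName=pri-operator-pod-scheduling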
