
volume mount break Pod startup #3809

Closed
sebgoa opened this issue Apr 18, 2019 · 25 comments · Fixed by #9683
Labels
area/API: API objects and controllers
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@sebgoa
Contributor

sebgoa commented Apr 18, 2019

In what area(s)?

/area API

What version of Knative?

HEAD

Expected Behavior

Run the nginx container as a service by specifying a non-default port (i.e. port 80). I expected the service to run successfully and to be able to access the nginx welcome page.

Actual Behavior

The Pod did not start due to volume mount issues:

2019-04-18T16:41:52.643263735Z nginx: [alert] could not open error log file: open() "/var/log/nginx/error.log" failed (2: No such file or directory)
2019-04-18T16:41:52.649571891Z 2019/04/18 16:41:52 [emerg] 1#1: open() "/var/log/nginx/error.log" failed (2: No such file or directory)          

Pod manifest:

metadata:
    name: ffff-hcxmd-deployment-544bc9d587-5qhvs
    generateName: ffff-hcxmd-deployment-544bc9d587-
    namespace: sebgoa
    selfLink: /api/v1/namespaces/sebgoa/pods/ffff-hcxmd-deployment-544bc9d587-5qhvs
    labels:
        app: ffff-hcxmd
        pod-template-hash: 544bc9d587
        serving.knative.dev/configuration: ffff
        serving.knative.dev/configurationGeneration: '1'
        serving.knative.dev/configurationMetadataGeneration: '1'
        serving.knative.dev/revision: ffff-hcxmd
        serving.knative.dev/revisionUID: 0ad1904a-61f8-11e9-ac42-42010a800170
        serving.knative.dev/service: ffff
    annotations:
        sidecar.istio.io/inject: 'true'
        sidecar.istio.io/status: '{"version":"c3e1cae4ba6edc90052026dd7913ae40955b8500a82aae7245ab0d1059f37e54","initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-certs"],"imagePullSecrets":null}'
        traffic.sidecar.istio.io/includeOutboundIPRanges: '*'
    ownerReferences:
        -
            apiVersion: apps/v1
            kind: ReplicaSet
            name: ffff-hcxmd-deployment-544bc9d587
            uid: 0b2536b6-61f8-11e9-ac42-42010a800170
            controller: true
            blockOwnerDeletion: true
spec:
    volumes:
        -
            name: varlog
            emptyDir: {}
        -
            name: default-token-bbrzm
            secret: {secretName: default-token-bbrzm, defaultMode: 420}
        -
            name: istio-envoy
            emptyDir: {medium: Memory}
        -
            name: istio-certs
            secret: {secretName: istio.default, defaultMode: 420, optional: true}
    initContainers:
        -
            name: istio-init
            image: 'docker.io/istio/proxy_init:1.0.7'
            args: ['-p', '15001', '-u', '1337', '-m', REDIRECT, '-i', '*', '-x', "", '-b', '80,8012,8022,9090', '-d', ""]
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            imagePullPolicy: IfNotPresent
            securityContext: {capabilities: {add: [NET_ADMIN]}, privileged: true, procMount: Default}
    containers:
        -
            name: user-container
            image: 'index.docker.io/library/nginx@sha256:e71b1bf4281f25533cf15e6e5f9be4dac74d2328152edf7ecde23abc54e16c1c'
            ports: [{name: user-port, containerPort: 80, protocol: TCP}]
            env: [{name: PORT, value: '80'}, {name: K_REVISION, value: ffff-hcxmd}, {name: K_CONFIGURATION, value: ffff}, {name: K_SERVICE, value: ffff}]
            resources: {requests: {cpu: 400m}}
            volumeMounts: [{name: varlog, mountPath: /var/log}, {name: default-token-bbrzm, readOnly: true, mountPath: /var/run/secrets/kubernetes.io/serviceaccount}]
            lifecycle: {preStop: {httpGet: {path: /wait-for-drain, port: 8022, scheme: HTTP}}}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: FallbackToLogsOnError
            imagePullPolicy: IfNotPresent
        -
            name: queue-proxy
            image: 'gcr.io/triggermesh/queue-7204c16e44715cd30f78443fb99e0f58@sha256:96d64c803ce7b145b4250ec89aabe4acde384b1510c8e4e117b590bfd15eb7a3'
            ports: [{name: queue-port, containerPort: 8012, protocol: TCP}, {name: queueadm-port, containerPort: 8022, protocol: TCP}, {name: queue-metrics, containerPort: 9090, protocol: TCP}]
            env: [{name: SERVING_NAMESPACE, value: sebgoa}, {name: SERVING_SERVICE, value: ffff}, {name: SERVING_CONFIGURATION, value: ffff}, {name: SERVING_REVISION, value: ffff-hcxmd}, {name: SERVING_AUTOSCALER, value: autoscaler}, {name: SERVING_AUTOSCALER_PORT, value: '8080'}, {name: CONTAINER_CONCURRENCY, value: '0'}, {name: REVISION_TIMEOUT_SECONDS, value: '300'}, {name: SERVING_POD, valueFrom: {fieldRef: {apiVersion: v1, fieldPath: metadata.name}}}, {name: SERVING_POD_IP, valueFrom: {fieldRef: {apiVersion: v1, fieldPath: status.podIP}}}, {name: SERVING_LOGGING_CONFIG, value: "{\n  \"level\": \"info\",\n  \"development\": false,\n  \"outputPaths\": [\"stdout\"],\n  \"errorOutputPaths\": [\"stderr\"],\n  \"encoding\": \"json\",\n  \"encoderConfig\": {\n    \"timeKey\": \"ts\",\n    \"levelKey\": \"level\",\n    \"nameKey\": \"logger\",\n    \"callerKey\": \"caller\",\n    \"messageKey\": \"msg\",\n    \"stacktraceKey\": \"stacktrace\",\n    \"lineEnding\": \"\",\n    \"levelEncoder\": \"\",\n    \"timeEncoder\": \"iso8601\",\n    \"durationEncoder\": \"\",\n    \"callerEncoder\": \"\"\n  }\n}"}, {name: SERVING_LOGGING_LEVEL, value: info}, {name: SERVING_REQUEST_LOG_TEMPLATE}, {name: SERVING_REQUEST_METRICS_BACKEND}, {name: USER_PORT, value: '80'}, {name: SYSTEM_NAMESPACE, value: knative-serving}]
            resources: {requests: {cpu: 25m}}
            volumeMounts: [{name: default-token-bbrzm, readOnly: true, mountPath: /var/run/secrets/kubernetes.io/serviceaccount}]
            readinessProbe: {httpGet: {path: /health, port: 8022, scheme: HTTP}, timeoutSeconds: 10, periodSeconds: 1, successThreshold: 1, failureThreshold: 3}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            imagePullPolicy: IfNotPresent
        -
            name: istio-proxy
            image: 'docker.io/istio/proxyv2:1.0.7'
            args: [proxy, sidecar, '--configPath', /etc/istio/proxy, '--binaryPath', /usr/local/bin/envoy, '--serviceCluster', ffff-hcxmd, '--drainDuration', 45s, '--parentShutdownDuration', 1m0s, '--discoveryAddress', 'istio-pilot.istio-system:15007', '--discoveryRefreshDelay', 1s, '--zipkinAddress', 'zipkin.istio-system:9411', '--connectTimeout', 10s, '--proxyAdminPort', '15000', '--controlPlaneAuthPolicy', NONE]
            ports: [{name: http-envoy-prom, containerPort: 15090, protocol: TCP}]
            env: [{name: POD_NAME, valueFrom: {fieldRef: {apiVersion: v1, fieldPath: metadata.name}}}, {name: POD_NAMESPACE, valueFrom: {fieldRef: {apiVersion: v1, fieldPath: metadata.namespace}}}, {name: INSTANCE_IP, valueFrom: {fieldRef: {apiVersion: v1, fieldPath: status.podIP}}}, {name: ISTIO_META_POD_NAME, valueFrom: {fieldRef: {apiVersion: v1, fieldPath: metadata.name}}}, {name: ISTIO_META_INTERCEPTION_MODE, value: REDIRECT}, {name: ISTIO_METAJSON_ANNOTATIONS, value: "{\"sidecar.istio.io/inject\":\"true\",\"traffic.sidecar.istio.io/includeOutboundIPRanges\":\"*\"}\n"}, {name: ISTIO_METAJSON_LABELS, value: "{\"app\":\"ffff-hcxmd\",\"pod-template-hash\":\"544bc9d587\",\"serving.knative.dev/configuration\":\"ffff\",\"serving.knative.dev/configurationGeneration\":\"1\",\"serving.knative.dev/configurationMetadataGeneration\":\"1\",\"serving.knative.dev/revision\":\"ffff-hcxmd\",\"serving.knative.dev/revisionUID\":\"0ad1904a-61f8-11e9-ac42-42010a800170\",\"serving.knative.dev/service\":\"ffff\"}\n"}]
            resources: {requests: {cpu: 10m}}
            volumeMounts: [{name: istio-envoy, mountPath: /etc/istio/proxy}, {name: istio-certs, readOnly: true, mountPath: /etc/certs/}]
            lifecycle: {preStop: {exec: {command: [sh, '-c', 'sleep 20; while [ $(netstat -plunt | grep tcp | grep -v envoy | wc -l | xargs) -ne 0 ]; do sleep 1; done']}}}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            imagePullPolicy: IfNotPresent
            securityContext: {runAsUser: 1337, readOnlyRootFilesystem: true, procMount: Default}
    restartPolicy: Always
    terminationGracePeriodSeconds: 300
    dnsPolicy: ClusterFirst
    serviceAccountName: default
    serviceAccount: default
    securityContext: {}
    schedulerName: default-scheduler
    tolerations:
        -
            key: node.kubernetes.io/not-ready
            operator: Exists
            effect: NoExecute
            tolerationSeconds: 300
        -
            key: node.kubernetes.io/unreachable
            operator: Exists
            effect: NoExecute
            tolerationSeconds: 300
    priority: 0

Steps to Reproduce the Problem

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
    annotations:
        serving.knative.dev/creator: 'system:serviceaccount:sebgoa:default'
        serving.knative.dev/lastModifier: 'system:serviceaccount:sebgoa:default'
    name: fooport
spec:
    runLatest:
        configuration:
            revisionTemplate: {metadata: {creationTimestamp: null}, spec: {container: {image: nginx, name: "", ports: [{containerPort: 80}], resources: {requests: {cpu: 400m}}}, timeoutSeconds: 300}}

@sebgoa added the kind/bug label Apr 18, 2019
@TommyLike
Contributor

Knative will mount an empty log folder at the path /var/log; meanwhile, the default nginx config expects an nginx subfolder to exist inside it.

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/log/nginx/run.pid;

@dgerd

dgerd commented Apr 22, 2019

Knative needs to create a volume mounted at /var/log so that a fluentd sidecar can collect logs. This is going to wipe out anything that is pre-configured in this directory.

It is not clear to me that this is really a Knative bug; it looks more like an incompatibility between the container as packaged and the platform. It is possible to get the nginx container to run without rebuilding the image by simply updating Cmd/Args to do something like mkdir -p /var/log/nginx && nginx -g "daemon off";.
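
For illustration, a hedged sketch of that workaround in the v1alpha1 shape used in this issue (the Service name is a placeholder, not taken from the report above):

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
    name: nginx-workaround   # placeholder name
spec:
    runLatest:
        configuration:
            revisionTemplate:
                spec:
                    container:
                        image: nginx
                        ports: [{containerPort: 80}]
                        # Recreate the directory hidden by the /var/log emptyDir mount,
                        # then start nginx in the foreground.
                        command: ['/bin/sh', '-c']
                        args: ['mkdir -p /var/log/nginx && exec nginx -g "daemon off;"']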

If you are rebuilding the nginx image anyway to update other configuration, you can also change the logging directory to drop the nginx sub-directory, or update the default command.

I do see how it would be slick to run nginx on Knative with little to no arguments, but I am not sure that will be a common enough use-case outside of a demo to warrant adding additional configuration/flags around fluentd mount behavior.

@sebgoa
Contributor Author

sebgoa commented Apr 23, 2019

But there is no fluentd sidecar here.

fluentd usually runs as a cluster-wide DaemonSet and collects logs automatically, without having to run as a sidecar.

@dgerd

dgerd commented Apr 23, 2019

The fluentd sidecar is controlled by logging.enable-var-log-collection in config-observability and is defaulted to false.
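
For reference, that key lives in the config-observability ConfigMap in the knative-serving namespace; a minimal sketch showing the default (other data keys omitted):

apiVersion: v1
kind: ConfigMap
metadata:
    name: config-observability
    namespace: knative-serving
data:
    # "false" (the default) disables the fluentd sidecar and /var/log collection;
    # "true" enables them cluster-wide.
    logging.enable-var-log-collection: 'false'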

Running fluentd as a sidecar has the advantage of being able to process separate log streams from a single container, versus just stdout/stderr. Processing these log files is done through a shared mounted volume. There is a proposal to move our sidecar usage to a DaemonSet for both stdout/stderr logs and /var/log logs (#818); however, even that still requires a mounted volume in the user container.

I could see an argument for disabling the volume mount when varlog collection is disabled; however, that has the potential to break other applications that assume that /var/log already exists and is writable. Volume mounts seem like the best way Knative can ensure that /var/log exists and is writable as specified in our runtime contract. https://github.com/knative/serving/blob/master/docs/runtime-contract.md#default-filesystems
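
Concretely, these are the pieces of the generated Pod that implement that guarantee, extracted from the manifest pasted above:

volumes:
    -
        name: varlog
        emptyDir: {}
containers:
    -
        name: user-container
        volumeMounts: [{name: varlog, mountPath: /var/log}]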

@sebgoa
Contributor Author

sebgoa commented Apr 24, 2019

I commented in #818. I agree with the premise that sidecars are "resource heavy"; specifically, I think it will affect performance a lot (i.e. cold start).

While running nginx might sound a bit demo-ish, I believe that simple scenarios like this should work out of the box.

I don't understand this statement:

I could see an argument for disabling the volume mount when varlog collection is disabled; however, that has the potential to break other applications that assume that /var/log already exists and is writable.

To me the sidecar should definitely be a configurable option (like now, great), but when that option is off it should not touch the file system at all. It should be an opt-in, where users know the consequences and will prepare their images accordingly.

@dgerd

dgerd commented Apr 24, 2019

Our options here seem like:

  1. Always mount /var/log/ into the user container (Current Behavior)
  2. Mount /var/log/ into user container only when fluentd sidecar is enabled
  3. Never mount /var/log/ into user container

Option 1 has the advantages of a consistent user experience regardless of which log collection method is used and getting an application to work in Knative is done upfront. This makes it portable across Knative installations regardless of configuration. The downside to option 1 is what we are experiencing here. Applications that expect files or subdirectories to exist in /var/log at startup will not work out of the box.

Option 2 has the advantage that containers like nginx will launch on Knative out of the box, but they will break if the volume mount is enabled. As log collection is currently a cluster-wide setting, this is a bit concerning. The disadvantage is that portability across Knative installs is reduced, since cluster settings are unlikely to be the same.

Option 3 has the advantage of a consistent user experience, and containers like nginx will run out of the box. However, it removes functionality that already exists and makes it difficult to run containers that output multiple log files (i.e. request, application, error, etc.). Also, containers that try to write to /var/log/{filename} but did not explicitly create the directory path during image build will fail. I am not sure how many containers fall into this category, since most distro base images (i.e. alpine, ubuntu, debian, etc.) create this path for you.

I think this might be a good topic to bring up in the API working group to see if others have any additional input or ideas here.

@cometta

cometta commented May 19, 2019

Hello @sebgoa, may I know whether securityContext: privileged: true is valid syntax?

@sebgoa
Contributor Author

sebgoa commented May 20, 2019

Yes, because now a Container object can be used in a Service template.

See: https://github.com/knative/serving/blob/master/docs/spec/spec.md#service

@cometta

cometta commented May 20, 2019

Oh, I was able to run apply in 0.5.2 but not in 0.6.0. It throws the error below:

Error from server (InternalError): error when creating "service.yaml": Internal error occurred: admission webhook "webhook.serving.knative.dev" denied the request: mutation failed: must not set the field(s): spec.runLatest.configuration.revisionTemplate.spec.container.securityContext.privileged

May I know which version you tested privileged mode on?

@greghaynes
Contributor

This seems like a duplicate of #2142

@JRBANCEL
Contributor

JRBANCEL commented Jun 6, 2019

Now that there is no fluentd sidecar (#4156), we could also get rid of the emptyDir.
@evankanderson & @dgerd mentioned that we should be able to read directly from the container content.

If there is no volume mounted to /var/log, its content is located at /var/lib/docker/overlay2/<ID>/merged/var/log. We could have a postStart hook create the following symlink (from within the container) /knative-var-log/<NAMESPACE>_<POD>_<CONTAINER_NAME> -> var/log. That's enough for fluentd to capture logs on the host at /var/lib/docker/overlay2/*/merged/knative-var-log/**/*.

This way, we wouldn't overwrite anything inside /var/log.
The problem with this solution is that it assumes OverlayFS is used. That's not portable to other CRIs/drivers.
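
For concreteness, a hedged sketch of what such a postStart hook might look like on the user container, assuming NAMESPACE, POD_NAME and CONTAINER_NAME were injected via the downward API (all names here are illustrative, not an actual implementation):

lifecycle:
    postStart:
        exec:
            command:
                - /bin/sh
                - '-c'
                # A relative symlink target keeps the link resolvable when a host-level
                # fluentd walks the overlay's merged directory.
                - 'mkdir -p /knative-var-log && ln -s ../var/log "/knative-var-log/${NAMESPACE}_${POD_NAME}_${CONTAINER_NAME}"'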

The better solution would be to have k8s support mounting emptyDir without erasing the original content.

@dgerd

dgerd commented Jun 12, 2019

While I do like that the solution gives the best runtime behavior of not overwriting /var/log while still enabling log collection from that location, the tie to overlayFS and Docker is terrifying and seems brittle. It will also leave many Knative installs without a working solution.

Until we have a solution like the above that is container-runtime agnostic, I would rather we do something else to enable this. I propose the following:

Allow users to add a label, serving.knative.dev/varLog=disable, to their Knative Service or Knative Configuration. When the label is present, it removes the /var/log/ mount from the Pod deployment and disables log collection from /var/log. This gives users control on a per-Service basis, which removes the blast-radius problem of putting this in a cluster-wide ConfigMap, and enables users to run containers such as nginx without having to customize CMD/ARGS or rebuild the container.
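
A hedged sketch of how the proposed opt-out might look from the user's side (this label is only a proposal in this thread, not an implemented API; the Service body mirrors the reproduction above):

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
    name: nginx-no-varlog   # placeholder name
    labels:
        # Proposed opt-out: drop the /var/log emptyDir mount for this Service.
        serving.knative.dev/varLog: disable
spec:
    runLatest:
        configuration:
            revisionTemplate:
                spec:
                    container: {image: nginx, ports: [{containerPort: 80}]}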

Precedent: This is similar to our cluster-local visibility label.

FAQ:

Q. Why a label instead of adding it into the spec?
A. This is not something that we expect to live within the API forever and is not something that we expect to be a problem on all implementations of Knative.

Q. What happens when we get support for mounting an emptyDir without erasing original content?
A. The flag becomes a no-op on the Knative Service.

Q. Why an opt-out over an opt-in?
A. Collection of /var/log by default is the direction we want to go as it enables more advanced logging options than stdout/stderr by itself. Requiring users to opt-in adds friction to adopting this more flexible logging pattern and can silently result in lost logs.

Q. Can we name the label key or value something else?
A. Yes. Sure. Naming is hard and we can bikeshed on that.

@sebgoa @greghaynes @JRBANCEL Thoughts?

@mattmoor added this to the Ice Box milestone Jun 26, 2019
@mattmoor added the area/API label Jun 26, 2019
@knative-housekeeping-robot

Issues go stale after 90 days of inactivity.
Mark the issue as fresh by adding the comment /remove-lifecycle stale.
Stale issues rot after an additional 30 days of inactivity and eventually close.
If this issue is safe to close now please do so by adding the comment /close.

Send feedback to Knative Productivity Slack channel or file an issue in knative/test-infra.

/lifecycle stale

@knative-prow-robot added the lifecycle/stale label Dec 23, 2019
@knative-housekeeping-robot

Stale issues rot after 30 days of inactivity.
Mark the issue as fresh by adding the comment /remove-lifecycle rotten.
Rotten issues close after an additional 30 days of inactivity.
If this issue is safe to close now please do so by adding the comment /close.

Send feedback to Knative Productivity Slack channel or file an issue in knative/test-infra.

/lifecycle rotten

@knative-prow-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 22, 2020
@knative-housekeeping-robot

Rotten issues close after 30 days of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh by adding the comment /remove-lifecycle rotten.

Send feedback to Knative Productivity Slack channel or file an issue in knative/test-infra.

/close

@knative-prow-robot
Contributor

@knative-housekeeping-robot: Closing this issue.

In response to this:

Rotten issues close after 30 days of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh by adding the comment /remove-lifecycle rotten.

Send feedback to Knative Productivity Slack channel or file an issue in knative/test-infra.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

mreferre added a commit to mreferre/nginx-custom-site that referenced this issue Feb 29, 2020
@ReggieCarey

/reopen
/remove-lifecycle rotten

It would be extremely beneficial to be able to disable /var/log overwriting. This problem appears to break any app using nginx to front a service. There are tons of them out there.

Sidecars are fine. But DO NOT modify my containers. This steps outside of Knative area of concern. Knative should serve up an untouched container.

Logging is far down the list of importance if the application cannot be deployed at all.

This is a pretty urgent issue and it needs to be addressed quickly. I'm new to the Knative community but this is just broken. Knative overwriting any directory in a container is a violation of trust that container integrity will not be violated by the infrastructure.

@knative-prow-robot
Contributor

@ReggieCarey: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen
/remove-lifecycle rotten

It would be extremely beneficial to be able to disable /var/log overwriting. This problem appears to break any app using nginx to front a service. There are tons of them out there.

Sidecars are fine. But DO NOT modify my containers. This steps outside of Knative area of concern. Knative should serve up an untouched container.

Logging is far down the list of importance if the application cannot be deployed at all.

This is a pretty urgent issue and it needs to be addressed quickly. I'm new to the Knative community but this is just broken. Knative overwriting any directory in a container is a violation of trust that container integrity will not be violated by the infrastructure.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot removed the lifecycle/rotten label Jun 4, 2020
@JRBANCEL
Contributor

JRBANCEL commented Jun 4, 2020

/reopen
/remove-lifecycle rotten

There were discussions about removing this feature since as you mentioned it really is out of scope of Knative.

/cc @dprotaso

@knative-prow-robot
Contributor

@JRBANCEL: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle rotten

There were discussions about removing this feature since as you mentioned it really is out of scope of Knative.

/cc @dprotaso

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Morriz

Morriz commented Sep 1, 2020

What is the status here? We are demoing our platform, which relies heavily on Knative, and all of our customers stumble into this problem, making our statement that "it just works by deploying a Knative service" look foolish.

How can we be sure that containers are left untouched? This is unworkable.

@mattmoor
Member

mattmoor commented Sep 2, 2020

/lifecycle frozen

@knative-prow-robot added the lifecycle/frozen label Sep 2, 2020
@dprotaso
Member

@Morriz we'll phase out the behaviour that's breaking these containers -
#7881 (comment)

tl;dr we will put it behind an operator feature flag and change the default after enough time and communication

@dprotaso
Member

/assign @dprotaso
/milestone 0.18.x

@knative-prow-robot
Contributor

@dprotaso: You must be a member of the knative/knative-milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.

In response to this:

/assign @dprotaso
/milestone 0.18.x

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
