Intermittent issues with --tag #383

Closed
duglin opened this issue Aug 22, 2019 · 14 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@duglin
Contributor

duglin commented Aug 22, 2019

In what area(s)?

What version of Knative Client?

master

What version of Knative Serving running on your cluster?

0.8.x

Issue

Maybe 10% of the time when I run this command I see this:

$ kn service update echo --revision-name=echo-v3 --tag echo-v3=test --env MSG="TEST TEST TEST"
Waiting for service 'echo' to become ready ... 
RevisionMissing: Revision "echo-v3" referenced in traffic not found.

What's odd is that it only fails some of the time, so there must be a timing issue involved here.

Will try to do more debugging but wanted to let @navidshaikh know that something's up...

@duglin duglin added the kind/bug Categorizes issue or PR as related to a bug. label Aug 22, 2019
@navidshaikh
Collaborator

I don't think this is something we control; it seems revision echo-v3 wasn't available yet when Serving parsed the traffic spec.

Also, as you mentioned, this is not easy to reproduce.

➜ kn service create echo --image gcr.io/knative-samples/helloworld-go
Service 'echo' successfully created in namespace 'default'.
Waiting for service 'echo' to become ready ... OK

Service URL:
http://echo.default.apps-crc.testing

➜ kn service update echo --revision-name=echo-v3 --tag echo-v3=test --env MSG="TEST TEST TEST"
Waiting for service 'echo' to become ready ... OK
Service 'echo' updated in namespace 'default'.

➜ kn revision list
NAME           SERVICE   GENERATION   AGE   CONDITIONS   READY   REASON
echo-v3        echo      2            23s   4 OK / 4     True    
echo-mbszk-1   echo      1            75s   4 OK / 4     True    

➜ kn service list
NAME   URL                                    GENERATION   AGE   CONDITIONS   READY   REASON
echo   http://echo.default.apps-crc.testing   2            82s   3 OK / 3     True   

In kn, when the user ran
$ kn service update echo --revision-name=echo-v3 --tag echo-v3=test --env MSG="TEST TEST TEST"
we generated a spec similar to what the user could have written in YAML and posted it to Serving. I wonder whether this is something that should be looked at on the Serving side.

@duglin
Contributor Author

duglin commented Aug 24, 2019

I haven't had a chance to look at the code yet. Do you make two requests to the server or just one? I think if you did a request then it would always work.

@navidshaikh
Collaborator

@duglin : We do only one request.

@sixolet
Contributor

sixolet commented Aug 25, 2019 via email

@duglin
Contributor Author

duglin commented Aug 25, 2019

I created this script to reproduce it w/o kn so I could open an issue on serving:

#!/bin/bash
set -ex

kubectl delete ksvc/echo || true
sleep 5

kubectl apply -f - <<EOF
apiVersion: serving.knative.dev/v1beta1
kind: Service
metadata:
  name: echo
spec:
  template:
    metadata:
      name: echo-v1
    spec:
      containers:
      - image: duglin/echo
  traffic:
  - revisionName: echo-v1
    percent: 100
EOF

sleep 5

while true ; do

val=$RANDOM

kubectl apply -f - <<EOF
apiVersion: serving.knative.dev/v1beta1
kind: Service
metadata:
  name: echo
spec:
  template:
    metadata:
      name: echo-v$val
    spec:
      containers:
      - image: duglin/echo
        env:
        - name: MSG
          value: val$RANDOM
  traffic:
  - revisionName: echo-v1
    percent: 90
  - revisionName: echo-v$val
    tag: test
    percent: 10
EOF

if kubectl get ksvc/echo -o yaml | grep "not found" ; then
  exit 1
fi

sleep 10

done

However, I noticed that after I got the error, the server eventually did end up in a good state. This makes me wonder whether this is really a serving issue or a kn issue. Meaning, should kn do a better job of detecting this intermediate error state and then wait a bit more for things to settle down before it decides to return an error to the user?

Of course, even if that's the case, it does make it kind of hard for any client to know when the error it's seeing is "real" vs "temporary".

@mattmoor what do you think? Should this error ever be in the conditions section at all?
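
The idea of waiting a bit longer before reporting an error could be sketched roughly as below. This is only an illustration, not how kn behaves: `wait_until_settled`, `ready_of_echo`, the "<status> <reason>" output contract, and the grace/interval/max-poll defaults are all assumptions made up for this example. The loop fails only after several consecutive "False" readings of the Ready condition, so a short-lived RevisionMissing does not surface as an error.

```shell
#!/bin/bash
# Sketch only: poll a Ready condition and tolerate a short run of "False"
# results before reporting failure. Names and thresholds are illustrative
# assumptions, not kn's actual behavior.

# Hypothetical fetcher: prints "<status> <reason>" for the Ready condition
# of ksvc/echo. It is only defined here; nothing runs until it is called.
ready_of_echo() {
  kubectl get ksvc/echo -o \
    jsonpath='{.status.conditions[?(@.type=="Ready")].status} {.status.conditions[?(@.type=="Ready")].reason}'
}

# wait_until_settled FETCH_CMD [GRACE] [INTERVAL] [MAX_POLLS]
wait_until_settled() {
  local fetch_cmd=$1
  local grace=${2:-5}        # consecutive "False" polls tolerated
  local interval=${3:-2}     # seconds between polls
  local max_polls=${4:-60}   # overall upper bound on polls
  local polls=0 failures=0 line status reason

  while [ "$polls" -lt "$max_polls" ]; do
    polls=$((polls + 1))
    line=$("$fetch_cmd") || line=""
    status=${line%% *}          # first word
    reason=${line#"$status"}    # remainder of the line
    reason=${reason# }

    case "$status" in
      True)
        echo "ready"
        return 0
        ;;
      False)
        failures=$((failures + 1))
        if [ "$failures" -ge "$grace" ]; then
          echo "failed: $reason"
          return 1
        fi
        ;;
      *)
        # Unknown or empty: still reconciling, so reset the failure streak.
        failures=0
        ;;
    esac
    sleep "$interval"
  done

  echo "timed out"
  return 1
}
```

Against a real cluster this might be invoked as `wait_until_settled ready_of_echo 5 2`. It does not answer the question of how a client tells a "real" error from a "temporary" one; it only bounds how long a possibly temporary one is tolerated.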

@sixolet
Contributor

sixolet commented Aug 25, 2019 via email

@duglin
Contributor Author

duglin commented Aug 26, 2019

Just FYI so we have it, when the error happens the status section looks like this:

status:
  address:
    url: http://echo.default.svc.cluster.local
  conditions:
  - lastTransitionTime: "2019-08-25T23:45:25Z"
    message: The Configuration is still working to reflect the latest desired specification.
    reason: OutOfDate
    status: Unknown
    type: ConfigurationsReady
  - lastTransitionTime: "2019-08-25T23:45:25Z"
    message: Revision "echo-v392" referenced in traffic not found.
    reason: RevisionMissing
    status: "False"
    type: Ready
  - lastTransitionTime: "2019-08-25T23:45:25Z"
    message: Revision "echo-v392" referenced in traffic not found.
    reason: RevisionMissing
    status: "False"
    type: RoutesReady
  latestCreatedRevisionName: echo-v16289
  latestReadyRevisionName: echo-v16289
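
In the status above, Ready is "False" while ConfigurationsReady is still "Unknown" (OutOfDate). A client could use that combination as a hint that the failure may be temporary. The sketch below is a heuristic built on that assumption, not documented Knative behavior; `classify_conditions` and its "<type> <status> <reason>" input format are invented for illustration.

```shell
#!/bin/bash
# Heuristic sketch (an assumption, not Knative-defined behavior):
# reads "<type> <status> <reason>" lines on stdin and prints one of
# ready | transient | failed | pending.
classify_conditions() {
  local ctype cstatus creason ready="" config=""
  while read -r ctype cstatus creason; do
    case "$ctype" in
      Ready) ready=$cstatus ;;
      ConfigurationsReady) config=$cstatus ;;
    esac
  done

  if [ "$ready" = "True" ]; then
    echo "ready"
  elif [ "$ready" = "False" ] && [ "$config" = "Unknown" ]; then
    # The Configuration is still reconciling, so the failure may clear.
    echo "transient"
  elif [ "$ready" = "False" ]; then
    echo "failed"
  else
    echo "pending"
  fi
}
```

The input lines could be produced with something like `kubectl get ksvc/echo -o jsonpath='{range .status.conditions[*]}{.type} {.status} {.reason}{"\n"}{end}'` and piped into `classify_conditions`.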

@duglin duglin changed the title from "Intermittend issues with --tag" to "Intermittent issues with --tag" on Aug 27, 2019
@duglin
Contributor Author

duglin commented Sep 1, 2019

@sixolet can you point me to where it says that in the RTC? I can't seem to find it.

@mattmoor @dgerd is this a server-side issue, or should the CLI be checking something else to know when things have settled down?

@mattmoor
Member

mattmoor commented Sep 1, 2019

knative/serving#4173 ?

@duglin
Contributor Author

duglin commented Sep 3, 2019

Sure sounds like it.

@duglin
Contributor Author

duglin commented Sep 16, 2019

knative/serving#5547 is in now - will retest soon

@dgerd

dgerd commented Sep 16, 2019

@duglin Let me know if you have any further problems here.

@navidshaikh
Collaborator

/close

@duglin I tested using the script you provided above: 10+ updates to the service, and it worked fine. Closing the issue; please re-open if you observe otherwise.
Thanks @dgerd for the fix!

@knative-prow-robot
Contributor

@navidshaikh: Closing this issue.


Kaustubh-pande pushed a commit to Kaustubh-pande/client that referenced this issue Jul 15, 2024

6 participants