[codelab-intro-mnist-feedback]:build #609

Closed
wanglj-lanzer opened this issue Jul 25, 2019 · 31 comments

Comments

@wanglj-lanzer

kustomize build .

Error: no matches for OriginalId kubeflow.org_v1beta1_TFJob|~X|$(trainingName); no matches for CurrentId kubeflow.org_v1beta1_TFJob|~X|$(trainingName); failed to find unique target for patch kubeflow.org_v1beta1_TFJob|$(trainingName)

@issue-label-bot

Issue-Label Bot is automatically applying the label kind/bug to this issue, with a confidence of 0.83. Please mark this comment with 👍 or 👎 to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

@miguelvcsoares

I encountered the same issue. What can we do to solve it?

@jinchihe
Member

@wanglj-lanzer Which version of kustomize did you use? We hit some problems due to a kustomize bug, and the README file notes that kustomize v2.0.3 is the suggested version.

And which platform did you try, GCP or an on-prem cluster? Thanks.

@wanglj-lanzer
Author

GCP

I installed kustomize via go get sigs.k8s.io/kustomize/cmd/kustomize

@jinchihe
Member

jinchihe commented Jul 26, 2019

See the note in the README file:
Note: kustomize v2.0.3 is recommended because of a problem in kustomize v2.1.0

Could you please try v2.0.3 and follow the README file? The new kustomize has a bug here.

@wanglj-lanzer
Author

I get a new error after reinstalling kustomize at the recommended version:

Error: rawResources failed to read Resources: Load from path ../base failed: '../base' must be a file (got d='~/examples/mnist/training/base')

@andrewcantos

I have also experienced the same issues as @wanglj-lanzer, both on the latest kustomize and on v2.0.3.

@jinchihe
Member

Following the README in the repo, that should work fine. Thanks.

@andrewcantos

Unfortunately it doesn't work as expected on my machine. I have made sure I am running v2.0.3 against a fresh clone of this examples repo. I followed the README instructions and get this error:

Error: rawResources failed to read Resources: Load from path ../base failed: '../base' must be a file (got d='~/kubeflow_examples/mnist/training/base')

Note: I ran this from mnist/training/GCS directory.

I was able to successfully build from local directory.

@jinchihe
Member

jinchihe commented Jul 30, 2019

@andrewcantos @wanglj-lanzer I'm not sure which step has the problem, but it works fine in my environment. Below is an example, thanks:

[root@jinchi1 GCS]# kustomize version
Version: {KustomizeVersion:2.0.3 GitCommit:a6f65144121d1955266b0cd836ce954c04122dc8 BuildDate:2019-03-05T20:37:42Z GoOs:linux GoArch:amd64}
[root@jinchi1 GCS]# BUCKET=distributed-test
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=name=mnist-train-dist
[root@jinchi1 GCS]# kustomize edit set image training-image=docker.io/ibmlei/mykubeflow:0.1
[root@jinchi1 GCS]# ../base/definition.sh --numPs 1 --numWorkers 2
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=trainSteps=200
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=batchSize=100
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=learningRate=0.01
[root@jinchi1 GCS]# MODEL_PATH=my-model
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=modelDir=gs://${BUCKET}/${MODEL_PATH}
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=exportDir=gs://${BUCKET}/${MODEL_PATH}/export
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=secretName=user-gcp-sa
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=secretMountPath=/var/secrets
[root@jinchi1 GCS]# kustomize edit add configmap mnist-map-training --from-literal=GOOGLE_APPLICATION_CREDENTIALS=/var/secrets/user-gcp-sa.json
[root@jinchi1 GCS]# kustomize build . |kubectl apply -f -
configmap/mnist-map-training-gfththd74g created
tfjob.kubeflow.org/mnist-train-dist created

@ricoms

ricoms commented Aug 6, 2019

I'm following this tutorial. I'm pretty sure I followed the instructions correctly and I reached the same problem at step kustomize build .

$ kustomize build .
Error: no matches for OriginalId kubeflow.org_v1beta2_TFJob|~X|$(trainingName); no matches for CurrentId kubeflow.org_v1beta2_TFJob|~X|$(trainingName); failed to find unique target for patch kubeflow.org_v1beta2_TFJob|$(trainingName)

I installed kustomize from "Quickly curl the latest" here

my version is as below:
Version: {KustomizeVersion:3.1.0 GitCommit:95f3303493fdea243ae83b767978092396169baf BuildDate:2019-07-26T18:11:16Z GoOs:linux GoArch:amd64}

@jinchihe
Member

jinchihe commented Aug 6, 2019

@ricoms please use kustomize v2.0.3; the new kustomize has bugs here.
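
A minimal sketch of checking which kustomize is on the PATH before building. It is an illustration only: the helper names `kustomize_version_of` and `is_recommended_kustomize` are hypothetical, and the parsing assumes the `Version: {KustomizeVersion:... }` output format quoted in this thread, which other kustomize releases may not print.

```shell
# Hypothetical helper: extract the KustomizeVersion field from the text
# printed by `kustomize version` (format as quoted in this thread).
kustomize_version_of() {
  printf '%s\n' "$1" | sed -n 's/.*KustomizeVersion:\([^ }]*\).*/\1/p'
}

# Succeeds only when the reported version is exactly 2.0.3,
# the version the mnist README recommends.
is_recommended_kustomize() {
  [ "$(kustomize_version_of "$1")" = "2.0.3" ]
}

# Usage: only runs the check if a kustomize binary is on the PATH.
if command -v kustomize >/dev/null 2>&1; then
  if is_recommended_kustomize "$(kustomize version)"; then
    echo "kustomize 2.0.3 found"
  else
    echo "kustomize on PATH is not 2.0.3" >&2
  fi
fi
```

If the field is absent from the output, the extracted version is empty and the check simply reports that the binary is not 2.0.3.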

@ricoms

ricoms commented Aug 6, 2019

Using v2.0.3, I hit the issue you opened, @jinchihe (#1029), when I ran kustomize edit set image training-image=$IMAGE_PATH

I created a tag with export TAG=$(date +"%m-%d-%Y") and added it to IMAGE_PATH like this: export IMAGE_PATH=us.gcr.io/$PROJECT_ID/kubeflow-train:$TAG.

Then I was able to run kustomize build . | kubectl apply -f - successfully. The response was the YAML output followed by the lines below:

configmap/mnist-map-training-4c92g6hh6b unchanged
tfjob.kubeflow.org/my-train-1 unchanged

But the original "bug" seems so "simple"... the problem appears to be the trainingName variable, as pointed out here. The kustomization.yaml requires the trainingName variable, and it does not seem to be set anywhere. I would like to solve this problem by upgrading the tutorial to kustomize 3.1.0. Is it very hard? Could someone point me to the problem and to where I could help?

@jinchihe
Member

jinchihe commented Aug 7, 2019

@ricoms See kubernetes-sigs/kustomize#1295. I think that has been fixed, and the fix may be merged in 3.x.

@lucastheis

Calling kustomize_3.x.x edit [...] once replaces

bases:
- ../base

with

resources:
- ../base

which leads to Error: rawResources failed to read Resources: Load from path ../base failed. Reverting that change and using kustomize_2.0.3 fixed it for me.
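
The revert described above can be sketched as follows. This is a hedged illustration, not part of the mnist example: `restore_bases_key` is a hypothetical helper name, and it assumes every entry under the top-level `resources:` key is a directory such as `../base`.

```shell
# Hypothetical helper (not from the mnist example): rename a top-level
# `resources:` key back to `bases:` in a kustomization file.
# Assumption: every entry under that key is a directory such as ../base.
restore_bases_key() (
  file=$1
  tmp=$(mktemp) || exit 1
  # Only a line consisting of exactly `resources:` is rewritten.
  sed 's/^resources:[[:space:]]*$/bases:/' "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  exit "$status"
)

# Usage, run from mnist/training/GCS before `kustomize build .`:
#   restore_bases_key kustomization.yaml
```

The helper edits the file in place through a temporary copy, so it does not depend on the `sed -i` flag, whose syntax differs between GNU and BSD sed.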

@sarika-pst

Getting the same error:
$ kustomize build . |kubectl apply -f -
Error: no matches for OriginalId extensions_v1beta1_Deployment|~X|$(svcName); no matches for CurrentId extensions_v1beta1_Deployment|~X|$(svcName); failed to find unique target for patch extensions_v1beta1_Deployment|$(svcName)
error: no objects passed to apply
Any solution for this?
Thanks in advance.

@jinchihe
Member

@aicdp1010 please try it with kustomize v2.0.3.

@sarika-pst

The above error is resolved with v2.0.3, but now I am getting:

$ kustomize build . |kubectl apply -f -
configmap/mnist-map-training-2fg9c7886m created
error: unable to recognize "STDIN": no matches for kind "TFJob" in version "kubeflow.org/v1beta2"

I am new to this. Is the issue related to the Kubernetes cluster? Please guide me.
Thank you.

@jinchihe
Member

@aicdp1010 please check your cluster: which TFJob version is installed in it?

@shreya-verma19

> Using v2.0.3, I hit the issue you opened, @jinchihe (#1029), when I ran kustomize edit set image training-image=$IMAGE_PATH
>
> I created a tag with export TAG=$(date +"%m-%d-%Y") and added it to IMAGE_PATH like this: export IMAGE_PATH=us.gcr.io/$PROJECT_ID/kubeflow-train:$TAG.
>
> Then I was able to run kustomize build . | kubectl apply -f - successfully. The response was the YAML output followed by the lines below:
>
> configmap/mnist-map-training-4c92g6hh6b unchanged
> tfjob.kubeflow.org/my-train-1 unchanged
>
> But the original "bug" seems so "simple"... the problem appears to be the trainingName variable, as pointed out here. The kustomization.yaml requires the trainingName variable, and it does not seem to be set anywhere. I would like to solve this problem by upgrading the tutorial to kustomize 3.1.0. Is it very hard? Could someone point me to the problem and to where I could help?

Hi, I tried to add trainingName, but it already exists:
kustomize edit add configmap mnist-map-training --from-literal=trainingName=test_kubeflow
Error: cannot add key trainingName, another key by that name already exists: map[GOOGLE_APPLICATION_CREDENTIALS:/var/secrets/user-gcp-sa.json batchSize:100 exportDir:gs://my_bucket_kubeflow/my-model/export learningRate:0.01 modelDir:gs://my_bucket_kubeflow/my-model name:my-train-1 secretMountPath:/var/secrets secretName:user-gcp-sa trainSteps:200 trainingName:test_kubeflow]

But when I try to build again, it gives this error:

cloudshell:~/examples/mnist/training/GCS$ kustomize build
Error: no matches for OriginalId kubeflow.org_v1beta2_TFJob|~X|$(trainingName); no matches for CurrentId kubeflow.org_v1beta2_TFJob|~X|$(trainingName); failed to find unique target for patch kubeflow.org_v1beta2_TFJob|$(trainingName)

@jinchihe
Member

jinchihe commented Oct 9, 2019

@shreya-verma19 did you execute it twice? If so, you need to remove the first one; it seems kustomize does not support overwriting the value.
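
One way to remove the first value before adding the key again is sketched below. It is an illustration under assumptions: `drop_configmap_literal` is a hypothetical helper, it assumes the literals appear in kustomization.yaml as list items of the form `- key=value`, and it assumes the key contains no regular-expression metacharacters.

```shell
# Hypothetical helper: delete every `- <key>=...` list item from a
# kustomization file so the key can be added again with a new value.
# Assumption: literals are stored as list items of the form "- key=value".
drop_configmap_literal() (
  file=$1
  key=$2
  tmp=$(mktemp) || exit 1
  # Removes lines such as "  - trainingName=test_kubeflow".
  sed "/^[[:space:]]*-[[:space:]]*${key}=/d" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  exit "$status"
)

# Usage: drop the stale value, then add the key again.
#   drop_configmap_literal kustomization.yaml trainingName
#   kustomize edit add configmap mnist-map-training --from-literal=trainingName=<new value>
```

Editing kustomization.yaml by hand achieves the same result; the helper only makes the step repeatable.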

@bruscar010

bruscar010 commented Oct 31, 2019

hi @jinchihe

I'm following this guide here and get:

configmap/mnist-map-training-45h47275m7 unchanged
error: unable to recognize "STDIN": no matches for kind "TFJob" in version "kubeflow.org/v1beta2"

I've been reading through a couple of these threads and I can't find a solution. I'm looking for any advice on what I can do.

Version: {KustomizeVersion:2.0.3 GitCommit:a6f65144121d1955266b0cd836ce954c04122dc8 BuildDate:2019-03-05T20:37:42Z GoOs:linux GoArch:amd64}

I ran kubectl describe crd tfjobs.kubeflow.org and this is what it returned:

Name:         tfjobs.kubeflow.org
Namespace:    
Labels:       app.kubernetes.io/component=tfjob
              app.kubernetes.io/instance=tf-job-crds-v0.7.0
              app.kubernetes.io/managed-by=kfctl
              app.kubernetes.io/name=tf-job-crds
              app.kubernetes.io/part-of=kubeflow
              app.kubernetes.io/version=v0.7.0
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"apiextensions.k8s.io/v1beta1","kind":"CustomResourceDefinition","metadata":{"annotations":{},"labels":{"app.kubernetes.io/c...
API Version:  apiextensions.k8s.io/v1beta1
Kind:         CustomResourceDefinition
Metadata:
  Creation Timestamp:  2019-10-29T23:53:03Z
  Generation:          1
  Resource Version:    2620
  Self Link:           /apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/tfjobs.kubeflow.org
  UID:                 3eb96b55-faa7-11e9-9ab9-42010a840fdc
Spec:
  Additional Printer Columns:
    JSON Path:  .status.conditions[-1:].type
    Name:       State
    Type:       string
    JSON Path:  .metadata.creationTimestamp
    Name:       Age
    Type:       date
  Conversion:
    Strategy:  None
  Group:       kubeflow.org
  Names:
    Kind:       TFJob
    List Kind:  TFJobList
    Plural:     tfjobs
    Singular:   tfjob
  Scope:        Namespaced
  Subresources:
    Status:
  Validation:
    openAPIV3Schema:
      Properties:
        Spec:
          Properties:
            Tf Replica Specs:
              Properties:
                Chief:
                  Properties:
                    Replicas:
                      Maximum:  1
                      Minimum:  1
                      Type:     integer
                PS:
                  Properties:
                    Replicas:
                      Minimum:  1
                      Type:     integer
                Worker:
                  Properties:
                    Replicas:
                      Minimum:  1
                      Type:     integer
  Version:                      v1
  Versions:
    Name:     v1
    Served:   true
    Storage:  true
Status:
  Accepted Names:
    Kind:       TFJob
    List Kind:  TFJobList
    Plural:     tfjobs
    Singular:   tfjob
  Conditions:
    Last Transition Time:  2019-10-29T23:53:03Z
    Message:               no conflicts found
    Reason:                NoConflicts
    Status:                True
    Type:                  NamesAccepted
    Last Transition Time:  <nil>
    Message:               the initial names have been accepted
    Reason:                InitialNamesAccepted
    Status:                True
    Type:                  Established
  Stored Versions:
    v1
Events:  <none>

@jinchihe
Member

jinchihe commented Nov 1, 2019

Your TFJob CRD is v1, so we need to update the TFJob version to v1.
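
To compare the version the cluster serves with the version the manifests request, something like the following may help. It is a sketch, not a verified procedure for this example: both function names are hypothetical, `served_tfjob_versions` needs `kubectl` access to the cluster, and the `mnist/training` path in the usage comment is an assumption about where the YAML files live.

```shell
# Hypothetical helper: versions the cluster serves for the TFJob CRD.
# Requires kubectl and access to the cluster.
served_tfjob_versions() {
  kubectl get crd tfjobs.kubeflow.org -o jsonpath='{.spec.versions[*].name}'
}

# Hypothetical helper: distinct top-level kubeflow.org apiVersion lines
# declared in the *.yaml files under the given directory.
declared_kubeflow_api_versions() {
  find "$1" -type f -name '*.yaml' \
    -exec grep -h '^apiVersion: kubeflow.org/' {} + | sort -u
}

# Usage (directory path is an assumption):
#   served_tfjob_versions
#   declared_kubeflow_api_versions mnist/training
```

If the two disagree, for example the manifests declare kubeflow.org/v1beta2 while the CRD serves only v1, kubectl cannot match the TFJob kind, which is consistent with the "no matches for kind" error reported here.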

@bruscar010

@jinchihe is that as simple as changing one of the yaml files from v1beta2 to v1?

@jinchihe
Member

jinchihe commented Nov 1, 2019

@FrancisLennon17 I think all the YAML files need to be updated. I will update this once I get a chance. Thanks.

@tarrade

tarrade commented Nov 19, 2019

I got the same issues and did the following (as mentioned above):

  • use PATH=${WORKING_DIR}/bin:$PATH so that the shell picks up KustomizeVersion:2.0.3 and not some other version you have installed
  • in mnist/training/GCS/kustomization.yaml, replace "resources" with "bases" so that it looks like:
bases: 
- ../base

You can also look at the correct YAML for serving as an example: mnist/serving/GCS/kustomization.yaml

Then everything is working fine on my side.

@plaffitte

@jinchihe I tried replacing all v1beta2 with v1 in all .yaml files but I get the same issue with v1:

unable to recognize "STDIN": no matches for kind "TFJob" in version "kubeflow.org/v1"

@dhfromkorea

I had the same issue as reported by @plaffitte.

@paulkarayan

paulkarayan commented Dec 24, 2019

Based on @lucastheis's situation, I downgraded to an older kustomize package
(you can do this with brew, following https://www.fernandomc.com/posts/brew-install-legacy-hugo-site-generator/)
brew install https://raw.githubusercontent.com/Homebrew/homebrew-core/b4e8be2ffb5b9552901d802911dbae522654cf1e/Formula/kustomize.rb

Then, if I edit GCS/kustomization.yaml to replace "resources" with "bases", e.g.

images:
- name: training-image
  newTag: latest
bases:
- ../base
configMapGenerator:

etc

then this works. really obnoxious.

@jtfogarty

/area example/mnist
/area kustomize
/priority p2

@stale

stale bot commented Apr 15, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot closed this as completed Apr 23, 2020