
Upgrade google-cloud-java to 0.30.0. #3855

Closed
cmnbroad wants to merge 1 commit into master from cn_upgrade_google_cloud_java

Conversation

cmnbroad
Collaborator

No description provided.

@codecov-io

Codecov Report

Merging #3855 into master will decrease coverage by 0.003%.
The diff coverage is 100%.

@@               Coverage Diff               @@
##              master     #3855       +/-   ##
===============================================
- Coverage     79.442%   79.439%   -0.003%     
  Complexity     17829     17829               
===============================================
  Files           1168      1168               
  Lines          64346     64346               
  Branches        9823      9823               
===============================================
- Hits           51118     51116        -2     
- Misses          9310      9312        +2     
  Partials        3918      3918
Impacted Files                                          Coverage Δ              Complexity Δ
...stitute/hellbender/cmdline/CommandLineProgram.java   86% <100%> (ø)          29 <0> (ø) ⬇️
...e/hellbender/engine/spark/SparkContextFactory.java   71.233% <0%> (-2.74%)   11% <0%> (ø)

@droazen
Contributor

droazen commented Nov 20, 2017

Let's test whether this fixes our longstanding auth issues on Spark before merging (#3591). @jean-philippe-martin, would you have time to test?

@droazen droazen self-requested a review November 20, 2017 15:56
@droazen droazen self-assigned this Nov 20, 2017
@jean-philippe-martin
Contributor

With a service account key set, it worked like a charm:

$ ./gatk-launch PrintReadsSpark -I gs://jpmartin-testing-project/hellbender-test-inputs/CEUTrio.HiSeq.WGS.b37.ch20.1m-2m.NA12878.bam -O gs://jpmartin-testing-project/test-output/readcount --shardedOutput true -- --sparkRunner GCS --cluster jps-test-cluster
(...)
[November 20, 2017 6:17:08 PM UTC] org.broadinstitute.hellbender.tools.spark.pipelines.PrintReadsSpark done. Elapsed time: 0.72 minutes.
Runtime.totalMemory()=670040064
Job [13c93a62-96d0-456e-91d1-ef7b20f1236b] finished successfully.

Though I understand that this case was expected to work.

So next I tried it without any HELLBEND* environment variables, and it worked as well!

Job [6e2f2c6b-921a-4fdf-a42e-0706216b2098] finished successfully.
(...)
$ gsutil ls -lh gs://jpmartin-testing-project/test-output/readcount/
       0 B  2017-11-20T18:28:27Z  gs://jpmartin-testing-project/test-output/readcount/
       0 B  2017-11-20T18:28:52Z  gs://jpmartin-testing-project/test-output/readcount/_SUCCESS
120.25 MiB  2017-11-20T18:28:51Z  gs://jpmartin-testing-project/test-output/readcount/part-r-00000.bam

This is with GOOGLE_APPLICATION_CREDENTIALS set, as I believe is part of the GATK README instructions.

Next I went to my repro code and tried it again with v30. It failed (StorageException: Error code 404 trying to get security access token from Compute Engine metadata for the default service account). I'm not sure why, but the new version is certainly an improvement over the previous one, since it fixes PrintReadsSpark.
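For context, the behavior described here matches Google's documented Application Default Credentials lookup order, which the client library walks through at startup: an explicit key file named by GOOGLE_APPLICATION_CREDENTIALS wins, then the gcloud well-known credentials file, then the Compute Engine metadata server. A minimal illustrative sketch (the function name and inputs are hypothetical, not GATK or google-cloud-java code):

```python
def resolve_adc_source(env, gcloud_config_file_exists, on_gce_vm):
    """Pick the credential source the way Application Default Credentials
    is documented to: an explicit key file wins, then the gcloud
    well-known file, then the Compute Engine metadata server."""
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "key file from GOOGLE_APPLICATION_CREDENTIALS"
    if gcloud_config_file_exists:
        return "gcloud well-known credentials file"
    if on_gce_vm:
        return "Compute Engine metadata server"
    return None  # no credentials found; API calls will fail

# With the env var set (as in the successful runs above), the key file wins
# even on a GCE VM:
print(resolve_adc_source({"GOOGLE_APPLICATION_CREDENTIALS": "/key.json"},
                         False, True))
```

This is why setting GOOGLE_APPLICATION_CREDENTIALS masks any problem with the metadata server: the lookup never reaches it.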

@jean-philippe-martin
Contributor

I tried CountReadsSpark and that also worked fine:

$ ./gatk-launch CountReadsSpark -I gs://$BUCKET/hellbender-test-inputs/CEUTrio.HiSeq.WGS.b37.ch20.1m-2m.NA12878.bam -O gs://$BUCKET/test-output/readcount_2 -- --sparkRunner GCS --cluster jps-test-cluster
[November 20, 2017 7:04:27 PM UTC] org.broadinstitute.hellbender.tools.spark.pipelines.CountReadsSpark done. Elapsed time: 0.43 minutes.
Runtime.totalMemory()=653787136
Job [d9b686ed-3971-4494-b98b-336f751a449d] finished successfully.
(...)
$ gsutil cat gs://$BUCKET/test-output/readcount_2
836574

@droazen
Contributor

droazen commented Nov 27, 2017

I tried this branch out and got the dreaded 404 error, unfortunately:

$ ./gatk-launch CountReadsSpark -I gs://hellbender/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.bam -- --sparkRunner GCS --cluster droazen-test-cluster --executor-cores 2 --num-executors 2
Using GATK jar /Users/droazen/src/hellbender/build/libs/gatk-package-4.beta.6-54-g0ee99da-SNAPSHOT-spark.jar
jar caching is disabled because GATK_GCS_STAGING is not set

please set GATK_GCS_STAGING to a bucket you have write access to in order to enable jar caching
add the following line to your .bashrc or equivalent startup script

    export GATK_GCS_STAGING=gs://<my_bucket>/

Replacing spark-submit style args with dataproc style args

--cluster droazen-test-cluster --executor-cores 2 --num-executors 2 -> --cluster droazen-test-cluster --properties spark.driver.userClassPathFirst=true,spark.io.compression.codec=lzf,spark.driver.maxResultSize=0,spark.executor.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.driver.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.kryoserializer.buffer.max=512m,spark.yarn.executor.memoryOverhead=600,spark.executor.cores=2,spark.executor.instances=2

Running:
    gcloud dataproc jobs submit spark --cluster droazen-test-cluster --properties spark.driver.userClassPathFirst=true,spark.io.compression.codec=lzf,spark.driver.maxResultSize=0,spark.executor.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.driver.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.kryoserializer.buffer.max=512m,spark.yarn.executor.memoryOverhead=600,spark.executor.cores=2,spark.executor.instances=2 --jar /Users/droazen/src/hellbender/build/libs/gatk-package-4.beta.6-54-g0ee99da-SNAPSHOT-spark.jar -- CountReadsSpark -I gs://hellbender/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.bam --sparkMaster yarn
Job [acdae2af-e0ce-4822-87f5-dcd165d85cf4] submitted.
Waiting for job output...
20:39:42.869 WARN  SparkContextFactory - Environment variables HELLBENDER_TEST_PROJECT and HELLBENDER_JSON_SERVICE_ACCOUNT_KEY must be set or the GCS hadoop connector will not be configured properly
20:39:43.053 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/tmp/acdae2af-e0ce-4822-87f5-dcd165d85cf4/gatk-package-4.beta.6-54-g0ee99da-SNAPSHOT-spark.jar!/com/intel/gkl/native/libgkl_compression.so
[November 27, 2017 8:39:43 PM UTC] CountReadsSpark  --input gs://hellbender/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.bam --sparkMaster yarn  --readValidationStringency SILENT --interval_set_rule UNION --interval_padding 0 --interval_exclusion_padding 0 --interval_merging_rule ALL --bamPartitionSize 0 --disableSequenceDictionaryValidation false --shardedOutput false --numReducers 0 --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false --gcs_max_retries 20 --disableToolDefaultReadFilters false
[November 27, 2017 8:39:43 PM UTC] Executing as root@droazen-test-cluster-m on Linux 3.16.0-4-amd64 amd64; OpenJDK 64-Bit Server VM 1.8.0_131-8u131-b11-1~bpo8+1-b11; Version: 4.beta.6-54-g0ee99da-SNAPSHOT
20:39:43.245 INFO  CountReadsSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 1
20:39:43.245 INFO  CountReadsSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
20:39:43.245 INFO  CountReadsSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : false
20:39:43.245 INFO  CountReadsSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
20:39:43.245 INFO  CountReadsSpark - Deflater: IntelDeflater
20:39:43.245 INFO  CountReadsSpark - Inflater: IntelInflater
20:39:43.245 INFO  CountReadsSpark - GCS max retries/reopens: 20
20:39:43.245 INFO  CountReadsSpark - Using google-cloud-java: 0.30.0-alpha
20:39:43.245 INFO  CountReadsSpark - Initializing engine
20:39:43.245 INFO  CountReadsSpark - Done initializing engine
17/11/27 20:39:44 INFO org.spark_project.jetty.util.log: Logging initialized @3893ms
17/11/27 20:39:44 INFO org.spark_project.jetty.server.Server: jetty-9.3.z-SNAPSHOT
17/11/27 20:39:44 INFO org.spark_project.jetty.server.Server: Started @3988ms
17/11/27 20:39:44 INFO org.spark_project.jetty.server.AbstractConnector: Started ServerConnector@7fbe38a{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
17/11/27 20:39:44 INFO com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase: GHFS version: 1.6.1-hadoop2
17/11/27 20:39:45 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at droazen-test-cluster-m/10.240.0.10:8032
17/11/27 20:39:47 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1511814592376_0002
17/11/27 20:39:52 INFO org.spark_project.jetty.server.AbstractConnector: Stopped Spark@7fbe38a{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20:39:52.363 INFO  CountReadsSpark - Shutting down engine
[November 27, 2017 8:39:52 PM UTC] org.broadinstitute.hellbender.tools.spark.pipelines.CountReadsSpark done. Elapsed time: 0.16 minutes.
Runtime.totalMemory()=630718464
code:      0
message:   Error code 404 trying to get security access token from Compute Engine metadata for the default service account. This may be because the virtual machine instance does not have permission scopes specified.
reason:    null
location:  null
retryable: false
com.google.cloud.storage.StorageException: Error code 404 trying to get security access token from Compute Engine metadata for the default service account. This may be because the virtual machine instance does not have permission scopes specified.
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:189)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:340)
	at com.google.cloud.storage.StorageImpl$5.call(StorageImpl.java:197)
	at com.google.cloud.storage.StorageImpl$5.call(StorageImpl.java:194)
	at shaded.cloud_nio.com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:89)
	at com.google.cloud.RetryHelper.run(RetryHelper.java:74)
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:51)
	at com.google.cloud.storage.StorageImpl.get(StorageImpl.java:194)
	at com.google.cloud.storage.contrib.nio.CloudStorageFileSystemProvider.checkAccess(CloudStorageFileSystemProvider.java:614)
	at java.nio.file.Files.exists(Files.java:2385)
	at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:355)
	at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:206)
	at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:162)
	at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:118)
	at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:87)
	at org.broadinstitute.hellbender.engine.spark.datasources.ReadsSparkSource.getHeader(ReadsSparkSource.java:182)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.initializeReads(GATKSparkTool.java:390)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.initializeToolInputs(GATKSparkTool.java:370)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:360)
	at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:38)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:119)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:176)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:195)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:137)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:158)
	at org.broadinstitute.hellbender.Main.main(Main.java:239)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Error code 404 trying to get security access token from Compute Engine metadata for the default service account. This may be because the virtual machine instance does not have permission scopes specified.
	at shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials.refreshAccessToken(ComputeEngineCredentials.java:152)
	at shaded.cloud_nio.com.google.auth.oauth2.OAuth2Credentials.refresh(OAuth2Credentials.java:175)
	at shaded.cloud_nio.com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:161)
	at shaded.cloud_nio.com.google.auth.http.HttpCredentialsAdapter.initialize(HttpCredentialsAdapter.java:96)
	at com.google.cloud.http.HttpTransportOptions$1.initialize(HttpTransportOptions.java:157)
	at shaded.cloud_nio.com.google.api.client.http.HttpRequestFactory.buildRequest(HttpRequestFactory.java:93)
	at shaded.cloud_nio.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:300)
	at shaded.cloud_nio.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
	at shaded.cloud_nio.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
	at shaded.cloud_nio.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:338)
	... 33 more
ERROR: (gcloud.dataproc.jobs.submit.spark) Job [acdae2af-e0ce-4822-87f5-dcd165d85cf4] entered state [ERROR] while waiting for [DONE].
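For reference, the 404 in the stack trace comes from the credential refresh step: on a GCE VM, ComputeEngineCredentials fetches an access token from the instance metadata server at a documented endpoint. A hedged sketch of that request and how the status codes are usually read (the `diagnose` helper is hypothetical, added here only to spell out the failure mode discussed in this thread):

```python
from urllib import request

# Documented GCE metadata-server endpoint that ComputeEngineCredentials
# queries for the default service account's access token.
TOKEN_URL = ("http://metadata.google.internal/computeMetadata/v1/"
             "instance/service-accounts/default/token")

def fetch_default_token():
    """Only works from inside a GCE VM; shown for illustration."""
    req = request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})
    with request.urlopen(req, timeout=5) as resp:
        return resp.read()

def diagnose(status):
    """Hypothetical helper mapping metadata-server HTTP statuses to the
    failure modes discussed in this thread."""
    if status == 200:
        return "token issued for the default service account"
    if status == 404:
        return ("no default service account (or no scopes) visible to "
                "this request -- the error seen above")
    return "unexpected metadata-server response"
```

Note the 404 is returned by the metadata server itself, not by Cloud Storage, which is why it shows up regardless of which bucket is being read.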

@jean-philippe-martin
Contributor

@droazen, I was able to reproduce your result. I tried to isolate what made it work or not.

I tried with two kinds of inputs:

  • on the hellbender bucket, or
  • on my own bucket

I tried with two choices for GOOGLE_APPLICATION_CREDENTIALS:

  • default credentials, or
  • my own

I tried with two different clusters:

  • one created in the Broad project, or
  • one created in my own project.

With every one of those eight combinations, I got the same result: the dreaded "Error code 404 trying to get security access token from Compute Engine metadata for the default service account."

./gatk-launch CountReadsSpark -I gs://hellbender/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.bam -- --sparkRunner GCS --cluster jp-test-cluster --executor-cores 2 --num-executors 2
com.google.cloud.storage.StorageException: Error code 404 trying to get security access token from Compute Engine metadata for the default service account. This may be because the virtual machine instance does not have permission scopes specified.
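The eight runs described above are just the Cartesian product of the three binary choices; a small sketch to make the matrix explicit (labels paraphrased from the comment):

```python
from itertools import product

inputs = ["hellbender bucket", "own bucket"]
credentials = ["default credentials", "own credentials"]
clusters = ["Broad-project cluster", "own-project cluster"]

# Every (input, credentials, cluster) combination that was tested.
combinations = list(product(inputs, credentials, clusters))
print(len(combinations))  # 8 runs, all of which hit the same 404
```

That all eight combinations fail identically is what points away from bucket permissions or user credentials and toward the metadata-server lookup itself.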

@droazen
Contributor

droazen commented May 10, 2018

I tried again just now with the latest google-cloud-java (0.47.0-alpha:shaded) and the latest Dataproc image (1.2.34) and got the same error. I've asked Google to reopen googleapis/google-cloud-java#2453.

@droazen
Contributor

droazen commented Aug 24, 2018

Closing this one. The next release of google-cloud-java will fix our longstanding 404 errors, so we'll update to that release when it's out.

@droazen droazen closed this Aug 24, 2018
@droazen droazen deleted the cn_upgrade_google_cloud_java branch August 24, 2018 19:02
4 participants