
Upgrade google-cloud-java to 0.30.0. #3855

Closed
cmnbroad wants to merge 1 commit into master from cn_upgrade_google_cloud_java

Conversation

cmnbroad
Collaborator

No description provided.

@codecov-io

Codecov Report

Merging #3855 into master will decrease coverage by 0.003%.
The diff coverage is 100%.

@@               Coverage Diff               @@
##              master     #3855       +/-   ##
===============================================
- Coverage     79.442%   79.439%   -0.003%     
  Complexity     17829     17829               
===============================================
  Files           1168      1168               
  Lines          64346     64346               
  Branches        9823      9823               
===============================================
- Hits           51118     51116        -2     
- Misses          9310      9312        +2     
  Partials        3918      3918
Impacted Files                                          Coverage Δ              Complexity Δ
...stitute/hellbender/cmdline/CommandLineProgram.java   86% <100%> (ø)          29 <0> (ø) ⬇️
...e/hellbender/engine/spark/SparkContextFactory.java   71.233% <0%> (-2.74%)   11% <0%> (ø)

@droazen
Contributor

droazen commented Nov 20, 2017

Let's test whether this fixes our longstanding auth issues on Spark before merging (#3591). @jean-philippe-martin, would you have time to test?

@droazen droazen self-requested a review November 20, 2017 15:56
@droazen droazen self-assigned this Nov 20, 2017
@jean-philippe-martin
Contributor

With a service account key set, it worked like a charm:

$ ./gatk-launch PrintReadsSpark -I gs://jpmartin-testing-project/hellbender-test-inputs/CEUTrio.HiSeq.WGS.b37.ch20.1m-2m.NA12878.bam -O gs://jpmartin-testing-project/test-output/readcount --shardedOutput true -- --sparkRunner GCS --cluster jps-test-cluster
(...)
[November 20, 2017 6:17:08 PM UTC] org.broadinstitute.hellbender.tools.spark.pipelines.PrintReadsSpark done. Elapsed time: 0.72 minutes.
Runtime.totalMemory()=670040064
Job [13c93a62-96d0-456e-91d1-ef7b20f1236b] finished successfully.

Though I understand that this case was expected to work.

So next I tried it without any HELLBEND* environment variables, and it worked as well!

Job [6e2f2c6b-921a-4fdf-a42e-0706216b2098] finished successfully.
(...)
$ gsutil ls -lh gs://jpmartin-testing-project/test-output/readcount/
       0 B  2017-11-20T18:28:27Z  gs://jpmartin-testing-project/test-output/readcount/
       0 B  2017-11-20T18:28:52Z  gs://jpmartin-testing-project/test-output/readcount/_SUCCESS
120.25 MiB  2017-11-20T18:28:51Z  gs://jpmartin-testing-project/test-output/readcount/part-r-00000.bam

This is with GOOGLE_APPLICATION_CREDENTIALS set, as I believe is part of the GATK README instructions.

Next I went to my repro code and tried it again with v30. It failed (StorageException: Error code 404 trying to get security access token from Compute Engine metadata for the default service account). I'm not sure why, but the new version is certainly an improvement over the previous one, since it fixes PrintReadsSpark.
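For context, the behavior described here matches Google's documented Application Default Credentials lookup order, which the client library walks through at startup: an explicit key file named by GOOGLE_APPLICATION_CREDENTIALS wins, then the gcloud well-known credentials file, then the Compute Engine metadata server. A minimal illustrative sketch (the function name and inputs are hypothetical, not GATK or google-cloud-java code):

```python
def resolve_adc_source(env, gcloud_config_file_exists, on_gce_vm):
    """Pick the credential source the way Application Default Credentials
    is documented to: an explicit key file wins, then the gcloud
    well-known file, then the Compute Engine metadata server."""
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "key file from GOOGLE_APPLICATION_CREDENTIALS"
    if gcloud_config_file_exists:
        return "gcloud well-known credentials file"
    if on_gce_vm:
        return "Compute Engine metadata server"
    return None  # no credentials found; API calls will fail

# With the env var set (as in the successful runs above), the key file wins
# even on a GCE VM:
print(resolve_adc_source({"GOOGLE_APPLICATION_CREDENTIALS": "/key.json"},
                         False, True))
```

This is why setting GOOGLE_APPLICATION_CREDENTIALS masks any problem with the metadata server: the lookup never reaches it.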

@jean-philippe-martin
Contributor

I tried CountReadsSpark and that also worked fine:

$ ./gatk-launch CountReadsSpark -I gs://$BUCKET/hellbender-test-inputs/CEUTrio.HiSeq.WGS.b37.ch20.1m-2m.NA12878.bam -O gs://$BUCKET/test-output/readcount_2 -- --sparkRunner GCS --cluster jps-test-cluster
[November 20, 2017 7:04:27 PM UTC] org.broadinstitute.hellbender.tools.spark.pipelines.CountReadsSpark done. Elapsed time: 0.43 minutes.
Runtime.totalMemory()=653787136
Job [d9b686ed-3971-4494-b98b-336f751a449d] finished successfully.
(...)
$ gsutil cat gs://$BUCKET/test-output/readcount_2
836574

@droazen
Contributor

droazen commented Nov 27, 2017

I tried this branch out and got the dreaded 404 error, unfortunately:

$ ./gatk-launch CountReadsSpark -I gs://hellbender/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.bam -- --sparkRunner GCS --cluster droazen-test-cluster --executor-cores 2 --num-executors 2
Using GATK jar /Users/droazen/src/hellbender/build/libs/gatk-package-4.beta.6-54-g0ee99da-SNAPSHOT-spark.jar
jar caching is disabled because GATK_GCS_STAGING is not set

please set GATK_GCS_STAGING to a bucket you have write access to in order to enable jar caching
add the following line to your .bashrc or equivalent startup script

    export GATK_GCS_STAGING=gs://<my_bucket>/

Replacing spark-submit style args with dataproc style args

--cluster droazen-test-cluster --executor-cores 2 --num-executors 2 -> --cluster droazen-test-cluster --properties spark.driver.userClassPathFirst=true,spark.io.compression.codec=lzf,spark.driver.maxResultSize=0,spark.executor.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.driver.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.kryoserializer.buffer.max=512m,spark.yarn.executor.memoryOverhead=600,spark.executor.cores=2,spark.executor.instances=2

Running:
    gcloud dataproc jobs submit spark --cluster droazen-test-cluster --properties spark.driver.userClassPathFirst=true,spark.io.compression.codec=lzf,spark.driver.maxResultSize=0,spark.executor.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.driver.extraJavaOptions=-DGATK_STACKTRACE_ON_USER_EXCEPTION=true -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=false -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 ,spark.kryoserializer.buffer.max=512m,spark.yarn.executor.memoryOverhead=600,spark.executor.cores=2,spark.executor.instances=2 --jar /Users/droazen/src/hellbender/build/libs/gatk-package-4.beta.6-54-g0ee99da-SNAPSHOT-spark.jar -- CountReadsSpark -I gs://hellbender/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.bam --sparkMaster yarn
Job [acdae2af-e0ce-4822-87f5-dcd165d85cf4] submitted.
Waiting for job output...
20:39:42.869 WARN  SparkContextFactory - Environment variables HELLBENDER_TEST_PROJECT and HELLBENDER_JSON_SERVICE_ACCOUNT_KEY must be set or the GCS hadoop connector will not be configured properly
20:39:43.053 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/tmp/acdae2af-e0ce-4822-87f5-dcd165d85cf4/gatk-package-4.beta.6-54-g0ee99da-SNAPSHOT-spark.jar!/com/intel/gkl/native/libgkl_compression.so
[November 27, 2017 8:39:43 PM UTC] CountReadsSpark  --input gs://hellbender/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.bam --sparkMaster yarn  --readValidationStringency SILENT --interval_set_rule UNION --interval_padding 0 --interval_exclusion_padding 0 --interval_merging_rule ALL --bamPartitionSize 0 --disableSequenceDictionaryValidation false --shardedOutput false --numReducers 0 --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false --gcs_max_retries 20 --disableToolDefaultReadFilters false
[November 27, 2017 8:39:43 PM UTC] Executing as root@droazen-test-cluster-m on Linux 3.16.0-4-amd64 amd64; OpenJDK 64-Bit Server VM 1.8.0_131-8u131-b11-1~bpo8+1-b11; Version: 4.beta.6-54-g0ee99da-SNAPSHOT
20:39:43.245 INFO  CountReadsSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 1
20:39:43.245 INFO  CountReadsSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
20:39:43.245 INFO  CountReadsSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : false
20:39:43.245 INFO  CountReadsSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
20:39:43.245 INFO  CountReadsSpark - Deflater: IntelDeflater
20:39:43.245 INFO  CountReadsSpark - Inflater: IntelInflater
20:39:43.245 INFO  CountReadsSpark - GCS max retries/reopens: 20
20:39:43.245 INFO  CountReadsSpark - Using google-cloud-java: 0.30.0-alpha
20:39:43.245 INFO  CountReadsSpark - Initializing engine
20:39:43.245 INFO  CountReadsSpark - Done initializing engine
17/11/27 20:39:44 INFO org.spark_project.jetty.util.log: Logging initialized @3893ms
17/11/27 20:39:44 INFO org.spark_project.jetty.server.Server: jetty-9.3.z-SNAPSHOT
17/11/27 20:39:44 INFO org.spark_project.jetty.server.Server: Started @3988ms
17/11/27 20:39:44 INFO org.spark_project.jetty.server.AbstractConnector: Started ServerConnector@7fbe38a{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
17/11/27 20:39:44 INFO com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase: GHFS version: 1.6.1-hadoop2
17/11/27 20:39:45 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at droazen-test-cluster-m/10.240.0.10:8032
17/11/27 20:39:47 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1511814592376_0002
17/11/27 20:39:52 INFO org.spark_project.jetty.server.AbstractConnector: Stopped Spark@7fbe38a{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20:39:52.363 INFO  CountReadsSpark - Shutting down engine
[November 27, 2017 8:39:52 PM UTC] org.broadinstitute.hellbender.tools.spark.pipelines.CountReadsSpark done. Elapsed time: 0.16 minutes.
Runtime.totalMemory()=630718464
code:      0
message:   Error code 404 trying to get security access token from Compute Engine metadata for the default service account. This may be because the virtual machine instance does not have permission scopes specified.
reason:    null
location:  null
retryable: false
com.google.cloud.storage.StorageException: Error code 404 trying to get security access token from Compute Engine metadata for the default service account. This may be because the virtual machine instance does not have permission scopes specified.
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:189)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:340)
	at com.google.cloud.storage.StorageImpl$5.call(StorageImpl.java:197)
	at com.google.cloud.storage.StorageImpl$5.call(StorageImpl.java:194)
	at shaded.cloud_nio.com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:89)
	at com.google.cloud.RetryHelper.run(RetryHelper.java:74)
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:51)
	at com.google.cloud.storage.StorageImpl.get(StorageImpl.java:194)
	at com.google.cloud.storage.contrib.nio.CloudStorageFileSystemProvider.checkAccess(CloudStorageFileSystemProvider.java:614)
	at java.nio.file.Files.exists(Files.java:2385)
	at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:355)
	at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:206)
	at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:162)
	at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:118)
	at org.broadinstitute.hellbender.engine.ReadsDataSource.<init>(ReadsDataSource.java:87)
	at org.broadinstitute.hellbender.engine.spark.datasources.ReadsSparkSource.getHeader(ReadsSparkSource.java:182)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.initializeReads(GATKSparkTool.java:390)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.initializeToolInputs(GATKSparkTool.java:370)
	at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:360)
	at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:38)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:119)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:176)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:195)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:137)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:158)
	at org.broadinstitute.hellbender.Main.main(Main.java:239)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Error code 404 trying to get security access token from Compute Engine metadata for the default service account. This may be because the virtual machine instance does not have permission scopes specified.
	at shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials.refreshAccessToken(ComputeEngineCredentials.java:152)
	at shaded.cloud_nio.com.google.auth.oauth2.OAuth2Credentials.refresh(OAuth2Credentials.java:175)
	at shaded.cloud_nio.com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:161)
	at shaded.cloud_nio.com.google.auth.http.HttpCredentialsAdapter.initialize(HttpCredentialsAdapter.java:96)
	at com.google.cloud.http.HttpTransportOptions$1.initialize(HttpTransportOptions.java:157)
	at shaded.cloud_nio.com.google.api.client.http.HttpRequestFactory.buildRequest(HttpRequestFactory.java:93)
	at shaded.cloud_nio.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:300)
	at shaded.cloud_nio.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
	at shaded.cloud_nio.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
	at shaded.cloud_nio.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:338)
	... 33 more
ERROR: (gcloud.dataproc.jobs.submit.spark) Job [acdae2af-e0ce-4822-87f5-dcd165d85cf4] entered state [ERROR] while waiting for [DONE].
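For reference, the 404 in the stack trace comes from the credential refresh step: on a GCE VM, ComputeEngineCredentials fetches an access token from the instance metadata server at a documented endpoint. A hedged sketch of that request and how the status codes are usually read (the `diagnose` helper is hypothetical, added here only to spell out the failure mode discussed in this thread):

```python
from urllib import request

# Documented GCE metadata-server endpoint that ComputeEngineCredentials
# queries for the default service account's access token.
TOKEN_URL = ("http://metadata.google.internal/computeMetadata/v1/"
             "instance/service-accounts/default/token")

def fetch_default_token():
    """Only works from inside a GCE VM; shown for illustration."""
    req = request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})
    with request.urlopen(req, timeout=5) as resp:
        return resp.read()

def diagnose(status):
    """Hypothetical helper mapping metadata-server HTTP statuses to the
    failure modes discussed in this thread."""
    if status == 200:
        return "token issued for the default service account"
    if status == 404:
        return ("no default service account (or no scopes) visible to "
                "this request -- the error seen above")
    return "unexpected metadata-server response"
```

Note the 404 is returned by the metadata server itself, not by Cloud Storage, which is why it shows up regardless of which bucket is being read.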

@jean-philippe-martin
Contributor

@droazen, I was able to reproduce your result. I tried to isolate what made it work or not.

I tried with two kinds of inputs:

  • on the hellbender bucket, or
  • on my own bucket

I tried with two choices for GOOGLE_APPLICATION_CREDENTIALS:

  • default credentials, or
  • my own

I tried with two different clusters:

  • one created in the Broad project, or
  • one created in my own project.

With every one of those eight combinations, I got the same result: the dreaded "Error code 404 trying to get security access token from Compute Engine metadata for the default service account."

./gatk-launch CountReadsSpark -I gs://hellbender/test/resources/large/CEUTrio.HiSeq.WGS.b37.NA12878.20.21.bam -- --sparkRunner GCS --cluster jp-test-cluster --executor-cores 2 --num-executors 2
com.google.cloud.storage.StorageException: Error code 404 trying to get security access token from Compute Engine metadata for the default service account. This may be because the virtual machine instance does not have permission scopes specified.
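The eight runs described above are just the Cartesian product of the three binary choices; a small sketch to make the matrix explicit (labels paraphrased from the comment):

```python
from itertools import product

inputs = ["hellbender bucket", "own bucket"]
credentials = ["default credentials", "own credentials"]
clusters = ["Broad-project cluster", "own-project cluster"]

# Every (input, credentials, cluster) combination that was tested.
combinations = list(product(inputs, credentials, clusters))
print(len(combinations))  # 8 runs, all of which hit the same 404
```

That all eight combinations fail identically is what points away from bucket permissions or user credentials and toward the metadata-server lookup itself.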

@droazen
Contributor

droazen commented May 10, 2018

I tried again just now with the latest google-cloud-java (0.47.0-alpha:shaded) and the latest Dataproc image (1.2.34) and got the same error. I've asked Google to reopen googleapis/google-cloud-java#2453.

@droazen
Contributor

droazen commented Aug 24, 2018

Closing this one. The next release of google-cloud-java will fix our longstanding 404 errors, so we'll update to that release when it's out.

@droazen droazen closed this Aug 24, 2018
@droazen droazen deleted the cn_upgrade_google_cloud_java branch August 24, 2018 19:02
4 participants