-
Notifications
You must be signed in to change notification settings - Fork 602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix NPE due to GCS auth not being passed in when reading BAM header. #2450
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2450 +/- ##
===============================================
+ Coverage 76.261% 76.267% +0.005%
Complexity 10866 10866
===============================================
Files 750 750
Lines 39560 39556 -4
Branches 6916 6916
===============================================
- Hits 30169 30168 -1
+ Misses 6771 6768 -3
Partials 2620 2620
Continue to review full report at Codecov.
|
public JavaRDD<GATKRead> getParallelReads(final String readFileName, final String referencePath, final List<SimpleInterval> intervals, final long splitSize) { | ||
SAMFileHeader header = getHeader(readFileName, referencePath, null); | ||
public JavaRDD<GATKRead> getParallelReads(final String readFileName, final String referencePath, final AuthHolder auth, final List<SimpleInterval> intervals, final long splitSize) { | ||
SAMFileHeader header = getHeader(readFileName, referencePath, auth); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just convert getHeader()
to use an NIO Path in the GCS case, instead of passing around this AuthHolder
object and calling into BAMIO.openBAM()
from the dataflow library? If yes, I'd prefer to do that rather than spread use of the obsolete AuthHolder
class further.
@jean-philippe-martin do you concur?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that would be nicer. Did we succeed at getting NIO working on Spark clients?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a getPath method that does the construction and works around the classpath issue, but there's currently a bug in the NIO code that makes it not work properly. See my stack trace below. I think you said that that bug is fixed already and we may just need to bump the version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I think that's in googleapis/google-cloud-java#1470. I see it was merged in February this year.
Back to @tomwhite with a question/request. |
@tomwhite I ran your branch manually on gcs and get a new error which I believe is a GCS NIO bug that we discussed in samtools/htsjdk#724.
|
8fd790f
to
2cd57bd
Compare
I agree about avoiding |
} catch (Exception e) { | ||
throw new UserException("Failed to read bam header from " + filePath + "\n Caused by:" + e.getMessage(), e); | ||
} | ||
return new ReadsDataSource(IOUtils.getPath(filePath)).getHeader(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrap the construction of the ReadsDataSource
within a try-with-resources block to ensure it's closed before we return the header.
Review complete, back to @tomwhite. Go ahead and merge after you've addressed my one comment. |
Oh, and please also create a ticket to fix the CRAM header reading support after this is merged. As discussed at our meeting today, all you have to do is provide a |
2cd57bd
to
05211ec
Compare
Opened #2463 for the CRAM issue. |
Fixes #2449.