equal treatment for supplementary alignments #137

Merged
merged 3 commits into from
May 21, 2020

tedsharpe
Contributor

Allow no sequence in supplementary alignments, just as we do for secondary alignments.

Contributor

@lbergelson left a comment

Looks good. How was this problem manifesting? Bad split guesses? Is there a reasonable test we could add to show that it's fixed now?

@tedsharpe
Contributor Author

The problem manifests when supplementary alignments have no sequence: the guesser doesn't recognize such a record as a valid record start. This change just parallels an earlier commit that allowed no-sequence secondary alignments, so that supplementaries now behave the same way.
Tweaked data to demonstrate validity of fix.
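
A minimal sketch of the kind of check being discussed, not the actual disq code. It assumes the guesser rejects a candidate record whose stored sequence length is zero unless the record is flagged secondary (0x100) or, after this change, supplementary (0x800). The class and method names are hypothetical; only the two flag values come from the SAM specification.

```java
// Hypothetical illustration; not the disq implementation.
class NoSequenceCheck {
    // Flag bits as defined in the SAM specification.
    static final int SECONDARY = 0x100;
    static final int SUPPLEMENTARY = 0x800;

    // Assumed behaviour before the change: only secondary alignments may lack a sequence.
    static boolean acceptedBefore(int flag, int seqLength) {
        return seqLength > 0 || (flag & SECONDARY) != 0;
    }

    // Assumed behaviour after the change: supplementary alignments get the same allowance.
    static boolean acceptedAfter(int flag, int seqLength) {
        return seqLength > 0 || (flag & (SECONDARY | SUPPLEMENTARY)) != 0;
    }

    public static void main(String[] args) {
        int supplementaryNoSeq = SUPPLEMENTARY; // supplementary record with no stored sequence
        System.out.println("before: " + acceptedBefore(supplementaryNoSeq, 0));
        System.out.println("after:  " + acceptedAfter(supplementaryNoSeq, 0));
    }
}
```

In this simplified model, a primary record with no sequence is still rejected in both versions, which matches the PR description of extending the existing secondary-alignment allowance rather than dropping the check.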

pom.xml Outdated
@@ -4,7 +4,7 @@

<groupId>org.disq-bio</groupId>
<artifactId>disq</artifactId>
-<version>0.3.6-SNAPSHOT</version>
+<version>0.3.7-SNAPSHOT</version>
Contributor

I think you accidentally included the snapshot version change here. Could you revert that?

Contributor Author

Sure, but I thought we'd want to bump the version with this PR.

Contributor

It gets bumped automatically as part of the release process. If we bump it manually we'll end up skipping a version.

@lbergelson
Contributor

@tedsharpe Thanks for amending the test data. What happens without this change? Does it break the split guessing so that it eventually runs off the end of the file without finding a split? Or does it cause a crash, or a bad split that leads to data corruption? I'm just curious.

@tedsharpe
Contributor Author

Assuming no splitting index: If the density of supplementaries without sequence is high enough that you never get a run of 10 consecutive primary lines in some partition, the partition will be empty.
It's a weird edge case, but the BAMs I make from local assemblies in SV regions are queryname sorted, and have a lot of supplementaries. That triggers this ugliness.
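
The empty-partition effect can be modeled with a short sketch. This is a simplified, hypothetical model, not the disq guesser: it treats a partition as a list of per-record "recognized" flags and looks for the first run of a required number of consecutive recognized records (10 in the comment above). The class and method names are made up for illustration, and the real guesser works on byte offsets rather than a boolean array.

```java
// Hypothetical model of "need N consecutive recognized records to find a start".
class RunFinder {
    // Returns the index where the first run of runLength consecutive
    // recognized records begins, or -1 if the partition has no such run.
    static int firstRunStart(boolean[] recognized, int runLength) {
        int run = 0;
        for (int i = 0; i < recognized.length; i++) {
            run = recognized[i] ? run + 1 : 0; // an unrecognized record resets the run
            if (run == runLength) {
                return i - runLength + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Every fifth record is an unrecognized no-sequence supplementary.
        boolean[] partition = new boolean[30];
        for (int i = 0; i < partition.length; i++) {
            partition[i] = (i % 5 != 4);
        }
        System.out.println(firstRunStart(partition, 10));
    }
}
```

When no run is found, the model returns -1, which corresponds to the partition being treated as empty.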

@lbergelson
Contributor

Just revert the version bump and this is good to merge.

Contributor

@heuermh left a comment

Thank you, @tedsharpe!

@heuermh merged commit f6f1c8d into disq-bio:master on May 21, 2020