✨ multi readgroup input #59

Merged
merged 13 commits into from
Nov 5, 2024

Contributor

@dmiller15 dmiller15 commented Oct 22, 2024

Description

This update brings multi-read group processing to RNAseq.

The workflow can now handle multiple inputs in the form of lists:

  • input_aligned_reads
  • input_pe_reads
  • input_pe_mates
  • input_se_reads

Aligned read inputs have their @RG headers extracted and used for processing. For all other read inputs, read group information is provided in list format using:

  • input_pe_rg_strs
  • input_se_rg_strs
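As a rough illustration of what extracting @RG headers from aligned reads involves, here is a minimal Python sketch. It is not the tool used in this PR (the workflow itself is CWL); it only assumes the standard SAM header layout, where each `@RG` line is tab-delimited and every field has the form `TAG:VALUE`. The function name `parse_rg_headers` and the sample header are hypothetical.

```python
from typing import Dict, List


def parse_rg_headers(sam_header: str) -> List[Dict[str, str]]:
    """Return one dict of tag -> value for every @RG line in a SAM header."""
    read_groups: List[Dict[str, str]] = []
    for line in sam_header.splitlines():
        # Only @RG lines describe read groups; skip @HD, @SQ, @PG, etc.
        if not line.startswith("@RG\t"):
            continue
        fields: Dict[str, str] = {}
        for item in line.split("\t")[1:]:
            # Split on the first ':' only, so values such as a DS
            # description may themselves contain colons.
            tag, _, value = item.partition(":")
            fields[tag] = value
        read_groups.append(fields)
    return read_groups


# Hypothetical header text, as it might come from `samtools view -H`.
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@RG\tID:rg1\tSM:sampleA\tPL:ILLUMINA\n"
    "@RG\tID:rg2\tSM:sampleA\tDS:lane 2: rerun\n"
)
for rg in parse_rg_headers(header):
    print(rg["ID"], rg.get("DS", ""))
```

Splitting on the first colon only matters for free-text tags like DS, which is the kind of field the "fix custom RG DS string" commit in this PR touches.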

These lists are converted into records that follow a schema I have defined in an external file and import throughout the workflow. The STAR and Kallisto tools had their inputs updated to accept this record, which helps in building their command lines.
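The idea of turning parallel input lists into one record per read group can be sketched in Python as follows. This is only an illustration under stated assumptions: the actual schema is a CWL record defined in the repository, and the class `ReadGroupRecord`, its field names, the function `build_records`, and the example file names and RG strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReadGroupRecord:
    # Hypothetical fields; the real record schema in the PR may differ.
    reads: str
    mates: Optional[str]  # None for single-end read groups
    rg_str: str
    is_paired_end: bool


def build_records(
    pe_reads: List[str],
    pe_mates: List[str],
    pe_rg_strs: List[str],
    se_reads: List[str],
    se_rg_strs: List[str],
) -> List[ReadGroupRecord]:
    """Zip the parallel input lists into one record per read group."""
    # Parallel lists only make sense if they line up index by index.
    if not (len(pe_reads) == len(pe_mates) == len(pe_rg_strs)):
        raise ValueError("paired-end reads, mates, and RG strings must have equal lengths")
    if len(se_reads) != len(se_rg_strs):
        raise ValueError("single-end reads and RG strings must have equal lengths")

    records = [
        ReadGroupRecord(reads=r, mates=m, rg_str=rg, is_paired_end=True)
        for r, m, rg in zip(pe_reads, pe_mates, pe_rg_strs)
    ]
    records += [
        ReadGroupRecord(reads=r, mates=None, rg_str=rg, is_paired_end=False)
        for r, rg in zip(se_reads, se_rg_strs)
    ]
    return records


# Hypothetical inputs: one paired-end and one single-end read group.
example = build_records(
    pe_reads=["lane1_R1.fastq.gz"],
    pe_mates=["lane1_R2.fastq.gz"],
    pe_rg_strs=["ID:lane1\tSM:sampleA"],
    se_reads=["lane2.fastq.gz"],
    se_rg_strs=["ID:lane2\tSM:sampleA"],
)
for rec in example:
    print(rec.reads, rec.mates, rec.is_paired_end)
```

Bundling each file with its read group string in one record means a downstream tool receives a single list and does not need to re-align several lists by index when assembling its command line.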

This approach mirrors what we have in the DNAseq alignment workflows. Many of the new tools are stolen directly from the alignment workflow repo.

Closes https://d3b.atlassian.net/browse/BIXU-3780

Type of change

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings
  • I have committed any related changes to the PR

🐛 fix custom RG DS string
🔧 add kallisto multi-rg input
🔧 star solo file manifest input, output, log
🚧 centralize record schema
🐛 outputbasename prefix to cutadapt stats
📚 update readme
@dmiller15 dmiller15 added the enhancement and bix-dev labels Oct 22, 2024
@dmiller15 dmiller15 self-assigned this Oct 22, 2024
Member

@migbro migbro left a comment

What a behemoth! Impressive work. I saw just a couple of minor typos, but no obvious flaws in logic.

dmiller15 and others added 2 commits October 22, 2024 14:45
Co-authored-by: Miguel Brown <miguel.a.brown@gmail.com>
@dmiller15 dmiller15 merged commit ac1f938 into master Nov 5, 2024
1 check passed
@dmiller15 dmiller15 deleted the dm-multi-readgroup-input branch November 5, 2024 16:42