Suggestion for prodigal gene-calling parallelization #1344

Closed
brymerr921 opened this issue Feb 4, 2020 · 2 comments

Comments

@brymerr921
Contributor

Hi anvi'o developers,

I was wondering if you could modify the steps that use prodigal so that they split the input file into roughly equal-sized pieces (one per requested CPU) and run prodigal on the pieces in an embarrassingly parallel fashion. This would let users make much better use of multi-CPU systems and speed up the gene-calling steps all in one go! Thanks so much.
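To make the idea concrete, here is a minimal sketch of the split-and-run approach. It is not how anvi'o implements it; the helper names (`read_fasta`, `split_balanced`, `run_prodigal`, `call_genes_parallel`) and the chunk file naming are hypothetical. It assumes `prodigal` is on the `PATH` and uses only its `-i`, `-o`, `-f`, and `-p` options.

```python
import heapq
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file."""
    header, parts = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(parts)
                header, parts = line[1:], []
            elif line:
                parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_balanced(records, n_parts):
    """Greedy longest-first split into bins of similar total length."""
    bins = [[] for _ in range(n_parts)]
    heap = [(0, i) for i in range(n_parts)]  # (total bases, bin index)
    for header, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        total, i = heapq.heappop(heap)  # currently smallest bin
        bins[i].append((header, seq))
        heapq.heappush(heap, (total + len(seq), i))
    return [b for b in bins if b]  # drop bins that stayed empty


def run_prodigal(chunk_path):
    """Run prodigal on one chunk; output name is a hypothetical convention."""
    out_path = chunk_path.with_suffix(".gff")
    subprocess.run(
        ["prodigal", "-i", str(chunk_path), "-o", str(out_path),
         "-f", "gff", "-p", "meta"],
        check=True,
    )
    return out_path


def call_genes_parallel(fasta_path, work_dir, n_cpus):
    """Write one FASTA chunk per CPU and run prodigal on all chunks at once."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    for i, part in enumerate(split_balanced(read_fasta(fasta_path), n_cpus)):
        chunk_path = work_dir / f"chunk_{i}.fa"
        with open(chunk_path, "w") as out:
            for header, seq in part:
                out.write(f">{header}\n{seq}\n")
        chunk_paths.append(chunk_path)
    # Threads are enough here: the heavy work happens in the prodigal processes.
    with ThreadPoolExecutor(max_workers=n_cpus) as pool:
        return list(pool.map(run_prodigal, chunk_paths))
```

One caveat with this sketch: prodigal's default single-genome mode trains on whatever input it is given, so splitting a genome could change its predictions. The sketch passes `-p meta`, which uses pre-trained models and should not depend on how the contigs are grouped. The per-chunk outputs would still need to be merged afterwards.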

Best,
Bryan

@ekiefl
Contributor

ekiefl commented Feb 4, 2020

Hey @brymerr921, do you think that gene calling is a bottleneck in speed? I predicted gene calls for approximately 350,000,000 million genes in less than 6 hours the other week.

EDIT: It turns out I was exaggerating: that was with external gene calls being processed. Do you have any data on how long this step is taking?

@meren
Member

meren commented Jul 29, 2020

Hey @brymerr921, probably it is too late for your needs, but I still wanted to close this by mentioning that this is now addressed in master thanks to @mooreryan's recent PRs #1437, #1468, and #1445.
