Memory leak or stall? #40
Comments
Hi this may be a strange thing to assess but would you be able to try running the same thing without thank you! |
Thanks for your quick response! Running without |
Hm, that's very strange indeed. I had a suspicion that would it be too much to ask if you could either share the entire dataset or a small subset that causes the problem on your machine? I'd be very interested in seeing if I can reproduce it too; this may be more complicated than I thought. I understand that it's tricky to ask for private data, so no worries if you cannot, or if it's more convenient you can always send it to me at desenabr[at]usc[dot]edu. Finally, and this is admittedly a Hail Mary, does the program also fail if you don't use any flags? Just Thank you! |
@schorlton If you are only able to share part of the dataset, please include something with the same structure for the read names unless they are really short and contiguous. Also, try to include any special characters that would be in the original file, for example by obtaining the smaller part with command line tools. Also, the most helpful part would have similar variability in lengths of reads. Thanks! |
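A minimal sketch of one way to cut such a subset while keeping read names, special characters and the spread of read lengths intact. It takes every n-th record rather than the first few, and works on raw bytes so nothing is re-encoded. The function names and the output file name are hypothetical, and it assumes plain 4-line FASTQ records (no wrapped sequence lines).

```python
import gzip
from itertools import islice
from typing import Iterable, Iterator, List


def fastq_records(lines: Iterable[bytes]) -> Iterator[List[bytes]]:
    """Group lines into 4-line FASTQ records (assumes unwrapped sequences)."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record at end of input")
        yield record


def every_nth_record(lines: Iterable[bytes], step: int) -> Iterator[bytes]:
    """Yield the lines of records 0, step, 2*step, ... unchanged."""
    if step < 1:
        raise ValueError("step must be >= 1")
    for index, record in enumerate(fastq_records(lines)):
        if index % step == 0:
            yield from record


def subsample_file(src: str, dst: str, step: int) -> None:
    """Write every step-th record of a gzipped FASTQ to a new gzipped file."""
    # Binary mode keeps read names and quality strings byte-for-byte.
    with gzip.open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        fout.writelines(every_nth_record(fin, step))


# Hypothetical usage: keep 1 record in 10 from the file discussed in this issue.
# subsample_file("nanopore.fastq.gz", "subset.fastq.gz", 10)
```

Sampling across the whole file, instead of taking the head, is more likely to preserve the variability in read lengths. Whether a given subset still triggers the problem has to be checked by running it.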
Running |
I can report that, using the provided file:
If @guilhermesena1 notices the same, we will need to find the problem and update conda. In the meantime, @schorlton if you are in an environment where you can build |
Glad I'm not going crazy 😝 |
Problem identified. Hopefully we can have a fix soon on conda. If you need it before we can do that, you'll have to try building from source -- preferably the release I linked above. |
The problem can also be reproduced on a clone/release if we delete/rename the files in the @schorlton if you need a quick fix you can point to the configuration files in conda as follows
I think when compiling falco internally through conda (since they only leave the binary), they're still not finding the Configuration folder under |
I think I found the source of the problem and pushed a possible fix at bca5f11. When we used the default values, the shortest adapter size wasn't being set properly, and subtracting 1 led to an underflow. You should also be able to run the code just using the Sorry for the inconvenience and thank you again for sharing the issue with us! |
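To illustrate why an unset size can turn into runaway memory use, here is a small sketch of unsigned wrap-around. It is not falco's code (commit bca5f11 is the authoritative fix); it only assumes the size was held in an unsigned type such as C++ `size_t`, and the function name is hypothetical. `ctypes.c_size_t` is used because it wraps like the C type instead of going negative.

```python
import ctypes

# Largest value a size_t can hold on this platform.
SIZE_MAX = ctypes.c_size_t(-1).value


def unsigned_minus_one(shortest_adapter_size: int) -> int:
    """Mimic `size_t n = shortest_adapter_size - 1;` in unsigned arithmetic."""
    # ctypes performs no overflow checking, so -1 wraps around to SIZE_MAX.
    return ctypes.c_size_t(shortest_adapter_size - 1).value


# A properly set size behaves as expected.
print(unsigned_minus_one(12))

# A size left at 0 wraps to the maximum value instead of becoming -1.
print(unsigned_minus_one(0) == SIZE_MAX)
```

Any loop bound or buffer size derived from such a wrapped value would be enormous, which would be consistent with the stall and memory growth reported at the start of this issue.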
Thanks for the quick fix! Will test out the conda update once released. |
Unfortunately, I'm now getting a new error on 1.2.1 from bioconda:
falco -skip-summary -skip-report nanopore.fastq.gz
[limitst] using file /opt/conda/opt/falco-1.2.1/Configuration/limits.txt
instruction for limit duplication not found in file /opt/conda/opt/falco-1.2.1 |
yep trying to figure this one out.
it runs ok, but not if you don't specify the file, even if the URL is correct. |
alright, I know better than to get my hopes up, but we made some changes to the most recent release and merged them to conda to address the issue. As far as I can tell, a conda download of falco puts the config files at |
It works - thank you! The only remaining issue is specifying the file format. Specifying |
We will create another issue to address this. Ideally fq and fq.gz should be treated equally, but the reading and processing are a bit different, and technically calling I'll close this issue considering that the initial problem has been resolved, but any further problems regarding the stalling can always be added by reopening this. |
Thanks for falco!
Running v1.2.0 installed from bioconda on a nanopore FASTQ
It looks like it generates the fastqc_data.txt properly, but then after that, it consumes over 32GB of RAM over several minutes until I kill it... Is this expected? For comparison, FastQC processes the file in 18s within 1GB RAM.
Here are the stats of the file:
seqkit stats nanopore.fastq.gz
file               format  type  num_seqs      sum_len  min_len  avg_len  max_len
nanopore.fastq.gz  FASTQ   DNA     30,720  204,534,138      100    6,658   35,768