CNV disco hanging at stage 12
I’m running CNV discovery on 200 Drosophila whole genomes (30X, 140Mb each). The run goes for 3 minutes, and then hangs at stage 12, with the last log entry being:
INFO 18:46:48,044 DrmaaJobRunner - Submitted job id: 9188590
INFO 18:46:48,045 QGraph - 16832 Pend, 1870 Run, 0 Fail, 3 Done
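To tell a true hang from slow progress, it can help to check whether the QGraph job counts change between status lines. The following is a minimal sketch, assuming the log keeps the status line format shown above; the function names `qgraph_counts` and `is_stalled` are made up for illustration and are not part of Genome STRiP or Queue.

```python
import re

# Matches QGraph status lines in the format seen in the log excerpt,
# e.g. "INFO 18:46:48,045 QGraph - 16832 Pend, 1870 Run, 0 Fail, 3 Done".
QGRAPH_RE = re.compile(
    r"INFO\s+(\d{2}:\d{2}:\d{2}),\d+\s+QGraph\s+-\s+"
    r"(\d+) Pend, (\d+) Run, (\d+) Fail, (\d+) Done"
)


def qgraph_counts(log_text):
    """Return (time, pend, run, fail, done) tuples in log order."""
    return [
        (m.group(1),) + tuple(int(m.group(i)) for i in range(2, 6))
        for m in QGRAPH_RE.finditer(log_text)
    ]


def is_stalled(counts):
    """True if the last two status lines report identical job counts."""
    return len(counts) >= 2 and counts[-1][1:] == counts[-2][1:]
```

Reading the attached log into a string and passing it to `qgraph_counts` gives the sequence of counts. If the Done count never moves while jobs are reported as running, the scheduler side (job state, queue limits) is a reasonable place to look next.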
I've attached my code and log. My sysadmin has configured the server so that Genome STRiP can submit jobs automatically. I've set the code to process each of the major Drosophila scaffolds separately (e.g. "-L chr2L") because (i) there are a lot of small scaffolds, which tend to cause problems, and (ii) the "specify intervals" command doesn't seem to work in this case.
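For reference, one way to pick out the major scaffolds for separate "-L" runs is to filter the reference's FASTA index by length. This is only a sketch: it assumes a standard tab-separated .fai index (name in the first column, length in the second), and the function name `major_scaffolds` and the 1 Mb cutoff are arbitrary choices for illustration.

```python
def major_scaffolds(fai_lines, min_length=1_000_000):
    """Return names of scaffolds at least min_length bp long.

    fai_lines: iterable of lines from a FASTA index (.fai) file, where
    column 1 is the sequence name and column 2 is its length in bp.
    min_length: illustrative cutoff, not a recommended value.
    """
    names = []
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, length = fields[0], int(fields[1])
        if length >= min_length:
            names.append(name)
    return names


if __name__ == "__main__":
    # Made-up index lines for illustration only.
    example = [
        "chr2L\t23000000\t7\t60\t61\n",
        "scaffold_small\t1500\t23400000\t60\t61\n",
    ]
    for name in major_scaffolds(example):
        print("-L", name)
```

Each name returned can then be passed as its own "-L" argument, one run per scaffold, which keeps the small scaffolds out of the job graph.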
The run hangs for at least 12 hours at stage 12. The created metadata and cnv_sentinel folders are empty. The stage 11 directory is populated with folders for each of the ~2000 scaffolds (including unwanted ones), and the stage 12 directory is empty.
I'm going to try this on smaller regions and fiddle with the window sizes a bit, but otherwise I really don't know where the problem is. My custom mask files and configuration data worked for the deletion pipeline. Any suggestions?