Potential infinite loop in Queue

Hi,

I've been running Queue with the old DataProcessingPipeline.scala script (unmodified) for over a day now, and I'm starting to think there's an infinite loop. The output keeps adding QNodes; right now it's up to almost 1,700,000. I've run the same script before on 1000 smaller BAM files (about 0.5 GB each), and that seemed to work fine. Now I'm trying to run it on 1000 exomes, each averaging 8 GB. I'm using the following options:

Here's the output:

INFO 11:09:17,548 HelpFormatter - ---------------------------------------------
INFO 11:09:17,548 HelpFormatter - Queue v2.7-4-g6f46d11, Compiled 2013/10/10 17:29:52
INFO 11:09:17,548 HelpFormatter - Copyright (c) 2012 The Broad Institute
INFO 11:09:17,549 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
DEBUG 11:09:17,549 HelpFormatter - Current directory: /cluster/ifs/projects/Exomes/Biesecker_CS_WE/mito_express/ES/tmp
INFO 11:09:17,549 HelpFormatter - Program Args: -S /home/singhln/Projects/ES/src/DataProcessingPipeline.scala -bwa /home/singhln/bin/bwa -bt 10 -i /home/singhln/Projects/ES/Data/allalignedbams.list -R /home/singhln/Projects/ES/Data/human_g1k_v37.fasta -D /home/singhln/Projects/ES/Data/dbsnp_137.b37.vcf -p CS -bwape -log cses.log -gv -qsub -startFromScratch -jobReport queuereport.txt -memLimit 4 -tempDir /home/singhln/Projects/ES/tmp -runDir /home/singhln/Projects/ES/Dedup -l DEBUG -run
INFO 11:09:17,550 HelpFormatter - Date/Time: 2013/12/03 11:09:17
INFO 11:09:17,550 HelpFormatter - ---------------------------------------------
INFO 11:09:17,550 HelpFormatter - ---------------------------------------------
INFO 11:09:17,561 QCommandLine - Scripting DataProcessingPipeline
DEBUG 11:10:46,667 QGraph - adding QNode: 0
DEBUG 11:10:46,859 QGraph - adding QNode: 100
DEBUG 11:10:47,024 QGraph - adding QNode: 200
DEBUG 11:10:47,136 QGraph - adding QNode: 300
DEBUG 11:10:47,241 QGraph - adding QNode: 400
...
...
DEBUG 10:54:46,948 QGraph - adding QNode: 1647200
DEBUG 10:55:11,294 QGraph - adding QNode: 1647300
DEBUG 10:55:23,623 QGraph - adding QNode: 1647400
DEBUG 10:55:40,081 QGraph - adding QNode: 1647500
DEBUG 10:55:52,678 QGraph - adding QNode: 1647600
DEBUG 10:56:02,486 QGraph - adding QNode: 1647700

I'm not even sure how I'd go about debugging this or if this is normal, but it does seem very strange to me. No output seems to have been created during the last 24 hours either, other than the log file.

Thanks for any help,
-Larry.
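
One way to tell a stalled run from a slow one is to compare how fast QNodes were added at the start of the log and at the end. The following is a minimal Python sketch, not part of Queue; the helper names `parse_qnode_lines` and `nodes_per_second` are hypothetical, and it assumes the log lines look exactly like the DEBUG lines quoted in the post. Because the log carries no dates, it also assumes a window spans at most one midnight.

```python
import re
from datetime import datetime

# Matches lines such as "DEBUG 10:54:46,948 QGraph - adding QNode: 1647200".
QNODE_LINE = re.compile(
    r"^DEBUG\s+(\d{2}:\d{2}:\d{2},\d{3})\s+QGraph - adding QNode: (\d+)"
)

def parse_qnode_lines(lines):
    """Return (timestamp, node_count) pairs for each 'adding QNode' line."""
    samples = []
    for line in lines:
        match = QNODE_LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S,%f")
            samples.append((stamp, int(match.group(2))))
    return samples

def nodes_per_second(samples):
    """Average node-creation rate between the first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two QNode lines")
    (start, first), (end, last) = samples[0], samples[-1]
    elapsed = (end - start).total_seconds()
    if elapsed < 0:
        # Assumption: the window crossed midnight exactly once.
        elapsed += 24 * 60 * 60
    return (last - first) / elapsed

# The two windows quoted in the post.
EARLY = [
    "DEBUG 11:10:46,667 QGraph - adding QNode: 0",
    "DEBUG 11:10:47,241 QGraph - adding QNode: 400",
]
LATE = [
    "DEBUG 10:54:46,948 QGraph - adding QNode: 1647200",
    "DEBUG 10:56:02,486 QGraph - adding QNode: 1647700",
]

early_rate = nodes_per_second(parse_qnode_lines(EARLY))
late_rate = nodes_per_second(parse_qnode_lines(LATE))
print(f"early: {early_rate:.1f} nodes/s, late: {late_rate:.1f} nodes/s")
```

On the lines quoted above, this gives roughly 697 nodes per second at the start against about 6.6 at the end, so the graph is still growing but each batch of 100 nodes takes far longer than it did initially.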

Answers

  • Just an update: I ran my job on just 10 BAM files and that completed. So I guess the jobs are being split too much? Is there a way to control how many QNodes are generated?

  • If I go by linear scaling, there should be about 2.4 million sub-jobs, which is way too many. I'll give the -sg parameter a try, see what happens, and get back to you. Thanks a lot!

  • So it seems that the scatter-gather was creating too many jobs; I had to set -sg down to about 4 for it to be okay. Reading the manual a little more closely, it seems that's the suggested amount. Thanks for the help, pdexheimer.
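
The linear-scaling estimate mentioned in the replies can be written out as a short Python sketch. The function names are hypothetical, and the only figures taken from the thread are the 2.4 million estimate, the 1000 exomes, and the 10-BAM test run. The job count of the 10-BAM run was not reported, so the value used below is back-calculated from the estimate and is purely an assumption.

```python
def jobs_per_input(total_jobs, n_inputs):
    """Average number of sub-jobs generated per input file."""
    return total_jobs / n_inputs

def extrapolate_jobs(jobs_observed, inputs_observed, inputs_target):
    """Linearly scale a job count from a small run to a larger one."""
    return jobs_observed * inputs_target / inputs_observed

# Figures from the thread: about 2.4 million sub-jobs estimated for 1000 exomes.
per_bam = jobs_per_input(2_400_000, 1000)

# Assumption: a 10-BAM run at the same per-BAM rate (not a reported number).
small_run_jobs = per_bam * 10

estimate = extrapolate_jobs(small_run_jobs, 10, 1000)
print(f"{per_bam:.0f} jobs per BAM, {estimate:.0f} jobs for 1000 BAMs")
```

The estimate implies about 2,400 sub-jobs per BAM, which suggests why lowering the scatter count brought the graph down to a manageable size.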
