
Potential infinite loop in Queue

larryns Member Posts: 5


I've been running Queue with the old DataProcessingPipeline.scala script (unmodified) for over a day now, and I'm starting to think there's an infinite loop. The output keeps adding QNodes; right now it's up to almost 1,700,000. I've run the same script before with 1000 smaller BAM files (about 0.5 GB each) and that seemed to work fine. Now I'm trying to run it on 1000 exomes, each about 8 GB on average. The options I'm using are listed on the Program Args line of the output.

Here's the output:

INFO 11:09:17,548 HelpFormatter - ---------------------------------------------

INFO 11:09:17,548 HelpFormatter - Queue v2.7-4-g6f46d11, Compiled 2013/10/10 17
INFO 11:09:17,548 HelpFormatter - Copyright (c) 2012 The Broad Institute
INFO 11:09:17,549 HelpFormatter - For support and documentation go to http://ww
DEBUG 11:09:17,549 HelpFormatter - Current directory: /cluster/ifs/projects/Exom
INFO 11:09:17,549 HelpFormatter - Program Args: -S /home/singhln/Projects/ES/src/DataProcessingPipeline.scala -bwa /home/singhln/bin/bwa -bt 10 -i /home/singhln/Projects/ES/Data/allalignedbams.list -R /home/singhln/Projects/ES/Data/human_g1k_v37.fasta -D /home/singhln/Projects/ES/Data/dbsnp_137.b37.vcf -p CS -bwape -log cses.log -gv -qsub -startFromScratch -jobReport queuereport.txt -memLimit 4 -tempDir /home/singhln/Projects/ES/tmp -runDir /home/singhln/Projects/ES/Dedup -l DEBUG -run
INFO 11:09:17,550 HelpFormatter - Date/Time: 2013/12/03 11:09:17

INFO 11:09:17,550 HelpFormatter - ---------------------------------------------

INFO 11:09:17,550 HelpFormatter - ---------------------------------------------

INFO 11:09:17,561 QCommandLine - Scripting DataProcessingPipeline
DEBUG 11:10:46,667 QGraph - adding QNode: 0
DEBUG 11:10:46,859 QGraph - adding QNode: 100
DEBUG 11:10:47,024 QGraph - adding QNode: 200
DEBUG 11:10:47,136 QGraph - adding QNode: 300
DEBUG 11:10:47,241 QGraph - adding QNode: 400
DEBUG 10:54:46,948 QGraph - adding QNode: 1647200
DEBUG 10:55:11,294 QGraph - adding QNode: 1647300
DEBUG 10:55:23,623 QGraph - adding QNode: 1647400
DEBUG 10:55:40,081 QGraph - adding QNode: 1647500
DEBUG 10:55:52,678 QGraph - adding QNode: 1647600
DEBUG 10:56:02,486 QGraph - adding QNode: 1647700

I'm not even sure how I'd go about debugging this or if this is normal, but it does seem very strange to me. No output seems to have been created during the last 24 hours either, other than the log file.
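One way to tell a slow graph build from a true infinite loop is to measure how long each batch of 100 QNodes takes, using the DEBUG lines already in the log. The following is a minimal Python sketch, not part of Queue; the helper names (`parse_qnode_lines`, `seconds_per_100`) are hypothetical, and the sample lines are copied from the output above. Because the log timestamps carry no date, the sketch assumes at most one midnight rollover between consecutive lines.

```python
import re
from datetime import datetime

# Matches Queue DEBUG lines such as "DEBUG 11:10:46,667 QGraph - adding QNode: 0"
QNODE_LINE = re.compile(
    r"^DEBUG (\d{2}:\d{2}:\d{2},\d{3}) QGraph - adding QNode: (\d+)$"
)

def parse_qnode_lines(lines):
    """Return (timestamp, qnode_count) pairs for every 'adding QNode' line."""
    points = []
    for line in lines:
        match = QNODE_LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S,%f")
            points.append((stamp, int(match.group(2))))
    return points

def seconds_per_100(points):
    """Return (qnode_count, seconds per 100 QNodes) between consecutive lines."""
    rates = []
    for (t0, n0), (t1, n1) in zip(points, points[1:]):
        if n1 <= n0:
            continue  # skip repeated or out-of-order counters
        elapsed = (t1 - t0).total_seconds()
        if elapsed < 0:
            elapsed += 86400  # assumption: a single midnight rollover
        rates.append((n1, elapsed * 100 / (n1 - n0)))
    return rates

# Sample lines copied from the log in this post.
EARLY = [
    "DEBUG 11:10:46,667 QGraph - adding QNode: 0",
    "DEBUG 11:10:46,859 QGraph - adding QNode: 100",
    "DEBUG 11:10:47,024 QGraph - adding QNode: 200",
    "DEBUG 11:10:47,136 QGraph - adding QNode: 300",
    "DEBUG 11:10:47,241 QGraph - adding QNode: 400",
]
LATE = [
    "DEBUG 10:54:46,948 QGraph - adding QNode: 1647200",
    "DEBUG 10:55:11,294 QGraph - adding QNode: 1647300",
    "DEBUG 10:55:23,623 QGraph - adding QNode: 1647400",
    "DEBUG 10:55:40,081 QGraph - adding QNode: 1647500",
    "DEBUG 10:55:52,678 QGraph - adding QNode: 1647600",
    "DEBUG 10:56:02,486 QGraph - adding QNode: 1647700",
]

for label, sample in (("early", EARLY), ("late", LATE)):
    for count, rate in seconds_per_100(parse_qnode_lines(sample)):
        print(f"{label}: QNode {count}: {rate:.3f} s per 100 QNodes")
```

On the lines quoted here, each batch of 100 QNodes took roughly 0.1 to 0.2 seconds at the start and roughly 10 to 24 seconds near QNode 1,647,000. The counter is still advancing, which suggests a graph build that keeps slowing down rather than a loop that is stuck, though the log alone cannot confirm that.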

Thanks for any help,


Best Answer


  • larryns Member Posts: 5

    Just an update: I ran my job on just 10 BAM files and it completed. So I guess the jobs are being split too much? Is there a way to control how many QNodes are generated?

  • larryns Member Posts: 5

    If I go by linear scaling, there should be about 2.4 million sub-jobs, which is way too many. I'll give the -sg parameter a try, see what happens, and get back to you. Thanks a lot!

  • larryns Member Posts: 5

    So it seems that the scatter-gather was creating too many jobs. I had to set -sg down to about 4 for it to be okay. Reading the manual a little more closely, it seems that's the suggested amount. Thanks for the help, pdexheimer.
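The linear-scaling estimate mentioned in the thread can be written out as simple arithmetic. This is only a sketch in Python: the function name `extrapolate_jobs` is hypothetical, the 2.4 million figure is the poster's own estimate for 1000 BAM files, and the per-BAM and 10-BAM numbers are derived from it rather than taken from any log. It assumes the job count grows in direct proportion to the number of input BAM files.

```python
def extrapolate_jobs(jobs_observed, bams_observed, bams_target):
    """Linearly scale a job count from one run size to another.

    Assumption: the number of jobs is proportional to the number of BAM files.
    """
    return jobs_observed * bams_target / bams_observed

# Poster's estimate: about 2.4 million sub-jobs for 1000 BAM files.
ESTIMATED_JOBS = 2_400_000
ESTIMATED_BAMS = 1000

# Derived figures (not reported in the thread).
jobs_per_bam = extrapolate_jobs(ESTIMATED_JOBS, ESTIMATED_BAMS, 1)
jobs_for_10_bams = extrapolate_jobs(ESTIMATED_JOBS, ESTIMATED_BAMS, 10)

print(f"Implied jobs per BAM file: {jobs_per_bam:.0f}")
print(f"Implied jobs for a 10-BAM run: {jobs_for_10_bams:.0f}")
```

Under that proportionality assumption, the estimate works out to about 2,400 jobs per BAM file, which is the kind of figure that makes reducing the scatter count with -sg worth trying.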
