Meaning of -nt and -nct

hrbigelow, San Francisco, Member, Posts: 9
edited September 2013 in Ask the GATK team


According to the GATK documentation:

There are two options for multi-threading with the GATK, controlled by the arguments -nt and -nct, respectively, which can be combined:

-nt / --num_threads controls the number of data threads sent to the processor
-nct / --num_cpu_threads_per_data_thread controls the number of CPU threads allocated to each data thread
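
Read literally, these two definitions imply that the total number of CPU threads is the product of the two arguments. A minimal sketch of that arithmetic follows; the function name is hypothetical, and it reflects only the documented wording, not necessarily what GATK does internally.

```python
def documented_total_threads(nt: int, nct: int) -> int:
    """Total CPU threads implied by the documented definitions:
    nt data threads, each allocated nct CPU threads."""
    if nt < 1 or nct < 1:
        raise ValueError("nt and nct must both be at least 1")
    return nt * nct


# Example: -nt 8 -nct 6 would imply 8 * 6 CPU threads under this reading.
print(documented_total_threads(8, 6))
```

Under this reading, a setting such as -nt 8 -nct 6 asks for far more threads than a 12-core machine has cores.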

RHEL5, 144 GB memory, 12 cores (Intel 2.8 GHz)

~/src/jre1.7.0_40/bin/java -Xmx64g -Xms32g -d64 -jar /apps/gau/GATK_versions/GATKLite-2.1/GenomeAnalysisTKLite.jar -nt 8 -nct 6 -L chr8:90000001-120000000 -rbs 10000000 -T UnifiedGenotyper -rf BadCigar -R /dev/shm/CEUref.hg19.fasta -glm BOTH -D /dev/shm/dbsnp_135.hg19.reordered.vcf -metrics test.metrics.txt -stand_call_conf 30.0 -stand_emit_conf 10.0 -dcov 1000 --max_alternate_alleles 10 -A AlleleBalance -A AlleleBalanceBySample -A BaseCounts -A BaseQualityRankSumTest -A DepthOfCoverage -A DepthPerAlleleBySample -A FisherStrand -A HaplotypeScore -A HardyWeinberg -A IndelType -A LowMQ -A MappingQualityRankSumTest -A MappingQualityZero -A MappingQualityZeroBySample -A MappingQualityZeroFraction -A QualByDepth -A ReadPosRankSumTest -A RMSMappingQuality -A SampleList -o chunk55.vcf -I ./AC2181ACXX_DS-124072_GAGTGG_L006_001.markdup.fixed.left.recal.rehead.bam -I ./AC2181ACXX_DS-124113_GTCCGC_L001_001.markdup.fixed.left.recal.rehead.bam -I ./AC2181ACXX_DS-124080_ATTCCT_L008_001.markdup.fixed.left.recal.rehead.bam -I ./AD23GUACXX_DS-124122_AGTCAA_L005_001.markdup.fixed.left.recal.rehead.bam...

(with 116 bam files)

/usr/sbin/lsof -p <java process ID> | wc -l
returns 728 open files. (This turns out to be the 116 BAM files, each opened 8 times.)
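
Note that `wc -l` counts every open file of the process, not only BAM files. To check how often each BAM is actually open, the lsof output can be tallied per path. The sketch below is one way to do that; the function name and the sample lines are hypothetical, and it assumes the file name is the last whitespace-separated field on each lsof line.

```python
from collections import Counter


def count_open_bams(lsof_output: str) -> Counter:
    """Count how many times each .bam path appears in `lsof -p <pid>` output.

    Assumes the file name is the last whitespace-separated field on each
    line, which breaks for paths that contain spaces.
    """
    counts = Counter()
    for line in lsof_output.splitlines():
        fields = line.split()
        if fields and fields[-1].endswith(".bam"):
            counts[fields[-1]] += 1
    return counts


# Hypothetical, abbreviated lsof lines for illustration only.
sample = """\
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    4242 user   10r   REG    8,1     1024  111 /data/a.bam
java    4242 user   11r   REG    8,1     1024  111 /data/a.bam
java    4242 user   12r   REG    8,1     2048  222 /data/b.bam
java    4242 user   13r   REG    8,1      512  333 /data/ref.fasta
"""
counts = count_open_bams(sample)
print(counts)
```

In practice the input text would come from running lsof on the real process, for example via `subprocess.run([...], capture_output=True, text=True).stdout`.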

When run with the following settings, I see some strange messages:

-nt 12 -nct 1
INFO 16:57:41,908 SAMDataSource - Running in asynchronous I/O mode; number of threads = 11

-nt 12 -nct 2
INFO 16:58:46,246 SAMDataSource - Running in asynchronous I/O mode; number of threads = 10
INFO 16:58:46,763 MicroScheduler - Running the GATK in parallel mode with 2 concurrent threads

and so on, until:

-nt 12 -nct 11
INFO 17:00:07,673 SAMDataSource - Running in asynchronous I/O mode; number of threads = 1
INFO 17:00:08,288 MicroScheduler - Running the GATK in parallel mode with 11 concurrent threads

-nt 12 -nct 13
ERROR MESSAGE: Invalid thread allocation. User requested 12 threads in total, but the count of cpu threads (13) is higher than the total threads

If -nt is the 'number of data threads', then why does SAMDataSource report <nt> - <nct> as the number of 'threads'?
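
The pattern in the log lines above can be summarised in a small model. This is only a sketch fitted to the four runs shown; the function name is hypothetical and it is not GATK's actual allocation code.

```python
def observed_io_threads(nt: int, nct: int) -> int:
    """Model of the SAMDataSource thread count seen in the logs above.

    Fitted to the runs shown (-nt 12 with -nct 1, 2, 11, 13). Behaviour for
    other settings, including nct == nt, is not covered by those logs.
    """
    if nct > nt:
        # Mirrors the "Invalid thread allocation" error seen for -nt 12 -nct 13.
        raise ValueError(
            f"requested {nt} threads in total, but cpu threads ({nct}) is higher"
        )
    return nt - nct


for nct in (1, 2, 11):
    print(f"-nt 12 -nct {nct}: {observed_io_threads(12, nct)} I/O threads")
```

In this model -nt behaves like a total thread budget from which -nct is subtracted, rather than a count of data threads.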

If -nct is the 'number of CPU threads per data thread', then the total number of CPU threads running should really be <nt> * <nct>. Instead, it seems to be <nt> - <nct>, which makes no sense according to the definitions. The documentation also states:

Memory considerations for multi-threading
Each data thread needs to be given the full amount of memory you’d normally give a single run. So if you’re running a tool that normally requires 2 Gb of memory to run, if you use -nt 4, the multithreaded run will use 8 Gb of memory
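
The memory rule quoted above is a simple multiplication. A minimal sketch, with a hypothetical function name, assuming memory scales linearly with -nt exactly as the quoted text says:

```python
def estimated_memory_gb(single_run_gb: float, nt: int) -> float:
    """Memory implied by the quoted rule: each of the nt data threads
    needs the full amount a single-threaded run would use."""
    if nt < 1:
        raise ValueError("nt must be at least 1")
    return single_run_gb * nt


# The quoted example: a 2 GB tool run with -nt 4.
print(estimated_memory_gb(2, 4))
```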

In any case, all other software I'm familiar with has no notion of a 'data thread', and it seems unnecessary and wasteful -- one simply specifies the inputs, and chooses a number of CPU threads, and the program handles the rest, without reading the same input multiple times.





  • Geraldine_VdAuwera, Administrator, Dev, Posts: 10,677

    Hey Henry, I just noticed from your other post that you are using an older version of GATK. Can you try this with version 2.7 and let me know if you see the same pattern with different settings of -nt and -nct?

    Geraldine Van der Auwera, PhD

  • tommycarstensen, United Kingdom, Member, Posts: 400

    Hello @hrbigelow. Did you ever try this out with a newer version of GATK by any chance?
