Is --native-pair-hmm-threads in GATK 22.214.171.124 the same as -nct in older GATK versions?
I did not find any option for setting the number of threads used by a job (e.g. HaplotypeCaller). I believe it was removed or renamed; I did not find anything specific in the HaplotypeCaller help options.
You have to use the Spark versions of the tools to get multithreaded acceleration; other than that, there is no multithreading option available. The Spark versions can be run locally, without a cluster.
What do you mean by "Spark versions of the tools"? Any link or example?
The Spark tools are documented in the tool docs.
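For illustration, a local Spark run might be put together roughly like the sketch below. It only builds and prints the command line and does not launch GATK. The file names are placeholders, and it assumes a Spark counterpart such as HaplotypeCallerSpark exists in your GATK4 release (check with `gatk --list`); with the gatk wrapper script, Spark-specific arguments go after a bare `--`.

```python
import shlex


def spark_local_command(tool, reference, bam, output, cores=4):
    """Build a gatk command line that runs a Spark tool on the local machine.

    All file names passed in are placeholders chosen by the caller.
    """
    return [
        "gatk", tool,
        "-R", reference,
        "-I", bam,
        "-O", output,
        "--",                                  # Spark arguments follow the separator
        "--spark-runner", "LOCAL",             # run in-process, no cluster needed
        "--spark-master", f"local[{cores}]",   # number of local worker threads
    ]


if __name__ == "__main__":
    cmd = spark_local_command(
        "HaplotypeCallerSpark", "ref.fasta", "sample.bam", "sample.vcf.gz", cores=8
    )
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(cmd))
```

Some Spark tools are marked beta and may have extra requirements, so check the tool docs for the version you are running before relying on this.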
When I set --native-pair-hmm-threads to 16, I know that 16 threads will be used for the pair HMM calculation. What was -nct doing in the previous versions, then (which step of HaplotypeCaller was it speeding up), before it was removed in the new version?
I hope this article will help.
I'm following the RNAseq protocol using GATK4 (sunflower RNAseq samples).
Right now I can't get Picard MarkDuplicates, AddOrReplaceReadGroups and GATK SplitNCigarReads to run multithreaded. The options "NUM_PROCESSORS" and -nt/-nct (respectively) crash the commands.
Yet if I omit those options, the programs work. So I run multiple samples at once to save time, but I don't like doing that; parallelism is what those options are for.
Thanks in advance!
GATK4 uses Apache Spark for multithreading. However, even without the -nt and -nct options, the speed gains over GATK3 are noticeable. BQSR in particular is faster in single-threaded GATK4 than in multithreaded GATK3.
This change was necessary to maintain a stable and clean codebase. If you would like to stick with the old -nt and -nct, you may continue with GATK3 for at least the foreseeable future.
As far as I know, Picard tools do not use multithreading, so they are outside this discussion.
I'm not finding any instructions on how to parallelize Mutect2 processing of WGS data. Could you help me out?
You could split by chromosome.
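A minimal sketch of that scatter-by-chromosome idea, assuming a tumor-only call and human-style contig names (both are placeholders, as are the file names). It only generates the command lines: one Mutect2 job per contig restricted with -L, plus a MergeVcfs command to combine the per-contig VCFs afterwards. How you launch the jobs (shell loop, scheduler, workflow engine) is up to you.

```python
import shlex

# Assumption: contig names as in a typical human reference; replace them with
# the names listed in your own reference's sequence dictionary.
CHROMOSOMES = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def mutect2_commands(reference, tumor_bam, out_prefix, contigs=CHROMOSOMES):
    """Return one Mutect2 command per contig, each limited to it with -L."""
    commands = []
    for contig in contigs:
        commands.append([
            "gatk", "Mutect2",
            "-R", reference,
            "-I", tumor_bam,
            "-L", contig,                           # restrict this job to one contig
            "-O", f"{out_prefix}.{contig}.vcf.gz",  # per-contig output
        ])
    return commands


def merge_command(out_prefix, contigs=CHROMOSOMES):
    """Return a MergeVcfs command that combines the per-contig VCFs in order."""
    cmd = ["gatk", "MergeVcfs"]
    for contig in contigs:
        cmd += ["-I", f"{out_prefix}.{contig}.vcf.gz"]
    cmd += ["-O", f"{out_prefix}.merged.vcf.gz"]
    return cmd


if __name__ == "__main__":
    for cmd in mutect2_commands("ref.fasta", "tumor.bam", "tumor"):
        print(shlex.join(cmd))
    print(shlex.join(merge_command("tumor")))
```

If you go on to FilterMutectCalls, note that each Mutect2 job also writes a stats file next to its VCF; as far as I know these need to be combined as well (GATK ships a MergeMutectStats tool for this), so check the Mutect2 docs for your version.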