Code exception in GenotypeGVCFs

Hi all,
I ran into an error while generating VCF files with GenotypeGVCFs. Please help. Thanks a lot!
My command:
java -Xmx60g -jar GenomeAnalysisTK-nightly-2016-09-23-gfade77f/GenomeAnalysisTK.jar -T GenotypeGVCFs -R /human_g1k_v37_decoy.fasta -I Human_9.final.bam -nt 6 -V sample1.g.vcf -V sample2.g.vcf ...-o merged.vcf

The error:
...
INFO 08:38:51,783 ProgressMeter - 3:80195101 5.69449994E8 20.6 h 2.2 m 18.3% 4.7 d 92.1 h
INFO 08:39:51,784 ProgressMeter - 3:81012201 5.70449994E8 20.6 h 2.2 m 18.3% 4.7 d 92.0 h
INFO 08:40:51,785 ProgressMeter - 3:82154201 5.71449994E8 20.6 h 2.2 m 18.4% 4.7 d 91.8 h
INFO 08:41:51,786 ProgressMeter - 3:82382201 5.71449994E8 20.7 h 2.2 m 18.4% 4.7 d 91.9 h

ERROR --
ERROR stack trace

java.lang.NullPointerException
at java.util.LinkedList$ListItr.next(LinkedList.java:893)
at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.coveredByDeletion(GenotypingEngine.java:426)
at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateOutputAlleleSubset(GenotypingEngine.java:387)
at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:251)
at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:392)
at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:375)
at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:330)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.regenotypeVC(GenotypeGVCFs.java:311)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:289)
at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:132)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version nightly-2016-09-23-gfade77f):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @fwks
    Hi,

    First, try removing -I Human_9.final.bam from your command. GenotypeGVCFs cannot accept BAM files; it only accepts GVCFs.

    -Sheila

  • fwks, China (Member)

    Sorry, my actual command is:
    java -Xmx60g -jar GenomeAnalysisTK-nightly-2016-09-23-gfade77f/GenomeAnalysisTK.jar -T GenotypeGVCFs -R /human_g1k_v37_decoy.fasta -nt 6 -V sample1.g.vcf -V sample2.g.vcf ...-o merged.vcf

    The error:
    ...
    INFO 08:38:51,783 ProgressMeter - 3:80195101 5.69449994E8 20.6 h 2.2 m 18.3% 4.7 d 92.1 h
    INFO 08:39:51,784 ProgressMeter - 3:81012201 5.70449994E8 20.6 h 2.2 m 18.3% 4.7 d 92.0 h
    INFO 08:40:51,785 ProgressMeter - 3:82154201 5.71449994E8 20.6 h 2.2 m 18.4% 4.7 d 91.8 h
    INFO 08:41:51,786 ProgressMeter - 3:82382201 5.71449994E8 20.7 h 2.2 m 18.4% 4.7 d 91.9 h

    ERROR --
    ERROR stack trace
    java.lang.NullPointerException
    at java.util.LinkedList$ListItr.next(LinkedList.java:893)
    at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.coveredByDeletion(GenotypingEngine.java:426)
    at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateOutputAlleleSubset(GenotypingEngine.java:387)
    at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:251)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:392)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:375)
    at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:330)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.regenotypeVC(GenotypeGVCFs.java:311)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:289)
    at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:132)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
    at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
    at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version nightly-2016-09-23-gfade77f):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions https://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Code exception (see stack trace for error itself)
    ERROR ------------------------------------------------------------------------------------------

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @fwks
    Hi,

    Can you try without -nt 6?

    Thanks,
    Sheila

  • mkschuster, CeMM, Vienna, Austria (Member)

    Dear Sheila,

    I encountered the same NullPointerException when running GenotypeGVCFs from nightly build nightly-2016-09-23-gfade77f with option num_threads 4. However, when running with a single thread, the process finishes without exception. So as you suggested, this seems linked to multi-threading.

    Thank you for your help,
    Michael

    [2016-09-27T11:49:58.395139] STDERR: INFO 11:16:56,599 HelpFormatter - ---------------------------------------------------------------------------------------------
    [2016-09-27T11:49:58.395718] STDERR: INFO 11:16:56,602 HelpFormatter - The Genome Analysis Toolkit (GATK) vnightly-2016-09-23-gfade77f, Compiled 2016/09/23 00:01:14
    [2016-09-27T11:49:58.395746] STDERR: INFO 11:16:56,603 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
    [2016-09-27T11:49:58.395769] STDERR: INFO 11:16:56,603 HelpFormatter - For support and documentation go to https://www.broadinstitute.org/gatk
    [2016-09-27T11:49:58.395792] STDERR: INFO 11:16:56,604 HelpFormatter - [Tue Sep 27 11:16:56 CEST 2016] Executing on Linux 2.6.32-431.20.3.el6.x86_64 amd64
    [2016-09-27T11:49:58.395818] STDERR: INFO 11:16:56,604 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_102-b14
    [2016-09-27T11:49:58.395837] STDERR: INFO 11:16:56,612 HelpFormatter - Program Args: --analysis_type GenotypeGVCFs --dbsnp /data/prod/ngs_resources/gatk_bundle/2.8/b37/dbsnp_138.b37.vcf --excludeIntervals NC_007605 --excludeIntervals hs37d5 --num_threads 4 --out variant_calling_process_cohort_BSA_0051_KBL_HaloPlex_genotyped_raw_snp_raw_indel.vcf.gz --reference_sequence /dev/shm/variant_calling_process_cohort_BSA_0051_KBL_HaloPlex_cache/human_g1k_v37_decoy.fasta --variant variant_calling_merge_cohort_BSA_0051_KBL_HaloPlex_combined.g.vcf.gz
    [2016-09-27T11:49:58.395860] STDERR: INFO 11:16:56,619 HelpFormatter - Executing as mschuster@n004 on Linux 2.6.32-431.20.3.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_102-b14.
    [2016-09-27T11:49:58.395881] STDERR: INFO 11:16:56,620 HelpFormatter - Date/Time: 2016/09/27 11:16:56
    [2016-09-27T11:49:58.395899] STDERR: INFO 11:16:56,620 HelpFormatter - ---------------------------------------------------------------------------------------------
    [2016-09-27T11:49:58.395916] STDERR: INFO 11:16:56,620 HelpFormatter - ---------------------------------------------------------------------------------------------
    [2016-09-27T11:49:58.395932] STDERR: INFO 11:16:56,722 GenomeAnalysisEngine - Strictness is SILENT
    [2016-09-27T11:49:58.395949] STDERR: INFO 11:16:56,878 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    [2016-09-27T11:49:58.395965] STDERR: INFO 11:16:58,077 IntervalUtils - Initial include intervals span 3137454505 loci; exclude intervals span 35649766 loci
    [2016-09-27T11:49:58.395983] STDERR: INFO 11:16:58,078 IntervalUtils - Excluding 35649766 loci from original intervals (1.14% reduction)
    [2016-09-27T11:49:58.396002] STDERR: INFO 11:16:58,079 IntervalUtils - Processing 3101804739 bp from intervals
    [2016-09-27T11:49:58.396021] STDERR: WARN 11:16:58,079 IndexDictionaryUtils - Track variant doesn't have a sequence dictionary built in, skipping dictionary validation
    [2016-09-27T11:49:58.396046] STDERR: INFO 11:16:58,099 MicroScheduler - Running the GATK in parallel mode with 4 total threads, 1 CPU thread(s) for each of 4 data thread(s), of 32 processors available on this machine
    [2016-09-27T11:49:58.396080] STDERR: INFO 11:16:58,196 GenomeAnalysisEngine - Preparing for traversal
    [2016-09-27T11:49:58.396098] STDERR: INFO 11:16:58,213 GenomeAnalysisEngine - Done preparing for traversal
    [2016-09-27T11:49:58.396116] STDERR: INFO 11:16:58,214 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    [2016-09-27T11:49:58.396133] STDERR: INFO 11:16:58,223 ProgressMeter - | processed | time | per 1M | | total | remaining
    [2016-09-27T11:49:58.396150] STDERR: INFO 11:16:58,224 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
    [2016-09-27T11:49:58.396166] STDERR: DEBUG 2016-09-27 11:16:58 BlockCompressedOutputStream Using deflater: Deflater
    [2016-09-27T11:49:58.396182] STDERR: WARN 11:16:58,406 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
    [2016-09-27T11:49:58.396205] STDERR: WARN 11:16:58,408 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
    [2016-09-27T11:49:58.396222] STDERR: INFO 11:16:58,409 GenotypeGVCFs - Notice that the -ploidy parameter is ignored in GenotypeGVCFs tool as this is automatically determined by the input variant files
    [2016-09-27T11:49:58.396245] STDERR: WARN 11:17:00,788 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs
    [2016-09-27T11:49:58.396262] STDERR: INFO 11:17:28,240 ProgressMeter - 1:3374301 0.0 30.0 s 49.6 w 0.1% 7.7 h 7.7 h
    [2016-09-27T11:49:58.396278] STDERR: INFO 11:17:58,243 ProgressMeter - 1:4086501 1000000.0 60.0 s 60.0 s 0.1% 12.7 h 12.6 h
    [2016-09-27T11:49:58.396298] STDERR: WARN 11:18:41,758 ExactAFCalculator - This tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 1: 4601135 has 7 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument. Unless the DEBUG logging level is used, this warning message is output just once per run and further warnings are suppressed.
    [2016-09-27T11:49:58.396317] STDERR: INFO 11:18:58,245 ProgressMeter - 1:6116101 3000000.0 120.0 s 40.0 s 0.2% 16.9 h 16.9 h
    ...
    [2016-09-27T14:11:06.307028] STDERR: INFO 14:10:01,905 ProgressMeter - 3:62030201 5.51449994E8 2.9 h 18.0 s 17.9% 16.1 h 13.3 h
    [2016-09-27T14:11:06.307049] STDERR: INFO 14:11:01,906 ProgressMeter - 3:66032201 5.55449994E8 2.9 h 18.0 s 18.0% 16.1 h 13.2 h
    [2016-09-27T14:11:06.307060] STDERR: ##### ERROR --
    [2016-09-27T14:11:06.307070] STDERR: ##### ERROR stack trace
    [2016-09-27T14:11:06.307080] STDERR: java.lang.NullPointerException
    [2016-09-27T14:11:06.307090] STDERR: at java.util.LinkedList$ListItr.next(LinkedList.java:893)
    [2016-09-27T14:11:06.307099] STDERR: at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.coveredByDeletion(GenotypingEngine.java:426)
    [2016-09-27T14:11:06.307114] STDERR: at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateOutputAlleleSubset(GenotypingEngine.java:387)
    [2016-09-27T14:11:06.307124] STDERR: at org.broadinstitute.gatk.tools.walkers.genotyper.GenotypingEngine.calculateGenotypes(GenotypingEngine.java:251)
    [2016-09-27T14:11:06.307133] STDERR: at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:392)
    [2016-09-27T14:11:06.307143] STDERR: at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:375)
    [2016-09-27T14:11:06.307152] STDERR: at org.broadinstitute.gatk.tools.walkers.genotyper.UnifiedGenotypingEngine.calculateGenotypes(UnifiedGenotypingEngine.java:330)
    [2016-09-27T14:11:06.307162] STDERR: at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.regenotypeVC(GenotypeGVCFs.java:311)
    [2016-09-27T14:11:06.307175] STDERR: at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:289)
    [2016-09-27T14:11:06.307187] STDERR: at org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs.map(GenotypeGVCFs.java:132)
    [2016-09-27T14:11:06.307197] STDERR: at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
    [2016-09-27T14:11:06.307207] STDERR: at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
    [2016-09-27T14:11:06.307216] STDERR: at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    [2016-09-27T14:11:06.307226] STDERR: at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    [2016-09-27T14:11:06.307235] STDERR: at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
    [2016-09-27T14:11:06.307245] STDERR: at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
    [2016-09-27T14:11:06.307254] STDERR: at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
    [2016-09-27T14:11:06.307264] STDERR: at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
    [2016-09-27T14:11:06.307275] STDERR: at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    [2016-09-27T14:11:06.307285] STDERR: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    [2016-09-27T14:11:06.307301] STDERR: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    [2016-09-27T14:11:06.307318] STDERR: at java.lang.Thread.run(Thread.java:745)
    [2016-09-27T14:11:06.307334] STDERR: ##### ERROR ------------------------------------------------------------------------------------------
    [2016-09-27T14:11:06.307346] STDERR: ##### ERROR A GATK RUNTIME ERROR has occurred (version nightly-2016-09-23-gfade77f):
    [2016-09-27T14:11:06.307355] STDERR: ##### ERROR
    [2016-09-27T14:11:06.307365] STDERR: ##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    [2016-09-27T14:11:06.307374] STDERR: ##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
    [2016-09-27T14:11:06.307384] STDERR: ##### ERROR Visit our website and forum for extensive documentation and answers to
    [2016-09-27T14:11:06.307399] STDERR: ##### ERROR commonly asked questions https://www.broadinstitute.org/gatk
    [2016-09-27T14:11:06.307409] STDERR: ##### ERROR
    [2016-09-27T14:11:06.307418] STDERR: ##### ERROR MESSAGE: Code exception (see stack trace for error itself)
    [2016-09-27T14:11:06.307429] STDERR: ##### ERROR ------------------------------------------------------------------------------------------

    [2016-10-04T14:39:20.738635] STDERR: INFO 14:04:46,852 HelpFormatter - ---------------------------------------------------------------------------------------------
    [2016-10-04T14:39:20.757284] STDERR: INFO 14:04:46,871 HelpFormatter - The Genome Analysis Toolkit (GATK) vnightly-2016-09-23-gfade77f, Compiled 2016/09/23 00:01:14
    [2016-10-04T14:39:20.757344] STDERR: INFO 14:04:46,872 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
    [2016-10-04T14:39:20.757528] STDERR: INFO 14:04:46,872 HelpFormatter - For support and documentation go to https://www.broadinstitute.org/gatk
    [2016-10-04T14:39:20.757920] STDERR: INFO 14:04:46,872 HelpFormatter - [Tue Oct 04 14:04:46 CEST 2016] Executing on Linux 2.6.32-431.20.3.el6.x86_64 amd64
    [2016-10-04T14:39:20.758326] STDERR: INFO 14:04:46,873 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_102-b14
    [2016-10-04T14:39:20.758361] STDERR: INFO 14:04:46,881 HelpFormatter - Program Args: --analysis_type GenotypeGVCFs --dbsnp /data/prod/ngs_resources/gatk_bundle/2.8/b37/dbsnp_138.b37.vcf --excludeIntervals NC_007605 --excludeIntervals hs37d5 --num_threads 1 --out variant_calling_process_cohort_BSA_0051_KBL_HaloPlex_genotyped_raw_snp_raw_indel.vcf.gz --reference_sequence /dev/shm/variant_calling_process_cohort_BSA_0051_KBL_HaloPlex_cache/human_g1k_v37_decoy.fasta --variant variant_calling_merge_cohort_BSA_0051_KBL_HaloPlex_combined.g.vcf.gz
    [2016-10-04T14:39:20.758388] STDERR: INFO 14:04:46,949 HelpFormatter - Executing as mschuster@n004 on Linux 2.6.32-431.20.3.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_102-b14.
    [2016-10-04T14:39:20.758646] STDERR: INFO 14:04:46,951 HelpFormatter - Date/Time: 2016/10/04 14:04:46
    [2016-10-04T14:39:20.758671] STDERR: INFO 14:04:46,951 HelpFormatter - ---------------------------------------------------------------------------------------------
    [2016-10-04T14:39:20.758688] STDERR: INFO 14:04:46,952 HelpFormatter - ---------------------------------------------------------------------------------------------
    [2016-10-04T14:39:20.758704] STDERR: INFO 14:04:47,037 GenomeAnalysisEngine - Strictness is SILENT
    [2016-10-04T14:39:20.758719] STDERR: INFO 14:04:47,608 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    [2016-10-04T14:39:20.758735] STDERR: INFO 14:04:49,433 IntervalUtils - Initial include intervals span 3137454505 loci; exclude intervals span 35649766 loci
    [2016-10-04T14:39:20.758751] STDERR: INFO 14:04:49,439 IntervalUtils - Excluding 35649766 loci from original intervals (1.14% reduction)
    [2016-10-04T14:39:20.758945] STDERR: INFO 14:04:49,440 IntervalUtils - Processing 3101804739 bp from intervals
    [2016-10-04T14:39:20.759136] STDERR: WARN 14:04:49,440 IndexDictionaryUtils - Track variant doesn't have a sequence dictionary built in, skipping dictionary validation
    [2016-10-04T14:39:20.759489] STDERR: INFO 14:04:49,874 GenomeAnalysisEngine - Preparing for traversal
    [2016-10-04T14:39:20.759515] STDERR: INFO 14:04:49,901 GenomeAnalysisEngine - Done preparing for traversal
    [2016-10-04T14:39:20.759699] STDERR: INFO 14:04:49,902 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    [2016-10-04T14:39:20.759725] STDERR: INFO 14:04:49,904 ProgressMeter - | processed | time | per 1M | | total | remaining
    [2016-10-04T14:39:20.759743] STDERR: INFO 14:04:49,905 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
    [2016-10-04T14:39:20.759760] STDERR: DEBUG 2016-10-04 14:04:51 BlockCompressedOutputStream Using deflater: Deflater
    [2016-10-04T14:39:20.759777] STDERR: WARN 14:04:51,597 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
    [2016-10-04T14:39:20.759794] STDERR: WARN 14:04:51,600 StrandBiasTest - StrandBiasBySample annotation exists in input VCF header. Attempting to use StrandBiasBySample values to calculate strand bias annotation values. If no sample has the SB genotype annotation, annotation may still fail.
    [2016-10-04T14:39:20.759811] STDERR: INFO 14:04:51,601 GenotypeGVCFs - Notice that the -ploidy parameter is ignored in GenotypeGVCFs tool as this is automatically determined by the input variant files
    [2016-10-04T14:39:20.760402] STDERR: WARN 14:04:57,478 HaplotypeScore - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.variantutils.GenotypeGVCFs
    [2016-10-04T14:39:20.760779] STDERR: INFO 14:05:19,913 ProgressMeter - 1:834401 0.0 30.0 s 49.6 w 0.0% 31.0 h 31.0 h
    [2016-10-04T14:39:20.760802] STDERR: INFO 14:06:19,916 ProgressMeter - 1:1338401 1000000.0 90.0 s 90.0 s 0.0% 57.9 h 57.9 h
    [2016-10-04T14:39:20.760819] STDERR: INFO 14:07:19,917 ProgressMeter - 1:1760901 1000000.0 2.5 m 2.5 m 0.1% 73.4 h 73.4 h
    [2016-10-04T14:39:20.760835] STDERR: INFO 14:08:19,919 ProgressMeter - 1:2271701 2000000.0 3.5 m 105.0 s 0.1% 79.6 h 79.6 h
    [2016-10-04T14:39:20.760851] STDERR: INFO 14:09:19,921 ProgressMeter - 1:3681401 3000000.0 4.5 m 90.0 s 0.1% 63.2 h 63.1 h
    [2016-10-04T14:39:20.761198] STDERR: WARN 14:10:02,085 ExactAFCalculator - This tool is currently set to genotype at most 6 alternate alleles in a given context, but the context at 1: 4601135 has 7 alternate alleles so only the top alleles will be used; see the --max_alternate_alleles argument. Unless the DEBUG logging level is used, this warning message is output just once per run and further warnings are suppressed.
    [2016-10-04T14:39:20.761233] STDERR: INFO 14:10:20,321 ProgressMeter - 1:5220001 5000000.0 5.5 m 66.0 s 0.2% 54.5 h 54.4 h
    ...
    [2016-10-06T00:41:14.358959] STDERR: INFO 00:39:53,583 ProgressMeter - GL000212.1:24901 3.100104058E9 34.6 h 40.0 s 99.9% 34.6 h 67.0 s
    [2016-10-06T00:41:14.358975] STDERR: INFO 00:40:53,585 ProgressMeter - GL000192.1:229001 3.101257243E9 34.6 h 40.0 s 100.0% 34.6 h 12.0 s
    [2016-10-06T00:41:14.358987] STDERR: DEBUG 2016-10-06 00:41:13 BlockCompressedOutputStream Using deflater: Deflater
    [2016-10-06T00:41:14.358996] STDERR: INFO 00:41:14,335 ProgressMeter - done 3.101804739E9 34.6 h 40.0 s 100.0% 34.6 h 0.0 s
    [2016-10-06T00:41:14.359007] STDERR: INFO 00:41:14,336 ProgressMeter - Total runtime 124584.43 secs, 2076.41 min, 34.61 hours

  • valentin, Cambridge, MA (Member, Dev)

    I second @Sheila. That kind of exception is not supposed to be thrown by LinkedList$ListItr.next (the call at the top of the stack trace) under any circumstances, so it is quite likely caused by a race condition.

    That would be a bug, to be fixed either by making sure GenotypeGVCFs is multi-thread friendly or by changing it to fail early, with a clear message saying that -nt is not supported for this tool.

    In any case, the future GATK won't have the -nt option, so it is perhaps better to get used to that early.

    Parallelism can also be achieved (and I would say preferably) by scatter-gathering across the genome.
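
A rough sketch of what scatter-gather could look like in practice, written in Python purely for illustration. It only builds the command lines: one single-threaded GenotypeGVCFs run per interval (using -L and no -nt), plus one concatenation step. The jar path, reference name, GVCF names, interval list, and output naming scheme are all hypothetical placeholders. The gather step assumes the CatVariants utility that ships with GATK 3; check its flags against the version you run.

```python
from typing import List


def scatter_commands(jar: str, reference: str, gvcfs: List[str],
                     intervals: List[str]) -> List[List[str]]:
    """Build one single-threaded GenotypeGVCFs command per interval (no -nt)."""
    commands = []
    for interval in intervals:
        cmd = ["java", "-jar", jar, "-T", "GenotypeGVCFs",
               "-R", reference, "-L", interval]
        for gvcf in gvcfs:
            cmd += ["-V", gvcf]
        # Hypothetical naming scheme: one output VCF per interval.
        cmd += ["-o", "genotyped." + interval + ".vcf"]
        commands.append(cmd)
    return commands


def gather_command(jar: str, reference: str, shard_vcfs: List[str],
                   merged: str = "merged.vcf") -> List[str]:
    """Build a CatVariants command; shard VCFs must be listed in reference order."""
    cmd = ["java", "-cp", jar, "org.broadinstitute.gatk.tools.CatVariants",
           "-R", reference]
    for vcf in shard_vcfs:
        cmd += ["-V", vcf]
    cmd += ["-out", merged, "-assumeSorted"]
    return cmd


if __name__ == "__main__":
    # Placeholder inputs; substitute your own files and contig names.
    jar = "GenomeAnalysisTK.jar"
    ref = "human_g1k_v37_decoy.fasta"
    gvcfs = ["sample1.g.vcf", "sample2.g.vcf"]
    intervals = ["1", "2", "3"]

    scattered = scatter_commands(jar, ref, gvcfs, intervals)
    for cmd in scattered:
        print(" ".join(cmd))
    shards = [cmd[-1] for cmd in scattered]
    print(" ".join(gather_command(jar, ref, shards)))
```

Each printed command can then be run as a separate process or submitted as a separate cluster job, with the gather command run once all of them have finished.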

  • fwks, China (Member)

    @Sheila Yes, when the -nt option was removed, the reported error disappeared. However, it took me 15 days to run the program.

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @fwks
    Hi,

    Have a look at WDL. You can use that to speed up your workflows.

    -Sheila

  • jdenvir, Marshall University (Member)

    I also came across this error, using GATK 3.7-0. Note that the documentation explicitly states

    Parallelism options
    This tool can be run in multi-threaded mode using this option.

    TreeReducible (-nt)

    which (if I understand correctly) implies that the GenotypeGVCFs tool is supposed to support parallelism. So it appears that the documentation is incorrect - can that be fixed?

    @valentin By "scatter-gathering across the genome", I understand you to mean, for example, running GenotypeGVCFs on a chromosome-by-chromosome basis, with each chromosome (or other genomic region) being processed in a separate thread, and then combining the resulting vcfs once all of those are complete. Is that correct? Can that be achieved simply by running the command with different -L options and the same set of input files?

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)

    @jdenvir The tool supports that option; it's just not a very good way to parallelize it. In the GATK4 implementation we've replaced it with Apache Spark support, which is much more stable and better overall.
