ERROR stack trace ; Unable to retrieve result ; A GATK RUNTIME ERROR has occurred

Thanks very much for your answers to my previous questions. I have run into another difficulty when running the VQSR steps: some ERROR messages were printed to the screen. The error output is as follows:

INFO 18:10:01,046 GaussianMixtureModel - Initializing model with 30 k-means iterations...
INFO 18:10:01,165 VariantRecalibratorEngine - Finished iteration 0.
INFO 18:10:01,186 VariantRecalibratorEngine - Finished iteration 5. Current change in mixture coefficients = 0.15059
INFO 18:10:01,196 VariantRecalibratorEngine - Finished iteration 10. Current change in mixture coefficients = 0.06115
INFO 18:10:01,206 VariantRecalibratorEngine - Finished iteration 15. Current change in mixture coefficients = 0.34881
INFO 18:10:01,208 VariantRecalibratorEngine - Convergence after 16 iterations!
INFO 18:10:01,211 VariantDataManager - Found 0 variants overlapping bad sites training tracks.
INFO 18:10:27,971 ProgressMeter - chr1:249230318 4.34e+06 90.0 s 20.0 s 100.0% 90.0 s 0.0 s

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Unable to retrieve result
at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:190)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)
Caused by: java.lang.NullPointerException
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantDataManager.selectWorstVariants(VariantDataManager.java:278)
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:333)
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:132)
at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.notifyTraversalDone(HierarchicalMicroScheduler.java:226)
at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:183)
... 5 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.7-2-g6bda569):
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR MESSAGE: Unable to retrieve result
ERROR ------------------------------------------------------------------------------------------

I think the parameters I set are all correct:

java -jar /ifs1/ST_POP/USER/lantianming/HUM/bin/GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar \
    -R /ifs1/ST_POP/USER/lantianming/HUM/reference_human/chr1.fa \
    --maxGaussians 4 \
    -numBad 4000 \
    -T VariantRecalibrator \
    -mode SNP \
    -input /ifs1/ST_POP/USER/lantianming/HUM/align/bwa/split_1_22_X_Y_M/chr1/chr1.recal_10.vcf \
    -resource:dbsnp,known=true,training=false,truth=false,prior=6.0 /nas/RD_09C/resequencing/soft/pipeline/GATK/bundle/2.5/hg19/dbsnp_137.hg19.vcf \
    -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /nas/RD_09C/resequencing/soft/pipeline/GATK/bundle/2.5/hg19/hapmap_3.3.hg19.vcf \
    -resource:omni,known=false,training=true,truth=false,prior=12.0 /nas/RD_09C/resequencing/soft/pipeline/GATK/bundle/2.5/hg19/1000G_omni2.5.hg19.vcf \
    -an DP -an FS -an HaplotypeScore -an MQ0 -an MQ -an QD \
    -recalFile /ifs1/ST_POP/USER/lantianming/HUM/align/bwa/split_1_22_X_Y_M/chr1/chr1.vcf.snp_11.recal \
    -tranchesFile /ifs1/ST_POP/USER/lantianming/HUM/align/bwa/split_1_22_X_Y_M/chr1/chr1.vcf.snp_11.tranches \
    -rscriptFile /ifs1/ST_POP/USER/lantianming/HUM/align/bwa/split_1_22_X_Y_M/chr1/chr1.vcf.snp_11.plot.R -nt 4 \
    --TStranche 90.0 --TStranche 93.0 --TStranche 95.0 --TStranche 97.0

My input file is chr1 only, the sequencing depth is about 1×, and 4000 SNP sites were called with UnifiedGenotyper.
What I am not sure about is whether that number of SNP sites is enough for VQSR.
Could you please give me some suggestions? Thanks very much!


Best Answer


  • 55816815 TN Member

    I am running VQSR with GATK 3.4 on a whole-genome VCF file and get the same error.

    INFO 11:02:35,585 VariantRecalibratorEngine - Evaluating full set of 4837109 variants...
    INFO 11:02:35,740 VariantDataManager - Training with worst 0 scoring variants --> variants with LOD <= -5.0000.
    INFO 11:02:41,872 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    org.broadinstitute.gatk.utils.exceptions.ReviewedGATKException: Unable to retrieve result
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:190)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:315)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)
    Caused by: java.lang.IllegalArgumentException: No data found.
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:88)
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:408)
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:156)
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.notifyTraversalDone(HierarchicalMicroScheduler.java:226)
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:183)
    ... 5 more

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.4-0-g7e26428):
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk

    Any clue why a whole genome would also have "zero" bad variants?



  • Sheila, Broad Institute (Member, Broadie admin)

    Hi Shuoguo,

    Can you post the exact command you ran? Are you running on indels or SNPs? It is possible to just not have enough overlap between your callset and the known variant datasets, even if you have a whole genome. This is especially true for indels.


  • 55816815 TN Member

    @Sheila I have 3 samples mapped with bwa-mem; two succeeded and this is the only one that failed (I tried three times, so it is not a random failure).
    Strangely, the same 3 samples went through variant calling in exactly the same way, except with bwa-aln for mapping, and VQSR succeeded for all of them.

    The exact command I used:

    >>> Performing snp recalibration
    java -Xms20g -Xmx20g -XX:ParallelGCThreads=4 -Djava.io.tmpdir=/scratch_space \
        -jar GenomeAnalysisTK.jar \
        -T VariantRecalibrator \
        -R GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
        -nt 4 \
        -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.hg38.vcf.gz \
        -resource:omni,known=false,training=true,truth=true,prior=12.0 1000G_omni2.5.hg38.vcf.gz \
        -resource:1000G,known=false,training=true,truth=false,prior=10.0 1000G_phase1.snps.high_confidence.hg38.vcf.gz \
        -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 Homo_sapiens_assembly38.variantEvalGoldStandard.vcf.gz \
        -an QD -an DP -an FS -an SOR -an MQ \
        -an MQRankSum -an ReadPosRankSum \
        -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 \
        -mode SNP -input 040771_G1.mdup.vcf.gz \
        -recalFile 040771_G1.mdup.snp.recal \
        -tranchesFile 040771_G1.mdup.snp.tranches \
        -rscriptFile 040771_G1.mdup.snp.plots.R \
        -log 040771_G1.mdup.snp.recal.log 


  • Sheila, Broad Institute (Member, Broadie admin)

    Okay. Can you try running without -nt 4? Users have reported random issues with multi-threading. Also, why are you running on each sample by itself? If you are trying to analyze the three samples together, it is best to perform joint variant calling and genotyping: https://www.broadinstitute.org/gatk/documentation/article?id=4150


  • 55816815 TN Member

    @Sheila Thanks. I will remove -nt. Yes, I can try joint calling as well.

  • 55816815 TN Member

    @Sheila Removing "-nt 4" resolved the issue! Great!
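
For anyone who still sees a "No data found" failure after dropping -nt, Sheila's earlier point about limited overlap between the callset and the training resources can be checked directly. The following is a minimal Python sketch and not part of GATK. The helper names (`read_sites`, `open_vcf`, `overlap_report`) are hypothetical, and matching on chromosome and position alone is a simplifying assumption: it ignores alleles and filters, so it may not reproduce exactly how VariantRecalibrator selects its training variants.

```python
import gzip
from typing import Dict, Iterable, Set, Tuple

Site = Tuple[str, int]  # (chromosome, 1-based position)


def read_sites(lines: Iterable[str]) -> Set[Site]:
    """Collect (chromosome, position) pairs from VCF text lines."""
    sites: Set[Site] = set()
    for line in lines:
        # Skip blank lines and all header lines ("##..." and "#CHROM ...").
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t", 2)[:2]
        sites.add((chrom, int(pos)))
    return sites


def open_vcf(path: str):
    """Open a plain-text or gzip-compressed VCF file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def overlap_report(callset_lines: Iterable[str],
                   resource_lines: Iterable[str]) -> Dict[str, int]:
    """Count sites in the callset, in the resource, and shared by both."""
    calls = read_sites(callset_lines)
    resource = read_sites(resource_lines)
    return {
        "callset_sites": len(calls),
        "resource_sites": len(resource),
        "shared_sites": len(calls & resource),
    }


if __name__ == "__main__":
    # Tiny made-up records for illustration only; not real data.
    demo_calls = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT",
        "chr1\t100\t.\tA\tG",
        "chr1\t200\t.\tC\tT",
    ]
    demo_resource = [
        "#CHROM\tPOS\tID\tREF\tALT",
        "chr1\t200\trs1\tC\tT",
    ]
    print(overlap_report(demo_calls, demo_resource))
```

With real files, each input can be passed as an open handle, for example `with open_vcf("040771_G1.mdup.vcf.gz") as calls, open_vcf("hapmap_3.3.hg38.vcf.gz") as res: print(overlap_report(calls, res))`. Both files are read fully into memory as sets of sites, so this is only practical as a quick sanity check; a very small `shared_sites` count suggests the model has little training data to work with.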
