ERROR stack trace ; Unable to retrieve result ; A GATK RUNTIME ERROR has occurred

tianming (China) · Posts: 4 · Member

Hi,
Thanks very much for answering my previous questions. I have run into another problem at the VQSR step: the following error output appeared on screen:

INFO 18:10:01,046 GaussianMixtureModel - Initializing model with 30 k-means iterations...
INFO 18:10:01,165 VariantRecalibratorEngine - Finished iteration 0.
INFO 18:10:01,186 VariantRecalibratorEngine - Finished iteration 5. Current change in mixture coefficients = 0.15059
INFO 18:10:01,196 VariantRecalibratorEngine - Finished iteration 10. Current change in mixture coefficients = 0.06115
INFO 18:10:01,206 VariantRecalibratorEngine - Finished iteration 15. Current change in mixture coefficients = 0.34881
INFO 18:10:01,208 VariantRecalibratorEngine - Convergence after 16 iterations!
INFO 18:10:01,211 VariantDataManager - Found 0 variants overlapping bad sites training tracks.
INFO 18:10:27,971 ProgressMeter - chr1:249230318 4.34e+06 90.0 s 20.0 s 100.0% 90.0 s 0.0 s

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Unable to retrieve result
at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:190)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)
Caused by: java.lang.NullPointerException
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantDataManager.selectWorstVariants(VariantDataManager.java:278)
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:333)
at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:132)
at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.notifyTraversalDone(HierarchicalMicroScheduler.java:226)
at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:183)
... 5 more

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.7-2-g6bda569):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Unable to retrieve result
ERROR ------------------------------------------------------------------------------------------

I think the parameters I set are all correct:

java -jar /ifs1/ST_POP/USER/lantianming/HUM/bin/GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar
-R /ifs1/ST_POP/USER/lantianming/HUM/reference_human/chr1.fa
--maxGaussians 4
-numBad 4000
-T VariantRecalibrator
-mode SNP
-input /ifs1/ST_POP/USER/lantianming/HUM/align/bwa/split_1_22_X_Y_M/chr1/chr1.recal_10.vcf
-resource:dbsnp,known=true,training=false,truth=false,prior=6.0 /nas/RD_09C/resequencing/soft/pipeline/GATK/bundle/2.5/hg19/dbsnp_137.hg19.vcf
-resource:hapmap,known=false,training=true,truth=true,prior=15.0 /nas/RD_09C/resequencing/soft/pipeline/GATK/bundle/2.5/hg19/hapmap_3.3.hg19.vcf
-resource:omni,known=false,training=true,truth=false,prior=12.0 /nas/RD_09C/resequencing/soft/pipeline/GATK/bundle/2.5/hg19/1000G_omni2.5.hg19.vcf
-an DP -an FS -an HaplotypeScore -an MQ0 -an MQ -an QD
-recalFile /ifs1/ST_POP/USER/lantianming/HUM/align/bwa/split_1_22_X_Y_M/chr1/chr1.vcf.snp_11.recal
-tranchesFile /ifs1/ST_POP/USER/lantianming/HUM/align/bwa/split_1_22_X_Y_M/chr1/chr1.vcf.snp_11.tranches
-rscriptFile /ifs1/ST_POP/USER/lantianming/HUM/align/bwa/split_1_22_X_Y_M/chr1/chr1.vcf.snp_11.plot.R -nt 4
--TStranche 90.0 --TStranche 93.0 --TStranche 95.0 --TStranche 97.0

My input is chr1 only, the sequencing depth is about 1×, and 4000 SNP sites were called with UnifiedGenotyper.
What I am not sure about is whether that number of SNP sites is enough for VQSR.
Could you please give me some suggestions? Thanks very much!
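
One rough way to put a number on that question is to count how many call-set sites coincide with sites in a training resource such as the HapMap VCF. The sketch below is Python and is not part of GATK; the function names (`open_vcf`, `iter_sites`, `training_overlap`) are made up for illustration. It matches on chromosome and position only and ignores alleles and filters, so treat the result as an approximation rather than what VariantRecalibrator computes internally.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def iter_sites(path):
    """Yield (chrom, pos) for every data line, skipping '#' header lines."""
    with open_vcf(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos = line.split("\t", 2)[:2]
            yield chrom, int(pos)


def training_overlap(callset_path, resource_path):
    """Return (distinct call-set sites, how many of them are in the resource)."""
    calls = set(iter_sites(callset_path))
    # Stream the resource so that only the call set is held in memory.
    shared = {site for site in iter_sites(resource_path) if site in calls}
    return len(calls), len(shared)


# Example with file names from this thread (adjust the paths to your own setup):
# total, shared = training_overlap("chr1.recal_10.vcf", "hapmap_3.3.hg19.vcf")
# print(f"{shared} of {total} call-set sites are in the training resource")
```

A very small shared count suggests the model has little to train on, whatever the total number of called sites is.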

Answers

  • 55816815 (TN) · Posts: 22 · Member

    I am running VQSR with GATK 3.4 on a whole-genome VCF file and get the same error.

    INFO 11:02:35,585 VariantRecalibratorEngine - Evaluating full set of 4837109 variants...
    INFO 11:02:35,740 VariantDataManager - Training with worst 0 scoring variants --> variants with LOD <= -5.0000.
    INFO 11:02:41,872 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    org.broadinstitute.gatk.utils.exceptions.ReviewedGATKException: Unable to retrieve result
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:190)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:315)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)
    Caused by: java.lang.IllegalArgumentException: No data found.
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibratorEngine.generateModel(VariantRecalibratorEngine.java:88)
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:408)
    at org.broadinstitute.gatk.tools.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:156)
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.notifyTraversalDone(HierarchicalMicroScheduler.java:226)
    at org.broadinstitute.gatk.engine.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:183)
    ... 5 more

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.4-0-g7e26428):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR

    Any clue why a whole genome would also have "zero" bad variants?

    thanks,

    Shuoguo

  • Sheila (Broad Institute) · Posts: 3,735 · Member, Broadie, Moderator, Dev admin

    @55816815
    Hi Shuoguo,

    Can you post the exact command you ran? Are you running on indels or SNPs? It is possible to just not have enough overlap between your callset and the known variant datasets, even if you have a whole genome. This is especially true for indels.

    -Sheila

  • 55816815 (TN) · Posts: 22 · Member

    @Sheila I have three samples mapped with bwa-mem; two succeeded and this is the only one that failed (I tried three times, so it is not a random failure).
    Strangely, the same three samples were variant-called in exactly the same way, except with bwa-aln for mapping, and VQSR succeeded for all of them.

    The exact command I used:

    >>> Performing snp recalibration
    java -Xms20g -Xmx20g -XX:ParallelGCThreads=4 -Djava.io.tmpdir=/scratch_space \
        -jar GenomeAnalysisTK.jar \
        -T VariantRecalibrator \
        -R GCA_000001405.15_GRCh38_no_alt_analysis_set.fna \
        -nt 4 \
        -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.hg38.vcf.gz \
        -resource:omni,known=false,training=true,truth=true,prior=12.0 1000G_omni2.5.hg38.vcf.gz \
        -resource:1000G,known=false,training=true,truth=false,prior=10.0 1000G_phase1.snps.high_confidence.hg38.vcf.gz \
        -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 Homo_sapiens_assembly38.variantEvalGoldStandard.vcf.gz \
        -an QD -an DP -an FS -an SOR -an MQ \
        -an MQRankSum -an ReadPosRankSum \
        -tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 \
        -mode SNP -input 040771_G1.mdup.vcf.gz \
        -recalFile 040771_G1.mdup.snp.recal \
        -tranchesFile 040771_G1.mdup.snp.tranches \
        -rscriptFile 040771_G1.mdup.snp.plots.R \
        -log 040771_G1.mdup.snp.recal.log 
    

    thanks!

  • Sheila (Broad Institute) · Posts: 3,735 · Member, Broadie, Moderator, Dev admin

    @55816815
    Hi,

    Okay. Can you try running without -nt 4? Users have reported random issues with multi-threading. Also, why are you running on each sample by itself? If you are trying to analyze the three samples together, it is best to perform joint variant calling and genotyping. https://www.broadinstitute.org/gatk/documentation/article?id=4150

    -Sheila

  • 55816815 (TN) · Posts: 22 · Member

    @Sheila Thanks, I will remove -nt. Yes, I can also try joint calling.

  • 55816815 (TN) · Posts: 22 · Member

    @Sheila Removing "-nt 4" resolved the issue! Great!
