VariantRecalibrator: ERROR stack trace

cge Member Posts: 3
edited December 2012 in Ask the GATK team

Hi,
I'm getting the error below when running VariantRecalibrator on data from 3 samples (I'm testing).
Could the problem be due to the small sample size?

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
java.lang.NullPointerException
        at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantDataManager.selectWorstVariants(VariantDataManager.java:179)
        at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:306)
        at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:107)
        at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:97)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:94)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.2-16-g9f648cb):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Code exception (see stack trace for error itself)
##### ERROR ------------------------------------------------------------------------------------------

Thanks

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,371 admin

    That is possible, although the program should tell you explicitly that the dataset is too small, rather than fail like this. Can you tell me what your command line is and what your dataset is like?

    Geraldine Van der Auwera, PhD

  • ispiras Posts: 3

    Thanks for your answer.
    My command is:

    java -Xmx24g -jar ${GATK} \
    -T VariantRecalibrator \
    -R $REF \
    -input test_3_samples.mark_dup.indel.rc.bam.raw.vcf \
    -resource:hapmap,known=false,training=true,truth=true,prior=15.0 $ref_dir/hapmap_3.3.b37.sites.vcf \
    -resource:omni,known=false,training=true,truth=false,prior=12.0 $ref_dir/1000G_omni2.5.b37.sites.vcf \
    -resource:dbsnp,known=true,training=false,truth=false,prior=6.0 $ref_dir/dbSNP137.vcf \
    -an QD -an HaplotypeScore -an MQRankSum -an ReadPosRankSum -an FS -an MQ \
    --maxGaussians 4 \
    --percentBadVariants 0.05 \
    --minNumBadVariants 1000 \
    -mode SNP \
    -recalFile $final_dir/output.recal \
    -tranchesFile $final_dir/output.tranches \
    -rscriptFile $final_dir/output.plots.R

    Also, just before the error message that I posted yesterday, there is this warning:

    [......]
    INFO 11:59:16,223 VariantRecalibratorEngine - Evaluating full set of 1072 variants...
    INFO 11:59:16,223 VariantDataManager - Found 0 variants overlapping bad sites training tracks.
    WARN 11:59:16,224 VariantDataManager - WARNING: Training with very few variant sites! Please check the model reporting PDF to ensure the quality of the model is reliable.
    INFO 11:59:17,158 GATKRunReport - Uploaded run statistics report to AWS S3
    [.......]

    Thanks
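For anyone hitting the same warning, a quick sanity check is to count how many records the input VCF actually holds before attempting recalibration. The following is a minimal sketch in plain shell; the `count_variants` helper and the `VCF` variable are names made up for illustration, and it assumes an uncompressed VCF in which every non-header line is one record.

```shell
# Hypothetical helper (not part of GATK): count the records in an uncompressed VCF.
count_variants() {
    # -v inverts the match so '#' header lines are skipped; -c prints the line count.
    # grep exits non-zero when the count is 0, so '|| true' keeps 'set -e' scripts alive.
    grep -vc '^#' "$1" || true
}

# Placeholder default taken from the command in this thread; override with VCF=...
VCF="${VCF:-test_3_samples.mark_dup.indel.rc.bam.raw.vcf}"
if [ -f "$VCF" ]; then
    echo "$VCF: $(count_variants "$VCF") variant records"
fi
```

In the log above, VariantRecalibrator reports a full set of only 1072 variants, which is the same kind of number this check would surface up front.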

  • ispiras Posts: 3

    The data are from targeted resequencing (1 gene, approximately 300 kb).

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,371 admin

    Ah, yes that makes sense. That is not enough data for variant recalibration. You should use hard filtering; please see the Best Practices documentation for our recommendations for small datasets.

    Geraldine Van der Auwera, PhD
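As a rough illustration of what hard filtering could look like here, the sketch below assembles a VariantFiltration command for SNPs and prints it for review rather than running it. The thresholds are the generic starting points suggested in GATK documentation from that period and should be treated as assumptions to tune, not as values given in this thread. The `build_hard_filter_cmd` helper, the filter name `snp_hard_filter`, and all file names are hypothetical.

```shell
# Assumed starting-point SNP thresholds; adjust them for your own data.
SNP_FILTER='QD < 2.0 || FS > 60.0 || MQ < 40.0 || HaplotypeScore > 13.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0'

# Hypothetical helper: print (do not run) a GATK 2.x/3.x-style VariantFiltration command.
# $1 = GATK jar, $2 = reference FASTA, $3 = input VCF, $4 = output VCF
build_hard_filter_cmd() {
    printf 'java -jar %s -T VariantFiltration -R %s -V %s --filterExpression "%s" --filterName "%s" -o %s\n' \
        "$1" "$2" "$3" "$SNP_FILTER" "snp_hard_filter" "$4"
}

# Placeholder paths; the input name is the VCF from the command earlier in the thread.
build_hard_filter_cmd "${GATK:-GenomeAnalysisTK.jar}" "${REF:-reference.fasta}" \
    test_3_samples.mark_dup.indel.rc.bam.raw.vcf \
    test_3_samples.hard_filtered.vcf
```

Printing the command first makes it easy to check the expression before committing to a run. Sites that fail the expression are flagged in the FILTER column of the output VCF rather than removed.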

  • ispiras Posts: 3

    Thank you very much!