NPE in GenotypeAndValidate

pdexheimer Member ✭✭✭✭
edited October 2012 in Ask the GATK team

I'm encountering a NullPointerException when I try to run GenotypeAndValidate:

INFO  08:31:36,498 ArgumentTypeDescriptor - Dynamically determined type of gold.vcf to be VCF 
INFO  08:31:36,542 HelpFormatter - --------------------------------------------------------------------------------- 
INFO  08:31:36,542 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.1-13-g1706365, Compiled 2012/10/12 19:21:06 
INFO  08:31:36,542 HelpFormatter - Copyright (c) 2010 The Broad Institute 
INFO  08:31:36,543 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
INFO  08:31:36,543 HelpFormatter - Program Args: -T GenotypeAndValidate -R human_g1k_v37.fasta -I clean/sample1.bam [...] -I clean/sample22.bam -alleles gold.vcf -L gold.vcf -o test.vcf 
INFO  08:31:36,544 HelpFormatter - Date/Time: 2012/10/23 08:31:36 
INFO  08:31:36,544 HelpFormatter - --------------------------------------------------------------------------------- 
INFO  08:31:36,544 HelpFormatter - --------------------------------------------------------------------------------- 
INFO  08:31:36,554 ArgumentTypeDescriptor - Dynamically determined type of gold.vcf to be VCF 
INFO  08:31:36,558 GenomeAnalysisEngine - Strictness is SILENT 
INFO  08:31:36,675 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
INFO  08:31:37,048 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.37 
INFO  08:31:37,141 RMDTrackBuilder - Loading Tribble index from disk for file gold.vcf 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
java.lang.NullPointerException
    at org.broadinstitute.sting.gatk.walkers.validation.GenotypeAndValidate.map(GenotypeAndValidate.java:455)
    at org.broadinstitute.sting.gatk.walkers.validation.GenotypeAndValidate.map(GenotypeAndValidate.java:193)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:65)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:18)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:62)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:265)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.1-13-g1706365):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Code exception (see stack trace for error itself)
##### ERROR ------------------------------------------------------------------------------------------

I've validated gold.vcf; the only complaints are that I didn't put the reference/contig headers in. Any suggestions?

Comments

  • ebanks Broad Institute Member, Broadie, Dev ✭✭✭✭

    Thanks for the report. I just pushed in a fix that will be available in the next release (which we are aiming to have out next week). For now, you can get around this by making sure all records have the GV annotation in the INFO field (GV=T or GV=F).
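For anyone applying this workaround by hand, here is a minimal sketch of adding the annotation. It assumes a tab-delimited, uncompressed VCF. The function name `add_gv_annotation` and its `default` parameter are illustrative and not part of GATK. Whether a given record should carry GV=T or GV=F depends on your gold-standard data, so the blanket default is only a placeholder. The header is left untouched.

```python
def add_gv_annotation(lines, default="T"):
    """Yield VCF lines, adding GV=<default> to records whose INFO has no GV key.

    Hypothetical helper: `default` is a placeholder value, not a recommendation.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        # Pass header lines and blank lines through unchanged.
        if not stripped or stripped.startswith("#"):
            yield stripped
            continue
        fields = stripped.split("\t")
        # A VCF record needs at least 8 columns; INFO is the 8th.
        if len(fields) < 8:
            yield stripped
            continue
        info = fields[7]
        keys = {entry.split("=", 1)[0] for entry in info.split(";")}
        if "GV" not in keys:
            # An INFO of "." means "no annotations", so replace it outright.
            if info in (".", ""):
                fields[7] = f"GV={default}"
            else:
                fields[7] = f"GV={default};{info}"
        yield "\t".join(fields)
```

A possible use is `with open("gold.vcf") as src, open("gold.gv.vcf", "w") as dst: dst.writelines(l + "\n" for l in add_gv_annotation(src))`, where `gold.gv.vcf` is just an example output name.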

  • pdexheimer Member ✭✭✭✭

    Thanks for the workaround. When I added that field to the VCF, I got a User Error that the "callStatus" INFO key isn't specified in the VCF header. Of course, that key's not in my VCF - but running with the suggested -U switch at least allowed the tool to run.

  • aldo Member
    edited November 2012

    I have been trying the workaround, but even this failed to work for me:

    [[email protected] Lifescope_Exomes_Martin]$ java -Xmx48G -jar /home/corona/GenomeAnalysisTK-2.2-5-g3bf5e3f/GenomeAnalysisTK.jar -T GenotypeAndValidate -I /home/corona/Desktop/../Genomes/recalibrated.realigned.bam -R /home/corona/Desktop/gatk/lifescope.hg19.fa --alleles BC2_GV.recalibrated.vcf -L BC2_GV.recalibrated.vcf -o ExomeCallsInGenome.vcf -l INFO -U ALL
    INFO 13:04:01,482 ArgumentTypeDescriptor - Dynamically determined type of BC2_GV.recalibrated.vcf to be VCF
    INFO 13:04:01,533 HelpFormatter - --------------------------------------------------------------------------------
    INFO 13:04:01,533 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.2-5-g3bf5e3f, Compiled 2012/11/09 14:27:28
    INFO 13:04:01,533 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 13:04:01,534 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 13:04:01,538 HelpFormatter - Program Args: -T GenotypeAndValidate -I /home/corona/Desktop/../Genomes/recalibrated.realigned.bam -R /home/corona/Desktop/gatk/lifescope.hg19.fa --alleles BC2_GV.recalibrated.vcf -L BC2_GV.recalibrated.vcf -o ExomeCallsInGenome.vcf -l INFO -U ALL
    INFO 13:04:01,538 HelpFormatter - Date/Time: 2012/11/27 13:04:01
    INFO 13:04:01,538 HelpFormatter - --------------------------------------------------------------------------------
    INFO 13:04:01,538 HelpFormatter - --------------------------------------------------------------------------------
    INFO 13:04:01,544 ArgumentTypeDescriptor - Dynamically determined type of BC2_GV.recalibrated.vcf to be VCF
    INFO 13:04:01,548 GenomeAnalysisEngine - Strictness is SILENT
    INFO 13:04:01,818 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE Target Coverage: 1000
    INFO 13:04:01,827 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 13:04:01,889 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.06
    INFO 13:04:01,902 RMDTrackBuilder - Loading Tribble index from disk for file BC2_GV.recalibrated.vcf
    INFO 13:04:04,678 GenomeAnalysisEngine - Processing 159544 bp from intervals
    INFO 13:04:04,703 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 13:04:04,703 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
    INFO 13:04:15,244 GATKRunReport - Uploaded run statistics report to AWS S3

    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR stack trace
    java.lang.NullPointerException
    at org.broadinstitute.sting.gatk.walkers.validation.GenotypeAndValidate.map(GenotypeAndValidate.java:430)
    at org.broadinstitute.sting.gatk.walkers.validation.GenotypeAndValidate.map(GenotypeAndValidate.java:193)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:287)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:252)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
    at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A GATK RUNTIME ERROR has occurred (version 2.2-5-g3bf5e3f):
    ##### ERROR
    ##### ERROR Please visit the wiki to see if this is a known problem
    ##### ERROR If not, please post the error, with stack trace, to the GATK forum
    ##### ERROR Visit our website and forum for extensive documentation and answers to
    ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ##### ERROR
    ##### ERROR MESSAGE: Code exception (see stack trace for error itself)
    ##### ERROR ------------------------------------------------------------------------------------------

    Here are some lines from the VCF to show the format (I also added GV to the INFO field specifications):

    ##source_20121027.1=vcf-subset(-1, set by base.pm) -c BC2.bam recalibrated.vcf
    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT BC2.bam
    chr1 14907 . A G 826.89 VQSRTrancheBOTH99.90to100.00 GV=T;ABHet=0.544;ABHom=0.310;AC=2;AF=0.917;AN=2;BaseQRankSum=-20.329;DP=1621;Dels=0.00;FS=18.313;HaplotypeScore=1.0102;InbreedingCoeff=-0.1825;MLEAC=22;MLEAF=0.917;MQ0=257;MQ=11.81;MQRankSum=6.650;OND=0.602;QD=0.78;ReadPosRankSum=1.488;VQSLOD=-3.591e+00;culprit=QD GT:AD:DP:GQ:PL 1/1:47,39:78:9:82,9,0

    Where am I missing the point? Is this solved in the latest GATK version?
    Thanks for any help!!

    regards,
    Aldo
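When the workaround still fails, one thing worth ruling out is that some record lacks a usable GV value or that the header does not declare the key. The sketch below is a hedged check along those lines. It assumes a tab-delimited, uncompressed VCF and treats only `GV=T` and `GV=F` as valid, following the values named earlier in the thread. The function name `check_gv` is hypothetical and not a GATK utility.

```python
import re


def check_gv(lines):
    """Return (header_declares_gv, records_missing_gv) for an iterable of VCF lines.

    records_missing_gv is a list of (chrom, pos) tuples for records whose INFO
    column has no GV=T or GV=F entry. Hypothetical helper, not part of GATK.
    """
    header_declares_gv = False
    missing = []
    for line in lines:
        text = line.rstrip("\n")
        if text.startswith("##"):
            # Look for a meta-information line such as ##INFO=<ID=GV,...>
            if re.match(r"##INFO=<ID=GV[,>]", text):
                header_declares_gv = True
            continue
        if not text or text.startswith("#"):
            continue  # blank line or the #CHROM column header
        fields = text.split("\t")
        if len(fields) < 8:
            continue  # not a well-formed record; skipped in this sketch
        for entry in fields[7].split(";"):
            key, _, value = entry.partition("=")
            if key == "GV" and value in ("T", "F"):
                break
        else:
            # Loop finished without finding a valid GV entry.
            missing.append((fields[0], int(fields[1])))
    return header_declares_gv, missing
```

An empty list with a declared header only shows the annotation is present in the form described above; it does not by itself explain the NullPointerException.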

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    We'll take a look at this bug. Could you please isolate a snippet of your BAM file where the error occurs and upload it to our FTP for testing?

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Hi @aldo, we've finally got around to looking at your bug report (sorry it took so long!), but we seem to have lost your file. Could you please re-upload it to our FTP?
