Variant Recalibration generates NullPointerException at VariantRecalibratorEngine.generateModel

jitendrasbhati (India; Posts: 24; Member)

I am following the GATK Best Practices workflow on Linux. I am using chr21.fa from NCBI as the reference; the .vcf resource files are taken from the resource bundle, and dbsnp.vcf is from NCBI. I am working on the VariantRecalibrator step, which gives me a NullPointerException. I am attaching the console error output, which also contains the command, and raw_variants.vcf (in txt format), which was generated in the previous step by HaplotypeCaller. Please let me know what the issue could be.

Attachments: variantrecalibratorerror.txt (11K), raw_variants-Txt.txt (236K)

Answers

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Hi Jitendra,

    The error is occurring because you are running on an extremely small set of variant calls. The variant recalibrator cannot build a proper model with so few variants. I will see if we can provide better error handling and a more informative error message in the future, but in any case VQSR will not work on such a small dataset. I assume you are doing this for testing purposes, but please be aware that it is simply not possible to do small-scale testing of VQSR.

    Geraldine Van der Auwera, PhD

  • shazly (Egypt; Posts: 1; Member)

    Hi Geraldine,

    I'm getting the same error although my variant calls file is 3.2M. Is this considered a small file too, or do you have any suggestions as to why I'm getting this error in my case?

    This is the command I ran:

    java -Xmx4g -jar GenomeAnalysisTK.jar -T VariantRecalibrator -R ~/bwa_references/human_g1k_v37.fasta -input ~/simplex/ngs_test/snp_calling/aln.haplotyper.raw.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=8.0 ~/bundle/current/dbsnp_137.b37.vcf -resource:hapmap,VCF,known=false,training=true,truth=true,prior=15.0 ~/bundle/current/hapmap_3.3.b37.vcf -resource:omni,VCF,known=false,training=true,truth=false,prior=12.0 ~/bundle/current/1000G_omni2.5.b37.vcf -an QD -an MQRankSum -an ReadPosRankSum -recalFile ~/ngs_test/snp_recal/aln.snp.recal -tranche 100 -tranche 99.9 -tranche 99.0 -tranche 90 -tranchesFile ~/ngs_test/snp_recal/aln.snp.tranches -rscriptFile ~/ngs_test/snp_recal/aln.snp.plots.R -mode SNP

    Thanks,

    Shazly

  • Geraldine_VdAuwera (Posts: 5,276; Administrator, GSA Member)

    Hi @shazly,

    That's right: based on the file size, it's likely that you don't have enough variants for recalibration. See the solutions proposed in the documentation for dealing with this problem.

    Geraldine Van der Auwera, PhD
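
Since both answers come down to how many variant calls the input VCF contains, a quick count before running VariantRecalibrator can help. The following is only a minimal sketch and not part of GATK: the function names `count_records` and `count_records_in_file` are made up for illustration, the "SNP" test is a simplification (single-base REF and single-base A/C/G/T ALT alleles), and the thread gives no minimum number of variants, so the script reports counts without applying any threshold.

```python
import gzip
import sys


def count_records(lines):
    """Return (total_records, snp_records) for an iterable of VCF lines.

    Header lines (starting with '#') and blank lines are skipped.
    A record is counted as a SNP here only if REF is a single base and
    every ALT allele is a single A/C/G/T base (a simplifying assumption).
    """
    total = 0
    snps = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a well-formed VCF data line; ignore it
        total += 1
        ref = fields[3].upper()
        alts = fields[4].upper().split(",")
        if len(ref) == 1 and all(alt in ("A", "C", "G", "T") for alt in alts):
            snps += 1
    return total, snps


def count_records_in_file(vcf_path):
    """Count records in a plain-text or gzip-compressed VCF file."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return count_records(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    total, snps = count_records_in_file(sys.argv[1])
    print(f"{sys.argv[1]}: {total} variant records, {snps} SNP records")
```

Saved under a hypothetical name such as count_vcf.py, it could be run as `python count_vcf.py raw_variants.vcf`. A very low count, as with a single-chromosome test set, points to the situation described in the answers above.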