VariantAnnotator Exceptions

ehsueh (Saskatoon, Member)

Hi GATK developers,

I would like to run VQSR on my set of variants; however, they were called with samtools mpileup and bcftools. The VQSR error message suggested that I run VariantAnnotator on my VCF to get a compatible set of annotations for training. When I ran VariantAnnotator on this VCF (generated with samtools mpileup and bcftools), I got the errors below. I understand that some annotations cannot be computed because I didn't call my variants with one of the callers from the GATK Best Practices, but I don't know why I am getting a fatal exception.

My command:
java -jar bin/GenomeAnalysisTK.jar -R reference.fasta -T VariantAnnotator -o output.vcf -L input.vcf -V input.vcf --useAllAnnotations -XA SnpEff -nt 40 -dt NONE

Running VariantAnnotator from GATK 3.6.0 on this VCF exited with the following error after successfully processing 4986 sites:

...
WARN 16:31:32,568 SpanningDeletions - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotator
WARN 16:31:32,575 SpanningDeletions - Annotation will not be calculated, must be called from UnifiedGenotyper, not org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotator
INFO 16:32:11,626 ProgressMeter - chr1A:294745 36.0 85.0 s 3.9 w 0.0% 17.2 w 17.2 w

ERROR --
ERROR stack trace

java.lang.NullPointerException
at org.broadinstitute.gatk.tools.walkers.annotator.HeterozygosityUtils.doGenotypeCalculations(HeterozygosityUtils.java:198)
at org.broadinstitute.gatk.tools.walkers.annotator.HeterozygosityUtils.getHetCount(HeterozygosityUtils.java:223)
at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.calculateIC(AS_InbreedingCoeff.java:158)
at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.makeCoeffAnnotation(AS_InbreedingCoeff.java:147)
at org.broadinstitute.gatk.tools.walkers.annotator.AS_InbreedingCoeff.annotate(AS_InbreedingCoeff.java:139)
at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:221)
at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:203)
at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotator.map(VariantAnnotator.java:357)
at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotator.map(VariantAnnotator.java:114)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.6-0-g89b7209):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------

I also tried running the annotator on the same vcf again using the latest version (GATK 3.7). This time, I got an empty output file with a different error message:

...
WARN 16:14:10,006 AnnotationUtils - DP annotation will not be calculated, must be called from HaplotypeCaller or MuTect2, not VariantAnnotator
WARN 16:14:10,007 OxoGReadCounts - Annotation will not be calculated, can only be called from MuTect2, not org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotator
WARN 16:14:10,008 AnnotationUtils - SAC annotation will not be calculated, must be called from HaplotypeCaller or MuTect2, not VariantAnnotator
WARN 16:14:10,008 AnnotationUtils - SB annotation will not be calculated, must be called from HaplotypeCaller or MuTect2, not VariantAnnotator

ERROR --
ERROR stack trace

java.lang.IllegalStateException: ClusteredReadPosition: walker is not MuTect2
at org.broadinstitute.gatk.tools.walkers.cancer.ClusteredReadPosition.annotate(ClusteredReadPosition.java:134)
at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:230)
at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotatorEngine.annotateContext(VariantAnnotatorEngine.java:212)
at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotator.map(VariantAnnotator.java:355)
at org.broadinstitute.gatk.tools.walkers.annotator.VariantAnnotator.map(VariantAnnotator.java:112)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
at org.broadinstitute.gatk.engine.executive.ShardTraverser.call(ShardTraverser.java:98)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://software.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: ClusteredReadPosition: walker is not MuTect2
ERROR ------------------------------------------------------------------------------------------

What might be the problem, and how can I fix it? I want to make my VCF work with VQSR because I'm really interested in seeing how its machine-learning approach performs on my data. If possible, I would like to avoid redoing the time-consuming variant calling step.

Thank you very much for your help. :)

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)
    Simple answer: choose specific annotations that are appropriate, don't try to request all annotations. While that had some chance of working in the past, in recent years we added more complexity (because now we have multiple callers) and there are a few combinations that cause crashes. This will be addressed in the future GATK4 by a more sane annotation management system, but in 3.x versions it's too complex to fix easily.
  • ehsueh (Saskatoon, Member)

    Thanks Geraldine.

    I tried specifying just a few annotations (and eventually just one annotation, -A ExcessHet) and I still couldn't get it to run. I got a different error about the system's open file handle limit (below), even though the ulimit on my box is set to unlimited.

    ERROR MESSAGE: An error occurred because there were too many files open concurrently; your system's open file handle limit is probably too small. See the unix ulimit command to adjust this limit or ask your system administrator for help.

    I remembered reading some discussion on the GATK forums about VQSR not functioning properly with multi-threading options (e.g. being unable to retrieve output errors). So I tried the same command without the -nt option and it worked! Yay! :)

    I'm not sure whether the threading issue is independent of the original problem. With GATK 3.7, removing -nt didn't fix the original run (it still crashed with the MuTect2 walker error). With the older GATK 3.6, however, removing -nt while keeping the rest of the command the same (i.e. still running with --useAllAnnotations) avoided the NullPointerException altogether, and the run completed successfully despite logging some warnings that specific annotations couldn't be calculated! :)

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @ehsueh
    Hi,

    Are you saying that you got VQSR from version 3.6 to work without multi-threading with your original VCF, but with 3.7 (without multi-threading), it throws the ERROR MESSAGE: ClusteredReadPosition: walker is not MuTect2 error?

    Thanks,
    Sheila

  • ehsueh (Saskatoon, Member)

    Hi Sheila. Yes.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)
    That annotation can't be invoked by any tool other than MuTect2. It's possible that this worked previously due to an oversight, but it shouldn't have.
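
For readers landing here, the advice in this thread (request annotations by name with -A instead of --useAllAnnotations, and drop -nt) can be sketched as a small shell helper. This is only an illustration, not an official recipe: the function name gatk_annotate_cmd is hypothetical, the jar and reference paths are copied from the original post, and it prints the command rather than running it so you can inspect it first. Which annotations are appropriate for VQSR on a samtools/bcftools callset is not settled in the thread; ExcessHet is used only because it is the one that was tried above.

```shell
# Hypothetical helper: print a GATK 3.x VariantAnnotator command that requests
# annotations one by one with -A (no --useAllAnnotations) and omits -nt.
# Usage: gatk_annotate_cmd INPUT_VCF OUTPUT_VCF ANNOTATION [ANNOTATION...]
# Note: paths containing spaces are not handled in this sketch.
gatk_annotate_cmd() {
  in_vcf=$1
  out_vcf=$2
  shift 2
  # Base command mirrors the one in the original post, minus -nt,
  # --useAllAnnotations and -XA.
  cmd="java -jar bin/GenomeAnalysisTK.jar -R reference.fasta -T VariantAnnotator -V $in_vcf -L $in_vcf -o $out_vcf -dt NONE"
  # Append one -A flag per requested annotation.
  for ann in "$@"; do
    cmd="$cmd -A $ann"
  done
  printf '%s\n' "$cmd"
}

# Show the limit on open file descriptors. A bare `ulimit` (no option)
# reports the file-size limit instead, so "unlimited" there says nothing
# about open files.
ulimit -n

# Print the command for the single annotation tried in this thread.
gatk_annotate_cmd input.vcf output.vcf ExcessHet
```

Once the printed command looks right, it can be pasted into the shell to run it. Building the command from an explicit list of names keeps the request limited to annotations you have chosen deliberately, which is the point of the first answer above.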