Error in force calling variants using HaplotypeCaller

chetanya (New York City; Member)

Hi,

I would like to force call a list of variants across my cohort using HaplotypeCaller to get more accurate QC metrics for each variant. I am using the following command:

java -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R ucsc.hg19.fasta -et NO_ET -K my.key -I my.cohort.list --alleles my.vcf -L my.vcf -out_mode EMIT_ALL_SITES -gt_mode GENOTYPE_GIVEN_ALLELES -stand_call_conf 30.0 -stand_emit_conf 0.0 -dt NONE -o final_my.vcf

Here is a link to the input VCF file: VCF File

Unfortunately, I keep running into the following error (I've tried GATK ver3.3 and ver3.5):

INFO 18:49:21,288 ProgressMeter - chr1:11177077 21138.0 49.5 m 39.0 h 69.4% 71.3 m 21.8 m
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at htsjdk.variant.variantcontext.VariantContext.getAlternateAllele(VariantContext.java:845)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:248)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:1059)
at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:221)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:709)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:705)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:274)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:319)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 3.3.0-mssm-0-gaa95802):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Index: 3, Size: 3
##### ERROR ------------------------------------------------------------------------------------------

Would appreciate your help in solving this issue.
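
Editor's note: the last ProgressMeter line before the stack trace reports chr1:11177077, so one way to narrow the search is to list the records in the --alleles VCF that sit close to that locus. The following is a minimal sketch only, not part of GATK. The function names and the 1000 bp window are hypothetical choices, and it assumes an uncompressed, tab-delimited VCF. The record that triggers the exception may lie somewhat beyond the reported position.

```python
import re


def last_progress_locus(log_text):
    """Return (contig, position) from the last ProgressMeter line, or None."""
    locus = None
    for line in log_text.splitlines():
        # Matches e.g. "ProgressMeter - chr1:11177077 21138.0 ..."
        m = re.search(r"ProgressMeter\s+-\s+(\S+):(\d+)", line)
        if m:
            locus = (m.group(1), int(m.group(2)))
    return locus


def records_near(vcf_lines, contig, pos, window=1000):
    """Yield (chrom, pos, ref, alt) for records within `window` bp of `pos`.

    `window` is an arbitrary illustrative default, not a GATK parameter.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        chrom, record_pos = fields[0], int(fields[1])
        if chrom == contig and abs(record_pos - pos) <= window:
            yield chrom, record_pos, fields[3], fields[4]


# Example use (file names are placeholders):
#   locus = last_progress_locus(open("gatk.log").read())
#   with open("my.vcf") as fh:
#       for rec in records_near(fh, *locus):
#           print(rec)
```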

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @chetanya
    Hi,

    Can you try two things?

    1) Try validating your input VCF with ValidateVariants.

    2) Try running with only -L your.vcf. Don't include --alleles my.vcf -out_mode EMIT_ALL_SITES -gt_mode GENOTYPE_GIVEN_ALLELES
    I think GGA mode might not be working properly in HaplotypeCaller.

    -Sheila

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    @chetanya Are you running with a version compiled internally? I say that because of the line

    ##### ERROR A GATK RUNTIME ERROR has occurred (version 3.3.0-mssm-0-gaa95802)

    That's not a version hash generated from our codebase.

    If so please try running with the precompiled version we provide to verify whether the problem is specific to your version or not.

  • chetanya (New York City; Member)

    @Sheila said:
    Can you try two things?
    1) Try validating your input VCF with ValidateVariants.

    Thanks for the prompt reply. ValidateVariants did not find any non-obvious issues. The only thing it complains about is:
    WARN 14:47:24,690 ValidateVariants - ***** one or more of the ALT allele(s) for the record at position chr19:45854919 are not observed at all in the sample genotypes *****

    As this is a multi-sample VCF, this is expected.

    @Sheila said:
    2) Try running with only -L your.vcf. Don't include --alleles my.vcf -out_mode EMIT_ALL_SITES -gt_mode GENOTYPE_GIVEN_ALLELES
    I think GGA mode might not be working properly in HaplotypeCaller.

    Okay, I'm trying this right now.
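
Editor's note: the exception is raised inside VariantContext.getAlternateAllele, and the ValidateVariants warning concerns ALT alleles that no sample carries, so it may be worth listing the sites in the input VCF that have several ALT alleles. The thread does not confirm that such sites cause the crash; that is an assumption. The sketch below is a hypothetical helper, not a GATK tool, and it assumes an uncompressed, tab-delimited VCF.

```python
def multiallelic_records(vcf_lines, min_alts=2):
    """Yield (chrom, pos, ref, alts) for records with at least `min_alts` ALT alleles.

    `min_alts` is an illustrative threshold chosen for this sketch.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete VCF record
        # ALT is a comma-separated list; "." means no alternate allele.
        alts = [a for a in fields[4].split(",") if a != "."]
        if len(alts) >= min_alts:
            yield fields[0], int(fields[1]), fields[3], alts


# Example use (file name is a placeholder):
#   with open("my.vcf") as fh:
#       for chrom, pos, ref, alts in multiallelic_records(fh):
#           print(chrom, pos, ref, ",".join(alts))
```

Sites reported this way could then be cut into a small test VCF and run on their own, which would be a quicker check than a whole-cohort run.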

  • chetanya (New York City; Member)

    @Geraldine_VdAuwera said:
    @chetanya Are you running with a version compiled internally? I say that because of the line

    ##### ERROR A GATK RUNTIME ERROR has occurred (version 3.3.0-mssm-0-gaa95802)

    That's not a version hash generated from our codebase.

    If so please try running with the precompiled version we provide to verify whether the problem is specific to your version or not.

    We have ver3.3 compiled locally. I also tried a downloaded copy of ver3.5 and ran into the same error:

    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR stack trace
    java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
    at java.util.ArrayList.rangeCheck(ArrayList.java:635)
    at java.util.ArrayList.get(ArrayList.java:411)
    at htsjdk.variant.variantcontext.VariantContext.getAlternateAllele(VariantContext.java:886)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCallerGenotypingEngine.assignGenotypeLikelihoods(HaplotypeCallerGenotypingEngine.java:252)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:924)
    at org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:228)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:709)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:705)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
    at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:274)
    at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
    at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:315)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A GATK RUNTIME ERROR has occurred (version 3.5-0-g36282e4):
    ##### ERROR
    ##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ##### ERROR Visit our website and forum for extensive documentation and answers to
    ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ##### ERROR
    ##### ERROR MESSAGE: Index: 3, Size: 3
    ##### ERROR ------------------------------------------------------------------------------------------

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    I see -- any chance you're running on Java 8?

  • chetanya (New York City; Member)

    @Geraldine_VdAuwera said:
    I see -- any chance you're running on Java 8?

    No, Java 7. Will that be an issue? I believe the documentation mentioned any JDK version >1.6, correct?

  • chetanya (New York City; Member)

    BTW, your documentation server seems to be down. I am unable to access any GATK docs :-(

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Docs are working now -- possibly a temporary server error.

    Java 7 is perfect -- the only one we support at the moment, though we're preparing to migrate to 8. I was asking because we've seen some errors due to changes in how some data structures work in Java 8, which is not yet supported.

    Consider also validating your input BAM files with Picard ValidateSamFile, btw.
