ArrayIndexOutOfBoundsException in VariantsToBinaryPed

bredeson Member ✭✭
edited September 2014 in Ask the GATK team

Hey GATK Team,

I've encountered a GATK runtime error that the message says might be the result of a bug, but I tracked it down to a file suffix issue. I tried GATK v3.2-2 and GATK v2.7-2, and the "problem" seems common to both. When my input metaData file is suffixed with .meta I get the error below, but when it ends in .fam the run completes successfully. My guess is that the tool doesn't check that the input file ends in .fam?

INFO 11:55:47,955 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:55:47,957 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.7-2-g6bda569, Compiled 2013/08/28 16:30:29
INFO 11:55:47,957 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 11:55:47,957 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 11:55:47,960 HelpFormatter - Program Args: -T VariantsToBinaryPed -R v5_0.chr+cpDNA.fa -V v5.0.combined.biSNP.vcf --bed v5.0.combined.biSNP.bed --bim v5.0.combined.biSNP.bim --fam v5.0.combined.biSNP.fam --minGenotypeQuality 30 --metaData ./v5.0.combined.biSNP.meta
INFO 11:55:47,961 HelpFormatter - Date/Time: 2014/09/27 11:55:47
INFO 11:55:47,961 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:55:47,961 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:55:47,966 ArgumentTypeDescriptor - Dynamically determined type of v5.0.combined.biSNP.vcf to be VCF
INFO 11:55:48,520 GenomeAnalysisEngine - Strictness is SILENT
INFO 11:55:49,019 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 11:55:49,038 RMDTrackBuilder - Loading Tribble index from disk for file v5.0.combined.biSNP.vcf
INFO 11:55:49,686 GenomeAnalysisEngine - Preparing for traversal
INFO 11:55:49,716 GenomeAnalysisEngine - Done preparing for traversal
INFO 11:55:49,716 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 11:55:49,716 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 11:55:50,525 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.ArrayIndexOutOfBoundsException: 1
at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToBinaryPed.parseMetaData(VariantsToBinaryPed.java:483)
at org.broadinstitute.sting.gatk.walkers.variantutils.VariantsToBinaryPed.initialize(VariantsToBinaryPed.java:141)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.7-2-g6bda569):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: 1
ERROR ------------------------------------------------------------------------------------------

INFO 11:56:03,420 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:56:03,422 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.2-2-gec30cee, Compiled 2014/07/17 15:22:03
INFO 11:56:03,422 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 11:56:03,422 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 11:56:03,425 HelpFormatter - Program Args: -T VariantsToBinaryPed -R v5_0.chr+cpDNA.fa -V v5.0.combined.biSNP.vcf --bed v5.0.combined.biSNP.bed --bim v5.0.combined.biSNP.bim --fam v5.0.combined.biSNP.fam --minGenotypeQuality 30 --metaData ./v5.0.combined.biSNP.meta
INFO 11:56:03,429 HelpFormatter - Executing as XXXXXXXX on Linux 2.6.32-431.20.3.el6.nersc.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_51-b13.
INFO 11:56:03,429 HelpFormatter - Date/Time: 2014/09/27 11:56:03
INFO 11:56:03,429 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:56:03,430 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:56:03,917 GenomeAnalysisEngine - Strictness is SILENT
INFO 11:56:04,427 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 11:56:05,106 GenomeAnalysisEngine - Preparing for traversal
INFO 11:56:05,156 GenomeAnalysisEngine - Done preparing for traversal
INFO 11:56:05,157 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 11:56:05,157 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 11:56:05,158 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
INFO 11:56:05,969 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.ArrayIndexOutOfBoundsException: 1
at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.parseMetaData(VariantsToBinaryPed.java:489)
at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.initialize(VariantsToBinaryPed.java:141)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:314)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.2-2-gec30cee):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: 1
ERROR ------------------------------------------------------------------------------------------

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    I believe the parser tries to guess how it's supposed to parse the file contents depending on the suffix, and when the suffix is something it doesn't know (".meta" is not standard) it's not guaranteed to do the right thing. We'll see if we can put in a check, but this is really low priority.
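To make the suffix-based guess described above concrete, here is a minimal Python sketch. It is not GATK's code: `guess_format` and `naive_key_value_parse` are hypothetical names, and the rule "`.fam` means fam format, anything else means tab-separated key=value pairs" is an assumption based on this thread. The sample row is made up. It only illustrates how a fam-style line handed to an unvalidated key=value parser can fail with an index error rather than a readable message.

```python
def guess_format(path: str) -> str:
    """Pick a parser from the file suffix alone (assumed behaviour, not GATK's code)."""
    return "fam" if path.endswith(".fam") else "key_value"


def naive_key_value_parse(line: str) -> dict:
    """Parse 'sample<TAB>k1=v1;k2=v2' with no validation at all."""
    fields = line.rstrip("\n").split("\t")
    pairs = fields[1].split(";")  # IndexError if the line contains no tab
    meta = dict(pair.split("=", 1) for pair in pairs)
    meta["sample"] = fields[0]
    return meta


# A space-delimited fam-style row (made-up values) stored under a .meta name.
fam_line = "FAM1 S1 0 0 1 -9"
print(guess_format("v5.0.combined.biSNP.meta"))  # the suffix selects the key=value parser
try:
    naive_key_value_parse(fam_line)
except IndexError as err:
    print("IndexError:", err)
```

Under these assumptions, renaming the file to end in `.fam` changes which parser is chosen, which matches the behaviour reported in the original post.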

  • bredeson Member ✭✭

    Hey Geraldine,

    It's not a problem. Just thought a more meaningful exception message would be helpful for others, should they encounter the same issue. An ArrayIndexOutOfBoundsException is a little misleading.
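As a rough illustration of the kind of message that would be more helpful, the following Python sketch checks a metadata file before it is passed to the tool. It is a hypothetical pre-flight check, not part of GATK; the function names and the wording of the messages are assumptions, and it only covers the tab-separated key=value layout.

```python
def check_key_value_line(line: str, lineno: int) -> None:
    """Raise ValueError with a readable message if a line is not key=value metadata."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        raise ValueError(
            f"line {lineno}: expected 2 tab-separated columns, found {len(fields)}; "
            "is this a fam-format file without a .fam suffix?"
        )
    for pair in fields[1].split(";"):
        if "=" not in pair:
            raise ValueError(f"line {lineno}: '{pair}' is not a key=value pair")


def check_metadata(lines) -> None:
    """Validate every non-blank line, reporting the first problem by line number."""
    for lineno, line in enumerate(lines, start=1):
        if line.strip():  # skip blank lines
            check_key_value_line(line, lineno)


try:
    check_metadata(["FAM1 S1 0 0 1 -9\n"])  # made-up fam-style row
except ValueError as err:
    print(err)
```

Running a check like this first turns a bare index error into a message that names the offending line and hints at the suffix mismatch.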

  • flow Member
    edited September 2017

    Hello,

    I thought I'd resurrect this thread as I am having the same issue with VariantsToBinaryPed using version 3.7-0.

    My command line:

    java -Xmx40g -jar "${GATK_JAR}" -T VariantsToBinaryPed -R "${ref}" -V "${vcf}" -m "${meta}" --outputMode SNP_MAJOR --minGenotypeQuality 0 -bed "${bed}" -bim "${bim}" -fam "${fam}"
    

    My meta data file has a .txt extension and is in the format:

    <sample id><tab>fid=<family id>;sex=other;phenotype=-9
    

    The error message is:

    ##### ERROR --
    ##### ERROR stack trace 
    java.lang.ArrayIndexOutOfBoundsException: -1
            at htsjdk.variant.variantcontext.GenotypeLikelihoods.getGQLog10FromLikelihoods(GenotypeLikelihoods.java:220)
            at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.checkGQIsGood(VariantsToBinaryPed.java:442)
            at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.getStandardEncoding(VariantsToBinaryPed.java:406)
            at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.getEncoding(VariantsToBinaryPed.java:398)
            at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.writeSNPMajor(VariantsToBinaryPed.java:315)
            at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.map(VariantsToBinaryPed.java:269)
            at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.map(VariantsToBinaryPed.java:103)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
            at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
            at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
            at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
            at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:98)
            at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:316)
            at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
            at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
            at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
            at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
    ##### ERROR
    ##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ##### ERROR Visit our website and forum for extensive documentation and answers to 
    ##### ERROR commonly asked questions https://software.broadinstitute.org/gatk
    ##### ERROR
    ##### ERROR MESSAGE: -1
    ##### ERROR ------------------------------------------------------------------------------------------
    

    Thanks in advance for your assistance.
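For readers unsure what the key=value metadata layout quoted in this post encodes, here is a small Python sketch that turns one such line into a six-column PLINK-style fam row (family ID, individual ID, father, mother, sex, phenotype). This is an illustration only and not what GATK itself writes: the function name, the `dad` and `mom` keys, the defaults, and the mapping of unrecognised sex values such as `other` to `0` are all assumptions.

```python
def key_value_to_fam_row(line: str) -> str:
    """Convert 'sample<TAB>fid=...;sex=...;phenotype=...' to a fam-style row."""
    sample, raw = line.rstrip("\n").split("\t")
    meta = dict(pair.split("=", 1) for pair in raw.split(";"))
    # Assumed sex coding: 1 = male, 2 = female, anything else = 0 (unknown).
    sex_codes = {"male": "1", "female": "2", "1": "1", "2": "2"}
    sex = sex_codes.get(meta.get("sex", ""), "0")
    return " ".join([
        meta.get("fid", sample),      # family ID, defaulting to the sample ID
        sample,                       # individual ID
        meta.get("dad", "0"),         # father ID, 0 if unknown
        meta.get("mom", "0"),         # mother ID, 0 if unknown
        sex,
        meta.get("phenotype", "-9"),  # -9 for missing phenotype
    ])


print(key_value_to_fam_row("S1\tfid=F1;sex=other;phenotype=-9"))
# F1 S1 0 0 0 -9
```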

  • flow Member

    I would also like to add that I am not working with human data, in case this tool is only compatible with the human reference.

  • flow Member
    edited September 2017

    After posting I found this thread, which seems likely to be due to the same problem. Sorry for missing it originally.

    https://gatkforums.broadinstitute.org/gatk/discussion/10294/variantstobinaryped-java-lang-arrayindexoutofboundsexception-1#latest

    I also ran ValidateVariants and the VCF validated.
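A VCF can pass validation and still contain genotypes that a downstream tool handles badly. The stack trace above fails in a genotype-quality check with index -1. One untested hypothesis, not confirmed anywhere in this thread, is that the affected records are no-call genotypes that carry PL values but no GQ. The Python sketch below scans plain-text VCF lines for that pattern so the hypothesis can be checked against real data. The function name is hypothetical, and the script reads only the standard GT, GQ and PL FORMAT keys.

```python
def suspicious_genotypes(vcf_lines):
    """Yield (chrom, pos, sample_index) for no-call genotypes with PL but no GQ."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no FORMAT/sample columns on this record
        keys = cols[8].split(":")
        for i, sample in enumerate(cols[9:]):
            values = dict(zip(keys, sample.split(":")))
            alleles = values.get("GT", ".").replace("|", "/").split("/")
            no_call = all(a == "." for a in alleles)
            has_gq = values.get("GQ", ".") != "."
            has_pl = values.get("PL", ".") != "."
            if no_call and has_pl and not has_gq:
                yield cols[0], int(cols[1]), i


# Made-up records for illustration only.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:GQ:PL\t0/1:40:40,0,60\t./.:.:0,0,0\n",
]
print(list(suspicious_genotypes(example)))
```

If the scan finds nothing, the hypothesis does not apply to that file and the linked thread remains the place to follow up.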

  • Sheila Broad Institute Member, Broadie, Moderator

    @flow
    Hi,

    Yes, please keep an eye on that thread. It looks like someone else submitted a bug report. I will have a look soon and report back in that thread.

    Thanks,
    Sheila