
VariantsToBinaryPed java.lang.ArrayIndexOutOfBoundsException: -1

Hello, can you please help me sort out the following error when running VariantsToBinaryPed:

java -jar /sb/project/fkr-592-aa/data/GalWaRat/bin/third/gatk-3.7/GenomeAnalysisTK.jar -T VariantsToBinaryPed -R /sb/project/fkr-592-aa/genomes/CfloGapsClosed6/Cflo_3.3_gaps_closed6.fasta -V /sb/project/fkr-592-aa/Danzqianqi/Cflo/WGS/filteredSNPss.vcf -m sample_phenotypeinfo2.fam --minGenotypeQuality 0 --bed filteredSNPss.bed --bim filteredSNPss.bim --fam filteredSNPss.fam
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/gs/scratch/zqianqi
INFO 19:31:00,898 HelpFormatter - ----------------------------------------------------------------------------------
INFO 19:31:00,901 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.7-0-gcfedb67, Compiled 2016/12/12 11:21:18
INFO 19:31:00,902 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
INFO 19:31:00,902 HelpFormatter - For support and documentation go to https://software.broadinstitute.org/gatk
INFO 19:31:00,902 HelpFormatter - [Tue Sep 05 19:31:00 EDT 2017] Executing on Linux 2.6.32-642.13.1.el6.x86_64 amd64
INFO 19:31:00,902 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_73-b02
INFO 19:31:00,906 HelpFormatter - Program Args: -T VariantsToBinaryPed -R /sb/project/fkr-592-aa/genomes/CfloGapsClosed6/Cflo_3.3_gaps_closed6.fasta -V /sb/project/fkr-592-aa/Danzqianqi/Cflo/WGS/filteredSNPss.vcf -m sample_phenotypeinfo2.fam --minGenotypeQuality 0 --bed filteredSNPss.bed --bim filteredSNPss.bim --fam filteredSNPss.fam
INFO 19:31:00,910 HelpFormatter - Executing as zqianqi@lg-1r17-n03 on Linux 2.6.32-642.13.1.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_73-b02.
INFO 19:31:00,911 HelpFormatter - Date/Time: 2017/09/05 19:31:00
INFO 19:31:00,911 HelpFormatter - ----------------------------------------------------------------------------------
INFO 19:31:00,911 HelpFormatter - ----------------------------------------------------------------------------------
INFO 19:31:00,922 GenomeAnalysisEngine - Strictness is SILENT
INFO 19:31:47,656 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 19:32:39,018 GenomeAnalysisEngine - Preparing for traversal
INFO 19:32:39,044 GenomeAnalysisEngine - Done preparing for traversal
INFO 19:32:39,045 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 19:32:39,045 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 19:32:39,046 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime

ERROR --
ERROR stack trace

java.lang.ArrayIndexOutOfBoundsException: -1
at htsjdk.variant.variantcontext.GenotypeLikelihoods.getGQLog10FromLikelihoods(GenotypeLikelihoods.java:220)
at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.checkGQIsGood(VariantsToBinaryPed.java:442)
at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.getStandardEncoding(VariantsToBinaryPed.java:406)
at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.getEncoding(VariantsToBinaryPed.java:398)
at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.writeIndividualMajor(VariantsToBinaryPed.java:282)
at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.map(VariantsToBinaryPed.java:267)
at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.map(VariantsToBinaryPed.java:103)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:267)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:255)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:144)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:92)
at org.broadinstitute.gatk.engine.traversals.TraverseLociNano.traverse(TraverseLociNano.java:48)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:98)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:316)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.7-0-gcfedb67):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://software.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: -1
ERROR ------------------------------------------------------------------------------------------

My .vcf file was made with HaplotypeCaller/GenotypeGVCFs/SelectVariants/VariantFiltration. I used ValidateVariants as well.

This is a snapshot of the .vcf file:

reference=file:///sb/project/fkr-592-aa/genomes/CfloGapsClosed6/Cflo_3.3_gaps_closed6.fasta

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 1 12 13 15 2 9

1 30 . T C 36.19 PASS AC=1;AF=0.100;AN=10;BaseQRankSum=0.712;ClippingRankSum=0.00;DP=16;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.100;MQ=30.46;MQRankSum=1.98;QD=5.17;ReadPosRankSum=0.303;SOR=0.892 GT:AD:DP:GQ:PGT:PID:PL ./.:0,0:0:.:.:.:0,0,0 0/0:1,0:1:3:.:.:0,3,37 0/0:2,0:2:6:.:.:0,6,74 0/0:4,0:4:9:.:.:0,9,135 0/1:5,2:7:66:0|1:30_T_C:66,0,246 0/0:2,0:2:6:.:.:0,6,49
1 45 . A G 33.97 PASS AC=1;AF=0.100;AN=10;BaseQRankSum=1.09;ClippingRankSum=0.00;DP=23;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.100;MQ=30.65;MQRankSum=2.20;QD=3.77;ReadPosRankSum=0.765;SOR=1.179 GT:AD:DP:GQ:PGT:PID:PL ./.:0,0:0:.:.:.:0,0,0 0/0:1,0:1:3:.:.:0,3,37 0/0:5,0:5:15:.:.:0,15,157 0/0:6,0:6:1:.:.:0,1,155 0/1:7,2:9:63:0|1:30_T_C:63,0,288 0/0:2,0:2:6:.:.:0,6,49
1 53 . C CA 24.57 PASS AC=1;AF=0.083;AN=12;BaseQRankSum=1.09;ClippingRankSum=0.00;DP=24;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.083;MQ=30.65;MQRankSum=2.20;QD=2.73;ReadPosRankSum=0.765;SOR=1.179 GT:AD:DP:GQ:PGT:PID:PL 0/0:1,0:1:3:.:.:0,3,37 0/0:1,0:1:3:.:.:0,3,37 0/0:5,0:5:15:.:.:0,15,157 0/0:6,0:6:1:.:.:0,1,169 0/1:7,2:9:63:0|1:30_T_C:63,0,288 0/0:2,0:2:6:.:.:0,6,49

My .fam file looks like this:
Cflo 1 0 0 0 5047.16
Cflo 12 0 0 0 6249.9
Cflo 13 0 0 0 6007.21
Cflo 15 0 0 0 7123.6
Cflo 2 0 0 0 5581.36
Cflo 9 0 0 0 7462.87

Thank you! Please let me know if you require more information!

Issue · Github
by Sheila

Issue Number: 2476
State: closed
Closed By: chandrans

Answers

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @dsap
    Hi,

    I will check with the team and get back to you soon.

    -Sheila

  • Hi,
    I have exactly the same problem.
    The VCFs were generated with HaplotypeCaller --emitRefConfidence GVCF, merged with GenotypeGVCFs, and filtered with SelectVariants (SNP) and VariantFiltration.
    my metadata.fam is formatted as:
    ID112 ID112 0 0 -9 -9
    ID113 ID113 0 0 -9 -9
    ID114 ID114 0 0 -9 -9
    ID115 ID115 0 0 -9 -9

    java.lang.ArrayIndexOutOfBoundsException: -1
    A GATK RUNTIME ERROR has occurred (version 3.6-0-g89b7209)

    Thanks

  • @dsap
    If you need bed/bim/fam files you can convert your vcf with plink --vcf --make-bed
    Hope this helps

  • dsap · Member

    Thanks @esalvi, I have tried that as well, but plink tells me "missing header line in .vcf file", and I do not know what the cause could be, as it looks like a normal VCF. My problem might be that I am dealing with >24,000 scaffolds from a nonmodel organism rather than a model organism with few chromosomes.
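
A quick check that may help narrow down the PLINK message. This is only a sketch, under the assumption (not confirmed anywhere in this thread) that PLINK is objecting to the header itself. It tests just two things the VCF specification requires: a first line starting with `##fileformat=` and a tab-delimited `#CHROM` line. The function name `vcf_header_problems` is hypothetical.

```python
def vcf_header_problems(lines):
    """Return a list of human-readable problems found in a VCF header."""
    problems = []
    first_line = None
    chrom_line = None
    for line in lines:
        if first_line is None:
            first_line = line
        if not line.startswith("#"):
            break  # header is over; variant records are not inspected
        if line.startswith("#CHROM"):
            chrom_line = line
    if first_line is None or not first_line.startswith("##fileformat="):
        problems.append("first line is not ##fileformat=")
    if chrom_line is None:
        problems.append("no #CHROM header line")
    elif "\t" not in chrom_line:
        problems.append("#CHROM line is not tab-delimited")
    return problems


# A well-formed header and a header that lost its leading '#'.
good = ["##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"]
bad = ["CHROM POS ID REF ALT QUAL FILTER INFO\n"]
print(vcf_header_problems(good))
print(vcf_header_problems(bad))
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, and scanning stops at the first variant record, so a large VCF is not read in full.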

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie

    Does it happen with any record in the VCF or only a subset? Can either of you isolate a record or subset of records that the program chokes on?

  • @Geraldine_VdAuwera I am not sure how to go about that.

    I used the Unix "head" command to isolate the first 25,000 lines, roughly 24,000 of which are the commented header lines with the VCF IDs and scaffold numbers.

    And it did not work anyway.

    The first line of the stack trace after "java.lang.ArrayIndexOutOfBoundsException: -1" is "at htsjdk.variant.variantcontext.GenotypeLikelihoods.getGQLog10FromLikelihoods(GenotypeLikelihoods.java:220)"

    Could it have to do with a missing ID field?
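
For isolating candidate records, here is a minimal sketch in Python. It is not part of GATK, and the function name `find_suspect_genotypes` is hypothetical. It assumes, without confirmation from the GATK team, that the failure in `getGQLog10FromLikelihoods` may involve genotypes like sample 1 in the snapshot above: a no-call `./.` with GQ `.` but a PL of `0,0,0`. The script lists every such genotype in a tab-delimited VCF so that a small subset can be extracted and tested.

```python
def find_suspect_genotypes(vcf_lines):
    """Yield (chrom, pos, sample) for no-call genotypes with PL but no GQ."""
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if len(fields) < 10:
            continue  # no genotype columns on this line
        keys = fields[8].split(":")
        for sample, entry in zip(samples, fields[9:]):
            values = dict(zip(keys, entry.split(":")))
            alleles = values.get("GT", ".").replace("|", "/").split("/")
            no_call = set(alleles) == {"."}
            gq_missing = values.get("GQ", ".") == "."
            pl_present = values.get("PL", ".") != "."
            if no_call and gq_missing and pl_present:
                yield fields[0], fields[1], sample


# First two samples of the record at position 30 from the snapshot above;
# the INFO column is abbreviated to "." to keep the example short.
demo = [
    "##fileformat=VCFv4.2\n",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "1", "12"]) + "\n",
    "\t".join(["1", "30", ".", "T", "C", "36.19", "PASS", ".",
               "GT:AD:DP:GQ:PGT:PID:PL",
               "./.:0,0:0:.:.:.:0,0,0", "0/0:1,0:1:3:.:.:0,3,37"]) + "\n",
]
for chrom, pos, sample in find_suspect_genotypes(demo):
    print(chrom, pos, sample)
```

If the list is non-empty, running VariantsToBinaryPed on a VCF containing only those positions would show whether the assumption holds.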

  • benjaminpelissie · Madison, WI · Member

    Hello,
    I have the same problem. It happens with my full-size VCF (~82G), but also with the ones I down-sampled to 64G, 2.8G and even 263M. I made sure to remove non-variant loci and to validate my VCFs with ValidateVariants. I am using 30 cores on a 32-core server with 128G of shared memory.
    Ben

  • benjaminpelissie · Madison, WI · Member

    Here is the error message I get:

    ERROR --
    ERROR stack trace

    java.lang.ArrayIndexOutOfBoundsException: 1
    at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.parseMetaData(VariantsToBinaryPed.java:512)
    at org.broadinstitute.gatk.tools.walkers.variantutils.VariantsToBinaryPed.initialize(VariantsToBinaryPed.java:164)
    at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
    at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:323)
    at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:123)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:256)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:158)
    at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:108)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.8-0-ge9d806836):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions https://software.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: 1
    ERROR ------------------------------------------------------------------------------------------
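
Note that this trace differs from the one in the original post: the index is 1 rather than -1, and it is thrown from `parseMetaData` during `initialize`, before any variant is processed. One possibility, stated here purely as an assumption, is that a line in the metadata file yields fewer fields than the tool expects (for example a blank line or an unexpected delimiter). A PLINK .fam file has six columns, so a minimal Python sketch for checking that follows; the function name `check_fam_lines` is hypothetical and the script is not part of GATK.

```python
def check_fam_lines(lines, expected_fields=6):
    """Return (line_number, field_count) for lines whose number of
    whitespace-separated fields differs from expected_fields."""
    problems = []
    for number, line in enumerate(lines, start=1):
        count = len(line.split())
        if count != expected_fields:
            problems.append((number, count))
    return problems


# Two well-formed lines in the style posted in this thread, followed by a
# blank line and a line with a missing column (both added for illustration).
demo = [
    "Cflo 1 0 0 0 5047.16\n",
    "ID112 ID112 0 0 -9 -9\n",
    "\n",
    "ID113 0 0 -9 -9\n",
]
for number, count in check_fam_lines(demo):
    print(f"line {number}: {count} fields")
```

An empty result only rules out a wrong column count; it does not check which delimiter the tool itself splits on.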
  • Sheila · Broad Institute · Member, Broadie, Moderator

    @benjaminpelissie @dsap
    Hi,

    Can you submit a bug report? I will need to have a look locally to figure out what is going on. Instructions are here.

    Thanks,
    Sheila

  • @Sheila I have submitted the bug report called "GATK_bugreport_dsap.tar". Thanks!

    Issue · Github
    by Sheila

    Issue Number: 2503
    State: open

  • Sheila · Broad Institute · Member, Broadie, Moderator

    @dsap
    Hi,

    Thanks, I will have a look soon.

    -Sheila
