
Error in UnifiedGenotyper when calling a haploid genome

gilgi (Member)
edited July 2012 in Ask the GATK team

Dear GATK team,

I tried calling SNPs and indels for a haploid genome, according to the instructions:

java -jar GenomeAnalysisTK.jar -R fasta_file.fasta -pnrm POOL -T UnifiedGenotyper -I my_merged_recal_realigned.bam -o snps.raw.vcf --sample_ploidy 1 --genotype_likelihoods_model POOLBOTH

But I am getting the error below.

Am I missing something?

.
.
.
INFO 15:19:40,055 TraversalEngine - chr06:254172 3.72e+06 11.6 m 3.1 m 30.6% 37.7 m 26.2 m
INFO 15:19:44,943 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.ArrayIndexOutOfBoundsException: 0
at org.broadinstitute.sting.gatk.walkers.genotyper.PoolIndelGenotypeLikelihoods.getLikelihoodOfConformation(PoolIndelGenotypeLikelihoods.java:198)
at org.broadinstitute.sting.gatk.walkers.genotyper.PoolGenotypeLikelihoods.calculateACConformationAndUpdateQueue(PoolGenotypeLikelihoods.java:553)
at org.broadinstitute.sting.gatk.walkers.genotyper.PoolGenotypeLikelihoods.computeLikelihoods(PoolGenotypeLikelihoods.java:512)
at org.broadinstitute.sting.gatk.walkers.genotyper.PoolIndelGenotypeLikelihoods.add(PoolIndelGenotypeLikelihoods.java:171)
at org.broadinstitute.sting.gatk.walkers.genotyper.PoolIndelGenotypeLikelihoods.add(PoolIndelGenotypeLikelihoods.java:65)
at org.broadinstitute.sting.gatk.walkers.genotyper.PoolGenotypeLikelihoodsCalculationModel.getLikelihoods(PoolGenotypeLikelihoodsCalculationModel.java:242)
at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoods(UnifiedGenotyperEngine.java:277)
at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyperEngine.calculateLikelihoodsAndGenotypes(UnifiedGenotyperEngine.java:190)
at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:350)
at org.broadinstitute.sting.gatk.walkers.genotyper.UnifiedGenotyper.map(UnifiedGenotyper.java:117)
at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:65)
at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:18)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:62)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:269)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:94)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.0-0-g4c0ffd4):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: 0
ERROR ------------------------------------------------------------------------------------------

Answers

  • delangel (Broad Institute, Member)

    This seems to be a genuine bug - just to help us narrow it down, do you see it when you specify -glm POOLSNP and/or -glm POOLINDEL as well?
    Is it one single sample or multiple samples you are calling simultaneously?

  • gilgi (Member)

    Thank you so much for the quick reply!

    I have a merged BAM of multiple samples, but each sample is an individual haploid strain. I still need to use the "pool" options, right?

    I tried -glm POOLSNP and this works fine.
    -glm POOLINDEL gave the same error as POOLBOTH.

    Please let me know if you want me to check more things on my side.

  • delangel (Broad Institute, Member)

    Oh, I see you're using 2.0-0. If you update to the latest published version (I think we're at 2.0-23), does the problem still happen? We fixed several UnifiedGenotyper-related bugs in the meantime.

  • gilgi (Member)

    I downloaded 2.0-23 and am still getting the error:

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.0-23-ge9a19be):
    ERROR
    ERROR Please visit the wiki to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: 0
    ERROR ------------------------------------------------------------------------------------------

    Besides the error, just to make sure: conceptually, is it OK to use the "POOL" options even though my data isn't pooled but rather individual strains?

  • delangel (Broad Institute, Member)

    Yes - the "POOL" naming is a bit historical since the motivation for its development was to call pools and then later we realized we could use the same modules for generalized ploidy calling. In fact, we'll be simplifying the arguments in a future release.

  • gilgi (Member)

    OK, thanks a lot. I thought so, but wanted to be sure.
    So do you think that currently there is a bug in calling indels for haploid genomes?

  • delangel (Broad Institute, Member)

    Yes, but it's not present all the time. I suspect it's a corner case the code is not handling correctly. At the site where the error happens, do you have coverage in all samples? Does it only happen if you set -ploidy 1 but not a larger value? (Even if larger values are nonsensical in your application, it may help us understand the problem.)

  • delangel (Broad Institute, Member)

    Also, if you specify "-maxAlleles 1" does the problem still happen?

  • Thanks for all the help!
    I checked, and I don't have coverage in all samples at the site where it happens.
    I tried, and it also happens if I set -ploidy 2.
    It also happens when I add -maxAlleles 1.

  • delangel (Broad Institute, Member)
    Accepted Answer

    I put a potential fix into the latest GATK. I'm not sure it will solve your problem, but can you please download the latest version and check? Thanks.

  • Thanks a lot!!! This is working!
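
For anyone hitting the same error, the diagnostic runs suggested in this thread (SNP-only, indel-only, a larger ploidy, and -maxAlleles 1) can be laid out as a dry run. This is only a sketch: it prints the command lines rather than executing them, the flags are copied from the posts above as written for GATK 2.0 and are not checked against other versions, and the `build_cmd` helper and the `calls.<label>.vcf` output names are hypothetical.

```shell
#!/bin/sh
# Dry run: print (do not execute) one UnifiedGenotyper command per
# diagnostic variation discussed in the thread.

# Input names are copied from the original post; adjust to your data.
REF=fasta_file.fasta
BAM=my_merged_recal_realigned.bam

# build_cmd LABEL MODEL PLOIDY [EXTRA ARGS...]
# LABEL only sets the (hypothetical) output file name calls.LABEL.vcf.
build_cmd() {
    label=$1
    model=$2
    ploidy=$3
    shift 3
    extra=""
    if [ "$#" -gt 0 ]; then
        extra=" $*"          # any remaining arguments are appended as-is
    fi
    printf 'java -jar GenomeAnalysisTK.jar -R %s -pnrm POOL -T UnifiedGenotyper -I %s -o calls.%s.vcf --sample_ploidy %s -glm %s%s\n' \
        "$REF" "$BAM" "$label" "$ploidy" "$model" "$extra"
}

build_cmd snp         POOLSNP   1                 # reported to work
build_cmd indel       POOLINDEL 1                 # reported to fail
build_cmd indel_p2    POOLINDEL 2                 # larger ploidy, for comparison
build_cmd indel_max1  POOLINDEL 1 -maxAlleles 1   # restrict alternate alleles
```

Comparing which of the printed commands fail once run by hand narrows the problem to the indel model, as it did here, without changing anything else between runs.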
