HaplotypeCaller java.lang.IllegalArgumentException: Unexpected base in allele bases

bwubb Member ✭✭
edited October 2018 in Ask the GATK team

Greetings,

I am receiving an error message very similar to https://github.com/broadinstitute/gatk/issues/4525 when attempting to run GATK 4.0.10.1 HaplotypeCaller in GGA mode. My intent was simple: to take the het alleles from one sample and genotype them in another.

gatk HaplotypeCaller -R ~/resources/Genomes/Human/GRCh37/human_g1k_v37.fasta -L data/work/TB5050/S0760415/gatk/haplotype_caller.het_sites.vcf.gz -I bam_input/final/TB5050-T1/GRCh37/TB5050-T1.ready.bam -O data/work/TB5050-T1/S0760415/gatk/germline_het_sites.vcf.gz --genotyping-mode GENOTYPE_GIVEN_ALLELES --alleles data/work/TB5050/S0760415/gatk/haplotype_caller.het_sites.vcf.gz

...

11:18:47.862 INFO  HaplotypeCaller - Shutting down engine
[October 25, 2018 11:18:47 AM EDT] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.18 minutes.
Runtime.totalMemory()=2076049408
java.lang.IllegalArgumentException: Unexpected base in allele bases 'GGCAGGCGGAGGTTGCGGTGAGCCAGGATCGCGCCACTGCACTCCAGCCGGGGCAAAAAGAGCAAAACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAGC*GGGGGCGGTTTCAGGGATAAAAGTGGGGAATCCTCGGAGCTTTTCCAGCCGGCCCTCCCGGTCGCCCTTTGCAGTGCTTGGCGCCCCTGTGCCGGCCTTC'
        at htsjdk.variant.variantcontext.Allele.<init>(Allele.java:165)
        at org.broadinstitute.hellbender.utils.haplotype.Haplotype.<init>(Haplotype.java:40)
        at org.broadinstitute.hellbender.utils.haplotype.Haplotype.<init>(Haplotype.java:49)
        at org.broadinstitute.hellbender.utils.haplotype.Haplotype.insertAllele(Haplotype.java:209)
        at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.composeGivenHaplotypes(ReadThreadingAssembler.java:180)
        at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.readthreading.ReadThreadingAssembler.runLocalAssembly(ReadThreadingAssembler.java:116)
        at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.AssemblyBasedCallerUtils.assembleReads(AssemblyBasedCallerUtils.java:259)
        at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(HaplotypeCallerEngine.java:538)
        at org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller.apply(HaplotypeCaller.java:240)
        at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(AssemblyRegionWalker.java:291)
        at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse(AssemblyRegionWalker.java:267)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:966)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)

I was uncertain whether this is still a bug. It's quite possible my VCF is not proper for GGA, although it validates just fine. I think it breaks at the first * seen in ALT. Is this an event that cannot be genotyped in GGA? I do not know what the command would be to remove those; I can't seem to get SelectVariants or vcftools to work to that effect (time to awk, I guess).

The only thing I have left to do is upgrade to 4.0.11.0 which I just saw existed this morning. Any comments/advice would be greatly appreciated. Thank you.

-bwubb

Answers

  • bwubb Member ✭✭

    @bwubb said:
    ...I think it breaks at the first * seen in ALT. Is this an event that cannot be genotyped in GGA? I do not know what the command would be to remove those; I can't seem to get SelectVariants or vcftools to work to that effect (time to awk, I guess).
    ...

    Ugh, I posted the solution in my question. I mean, that got me further, but then I hit:

    java.lang.IllegalStateException: Allele in genotype CCA* not in the variant context [A*, *, C]
    

    Which is preposterous; I don't see CCA* or anything like it anymore. All the multiallelic sites have been split and all * removed.

  • bwubb Member ✭✭

    Stepping further back, I believe I was able to produce an allele file that works. It is still unclear what misstep I took in going from a jointly-called multisample VCF to a single-sample VCF of het SNPs/indels. Perhaps I didn't use the -TYPE snps and indels flags when I thought I did? I'm going to mark this as answered, and perhaps it will be a reference for others.

  • bwubb Member ✭✭
    Accepted Answer

    All instances of '*' should be excluded from your --alleles vcf file.
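
For anyone landing here with the same error, one way to do that filtering could look like the sketch below. It is not a GATK tool, and the helper names (`open_text`, `has_star_allele`, `drop_star_records`) are made up for illustration. It assumes a standard tab-delimited VCF with ALT in the fifth column, and it drops the whole record whenever `*` appears among the ALT alleles, which is a simplification if your multiallelic sites have not been split.

```python
import gzip
import sys


def open_text(path, mode="rt"):
    """Open plain or gzip/bgzip-compressed text based on the file extension."""
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


def has_star_allele(vcf_line):
    """Return True if the ALT column (5th field) of a VCF record lists '*'."""
    fields = vcf_line.rstrip("\n").split("\t")
    return len(fields) > 4 and "*" in fields[4].split(",")


def drop_star_records(lines):
    """Yield header lines and every record whose ALT column has no '*'."""
    for line in lines:
        if line.startswith("#") or not has_star_allele(line):
            yield line


# Hypothetical usage: python drop_star.py input.vcf.gz output.vcf
if __name__ == "__main__" and len(sys.argv) == 3:
    with open_text(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        dst.writelines(drop_star_records(src))
```

The output is written as uncompressed text, so it would presumably still need to be indexed (and compressed, if you want a .vcf.gz) before being passed to --alleles and -L.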
