'Cannot trim a Haplotype without containing GenomeLoc'

annafrangou (Posts: 1; Member)
edited September 2013 in Ask the GATK team

Hi,

I'm trying to call known human variant sites in the Neanderthal (BAM file) using HaplotypeCaller. I've altered the original VCFs (http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase1/analysis_results/integrated_call_sets/) to the format shown below. The header is present, but I have reduced the number of columns to 8 to match the rest of the file. The ##INFO lines remain the same as in the original file; I've removed them here. I have changed the ALT column to list every base that is not in the REF column, so that any alternative base can be called. I have also given QUAL, FILTER, and INFO dummy values (all periods), as I believe these columns are required but their original contents are no longer correct once I change the ALT column.

##fileformat=VCFv4.1
##INFO
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
22      16050408        rs149201999     T       A,C,G   .       .       .
22      16050612        rs146752890     C       A,T,G   .       .       .
22      16051249        rs62224609      T       A,C,G   .       .       .
22      16051347        rs62224610      G       C,T,A   .       .       . 
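For readers who want to reproduce this kind of edit, the ALT rewrite described above could be sketched roughly as follows. This is only an illustration, not the script actually used for the file in this post: the function names `rewrite_snp_line` and `rewrite_vcf` are hypothetical, the input is assumed to be tab-separated, non-SNP records are simply skipped, and the ALT bases come out in alphabetical order rather than in the order shown above.

```python
BASES = "ACGT"


def rewrite_snp_line(line):
    """Return a sites-only record whose ALT lists every base other than REF.

    Returns None for records that are not single-base sites (assumption:
    such records are dropped in this sketch).
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref = fields[:4]
    if len(ref) != 1 or ref.upper() not in BASES:
        return None
    alt = ",".join(b for b in BASES if b != ref.upper())
    # QUAL, FILTER and INFO are replaced by "." placeholders.
    return "\t".join([chrom, pos, var_id, ref, alt, ".", ".", "."])


def rewrite_vcf(src, dst):
    """Copy a VCF from src to dst, keeping 8 columns and rewriting ALT."""
    for line in src:
        if line.startswith("#CHROM"):
            # Trim the column header to the 8 fixed columns.
            dst.write("\t".join(line.rstrip("\n").split("\t")[:8]) + "\n")
        elif line.startswith("#"):
            dst.write(line)  # meta-information lines pass through unchanged
        else:
            record = rewrite_snp_line(line)
            if record is not None:
                dst.write(record + "\n")
```

Both arguments of `rewrite_vcf` are open text file objects, so it can be used with `open(...)` on real files or with `io.StringIO` for a quick check.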

My command line is the following:

java -jar /path/to/GenomeAnalysisTK-2.7-2-g6bda569/GenomeAnalysisTK.jar -T HaplotypeCaller -nct 30 -R /path/to/human_g1k_v37.fasta --genotyping_mode GENOTYPE_GIVEN_ALLELES --alleles /path/to/cut.dummy.chr22.vcf -I /path/to/AltaiNea.hg19_1000g.22.dq.bam -o /path/to/haplotypecaller_forcedcalls_chr22.vcf

but I get the following error (I can't find anything referring to it, so I've posted!):

INFO  19:04:19,514 ProgressMeter -     22:17778431        2.83e+09   35.3 m        0.0 s     91.8%        38.5 m     3.2 m 
INFO  19:04:21,267 GATKRunReport - Uploaded run statistics report to AWS S3 
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
java.lang.IllegalStateException: Cannot trim a Haplotype without containing GenomeLoc
        at org.broadinstitute.sting.utils.haplotype.Haplotype.trim(Haplotype.java:114)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.trimActiveRegion(HaplotypeCaller.java:983)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.assembleReads(HaplotypeCaller.java:875)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:750)
        at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:140)
        at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
        at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.7-2-g6bda569):
##### ERROR
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to 
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: Cannot trim a Haplotype without containing GenomeLoc
##### ERROR ----------------------------------------------------------------------

Any suggestions welcome!

Thanks,

Anna

Comments

  • Geraldine_VdAuwera (Posts: 5,877; Administrator, GATK Developer)

    Hi Anna,

    Have you validated your vcf file to make sure that your edits didn't mess up the format somehow?

    Geraldine Van der Auwera, PhD
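A quick structural check of a hand-edited sites file can catch the most common editing slips before running the caller. The sketch below is a minimal illustration only and is not a substitute for a real validator such as GATK's ValidateVariants walker; the function name `check_sites_vcf` and the particular set of checks (8 tab-separated columns, integer POS, REF not repeated in ALT, positions in order within a contig, no trailing whitespace) are assumptions chosen for this example.

```python
def check_sites_vcf(lines):
    """Return a list of (line_number, message) problems in a sites-only VCF."""
    problems = []
    seen_header = False
    last = None  # (chrom, pos) of the previous record
    for n, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("##") or not line:
            continue
        if line.startswith("#CHROM"):
            seen_header = True
            continue
        if line != line.rstrip():
            problems.append((n, "trailing whitespace"))
        fields = line.split("\t")
        if len(fields) != 8:
            problems.append(
                (n, f"expected 8 tab-separated columns, found {len(fields)}")
            )
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        if not pos.isdigit():
            problems.append((n, "POS is not an integer"))
            continue
        alts = alt.split(",")
        if ref in alts:
            problems.append((n, "REF allele also listed in ALT"))
        if len(set(alts)) != len(alts):
            problems.append((n, "duplicate ALT allele"))
        if last is not None and last[0] == chrom and int(pos) < last[1]:
            problems.append((n, "records not sorted by position"))
        last = (chrom, int(pos))
    if not seen_header:
        problems.append((0, "missing #CHROM header line"))
    return problems
```

Calling `check_sites_vcf(open("cut.dummy.chr22.vcf"))` (file name taken from the command line in the question) returns an empty list when none of these particular problems is present; an empty list does not prove the file is fully spec-compliant.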

  • oli (Posts: 1; Member)
    edited October 2013

    Hi Geraldine,

    I observe the same behaviour when trying to genotype against a VCF file that I generated using HaplotypeCaller.

    Executing

    java -Xmx4g -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R hs37d5.fasta --alleles blah.vcf --genotyping_mode GENOTYPE_GIVEN_ALLELES -I blah.bam -L 22 -o blah_22.vcf

    yields,

    INFO 14:27:23,145 ProgressMeter - 22:18020857 0.00e+00 22.5 m 2232.9 w 35.1% 64.1 m 41.6 m

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.IllegalStateException: Cannot trim a Haplotype without containing GenomeLoc
            at org.broadinstitute.sting.utils.haplotype.Haplotype.trim(Haplotype.java:114)
            at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.trimActiveRegion(HaplotypeCaller.java:983)
            at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.assembleReads(HaplotypeCaller.java:875)
            at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:750)
            at org.broadinstitute.sting.gatk.walkers.haplotypecaller.HaplotypeCaller.map(HaplotypeCaller.java:140)
            at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:708)
            at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions$TraverseActiveRegionMap.apply(TraverseActiveRegions.java:704)
            at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
            at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
            at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:273)
            at org.broadinstitute.sting.gatk.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
            at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
            at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
            at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
            at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
            at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
            at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.7-2-g6bda569):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Cannot trim a Haplotype without containing GenomeLoc
    ERROR ------------------------------------------------------------------------------------------

    This seems to arise when the HaplotypeCaller is instructed to genotype a large deletion (> 100 bases). Is there a hard threshold on the maximum deletion size accepted by the HaplotypeCaller? If so, could this please be patched so that the HaplotypeCaller can genotype its own output?

    Thanks, Oliver
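To see whether an alleles file contains deletions of the size described above, the records can be scanned for ALT alleles that are much shorter than REF. The following is a rough sketch under stated assumptions: the names `deletion_lengths` and `large_deletions` are hypothetical, the default cutoff of 100 bases comes from the observation in this comment rather than from any documented HaplotypeCaller limit, and symbolic alleles such as `<DEL>` are ignored.

```python
def deletion_lengths(ref, alt):
    """Bases removed by each ALT allele relative to REF (0 if not a deletion)."""
    return [
        max(len(ref) - len(allele), 0)
        for allele in alt.split(",")
        if not allele.startswith("<")  # skip symbolic alleles
    ]


def large_deletions(lines, min_len=100):
    """Return (chrom, pos, deleted_bases) for records deleting more than min_len bases."""
    hits = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a complete record; ignored in this sketch
        chrom, pos, _id, ref, alt = fields[:5]
        longest = max(deletion_lengths(ref, alt), default=0)
        if longest > min_len:
            hits.append((chrom, int(pos), longest))
    return hits
```

Running `large_deletions(open("blah.vcf"))` on the alleles file from the command above lists the candidate sites, which can then be compared with the position reported by the ProgressMeter just before the crash.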

  • Geraldine_VdAuwera (Posts: 5,877; Administrator, GATK Developer)

    Hi Oliver,

    This might be an issue with the maximum size of the active region that HC considers. Have a look at the Tech Doc here: http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_haplotypecaller_HaplotypeCaller.html

    Try increasing the --activeRegionMaxSize and see if that helps.

    Geraldine Van der Auwera, PhD
