GATK HaplotypeCaller produces no output

grahametherington (Norich, UK; Member)

Hi, a colleague was experiencing a very long run time for a GATK HaplotypeCaller run and asked me to look at it. I noticed that although it had been running for about 5 days, it hadn't created a VCF file and showed no progress on stderr. The last line after 5 days of run time was:

INFO  13:00:27,433 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files

I extracted one of the scaffolds from the assembly (FASTA and BAM), created indexes for them, and tested GATK on this minimal dataset. I could replicate the problem: no progress and no output. I've attached the BAM, FASTA, indexes and command line, and was wondering if you could identify why GATK seems to stall before analysing the BAM file. The BAM file is only a few MB, so I'd expect GATK to take only a few minutes to create output, but this is obviously not the case.
Many thanks,

  • Sheila (Broad Institute; Member, Broadie ✭✭✭✭✭)

    Hi Graham,

    I will check with the team to find out what is going on here and get back to you.


  • grahametherington (Norich, UK; Member)

    Hi Geraldine,
    Many thanks for looking into this. We'll apply a minimum-length cut-off to the contigs (as you point out, the assembly contains many short contigs which are probably not informative) and then join the remaining longer sequences into supercontigs.
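
For anyone wanting to try the same contig length cut-off, here is a minimal sketch in plain Python of how it might be done. The function names and the 1000-base threshold are illustrative assumptions, not anything prescribed by GATK or used in this thread, and the BAM would still need to be regenerated or subset to match the filtered reference.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_length):
    """Keep only records whose sequence has at least min_length bases."""
    return [(h, s) for h, s in records if len(s) >= min_length]


def write_fasta(records, handle, width=60):
    """Write records as FASTA, wrapping sequence lines at `width` bases."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


if __name__ == "__main__":
    # Tiny in-memory example; a real run would open the assembly FASTA instead.
    MIN_LENGTH = 1000  # hypothetical cut-off, choose to suit the assembly
    demo = io.StringIO(">short\nACGT\n>long\n" + "ACGT" * 300 + "\n")
    kept = filter_by_length(read_fasta(demo), MIN_LENGTH)
    out = io.StringIO()
    write_fasta(kept, out)
    print(f"kept {len(kept)} contig(s): {[h for h, _ in kept]}")
```

The filtered FASTA would then need a fresh index and sequence dictionary before being passed to GATK.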

