GATK HaplotypeCaller produces no output

Hi, a colleague was seeing a very long runtime for a GATK HaplotypeCaller job and asked me to look at it. Although it had been running for about 5 days, it had not created a VCF file and showed no progress on stderr. The last line logged after 5 days of runtime was:

INFO  13:00:27,433 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files

I extracted one of the scaffolds from the assembly (FASTA and BAM), created indexes for them, and tested GATK on this minimal dataset. I could replicate the problem there too, i.e. no progress and no output. I've attached the BAM, FASTA, indexes and command line, and was wondering if you could identify why GATK seems to stall before analysing the BAM file. The BAM file is only a few MB, so I'd expect GATK to take only a few minutes to create output, but this is obviously not the case.
Many thanks,
Graham

Issue · GitHub
by Sheila

Issue Number: 2095
State: closed
Closed By: vdauwera

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @grahametherington
    Hi Graham,

    I will check with the team what is going on here and get back to you.

    -Sheila

  • grahametherington, Norwich, UK (Member)

    Hi Geraldine,
    Many thanks for looking into this. We'll apply some cut-offs to the contig lengths (as you point out, there are many short contigs in the assembly which are probably not informative) and then supercontig the remaining longer sequences.
    Thanks,
    Graham
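
For readers who want to try the contig-length cut-off mentioned above, here is a minimal sketch in plain Python. It is not the script used in this thread: the function names and the threshold are hypothetical, and the right minimum length depends on the assembly. Note also that the BAM header must still match the reference GATK is given, so after filtering the FASTA the reads would typically need to be re-mapped, or the BAM restricted to the retained contigs.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str]  # (header line without '>', sequence)


def read_fasta(handle: TextIO) -> Iterator[Record]:
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Record], min_length: int) -> Iterator[Record]:
    """Keep only contigs with at least min_length bases (hypothetical cut-off)."""
    for header, seq in records:
        if len(seq) >= min_length:
            yield header, seq


def write_fasta(records: Iterable[Record], handle: TextIO, width: int = 60) -> None:
    """Write records as FASTA, wrapping sequence lines at `width` characters."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")
```

Usage might look like the following, with the file names and the 1000 bp threshold as placeholders: open `assembly.fasta` for reading and `assembly.min1000.fasta` for writing, then call `write_fasta(filter_by_length(read_fasta(src), 1000), dst)`. The filtered FASTA then needs fresh index and sequence dictionary files before it is used as a reference.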
