Variant seen in a genome browser doesn't appear in the VCF file when using HaplotypeCaller

Hi GATK team,

I ran HaplotypeCaller on a BAM file that I pre-processed according to your Best Practices.
When looking at this BAM file in a genome browser, I found a position that seems to have a variant, but the variant doesn't appear in the VCF file.

This is the line from the VCF file (the variant I'm talking about is at position 75680995):
3 75680992 . C . . END=75680995 GT:DP:GQ:MIN_DP:PL 0/0:155:0:147:0,0,0

Attached is a picture from the genome browser. At the same position (75680995) there is a variant with the allele 'A', while the reference is 'C'. The coverage at this position is 195, and 171 of those reads carry the 'A' allele.
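
To make the mismatch concrete, here is a minimal Python sketch that checks two things using only the numbers quoted in this post: whether position 75680995 falls inside the reference block reported in the VCF line, and what fraction of the reads seen in the browser carry 'A'. The variable names are illustrative. The quoted line has fewer columns than a complete VCF record, so the sketch looks up END= by its prefix instead of by column index.

```python
# The record exactly as quoted in the post (split on whitespace here).
record = "3 75680992 . C . . END=75680995 GT:DP:GQ:MIN_DP:PL 0/0:155:0:147:0,0,0"
tokens = record.split()

chrom, block_start = tokens[0], int(tokens[1])

# Locate the END= entry by prefix, because the quoted line does not have
# the full set of VCF columns and fixed indices would be unreliable.
end_token = next(t for t in tokens if t.startswith("END="))
block_end = int(end_token.split("=", 1)[1])

# Position of interest from the genome browser.
position = 75680995
in_block = block_start <= position <= block_end

# Read counts reported from the genome browser at that position.
total_reads, alt_reads = 195, 171
alt_fraction = alt_reads / total_reads

print(f"{chrom}:{position} inside block {block_start}-{block_end}: {in_block}")
print(f"fraction of browser reads carrying 'A': {alt_fraction:.3f}")
```

So the position is the last base of a block genotyped 0/0 with GQ 0 and PL 0,0,0, even though most of the reads in the browser show 'A'.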

Why does this happen?

Thank you,

Best Answers


  • Sheila, Broad Institute (Member, Broadie, Moderator)


    Hi Maya,

    This is probably because HaplotypeCaller locally reassembles and realigns the reads in active regions. Realignment improves accuracy because it is easier to align reads within a small region than against the whole genome.

    You can try using the -bamout argument to see what the realigned reads look like. I think you will find that the variation you see in the original BAM file is no longer present after realignment.
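
As a rough sketch of what such a run could look like, the Python snippet below builds (but does not execute) a HaplotypeCaller command line with -bamout, restricted to a small window around the position in question. It assumes GATK4-style invocation through the `gatk` launcher; the thread does not say which GATK version was used, and in GATK 3 the equivalent would be `java -jar GenomeAnalysisTK.jar -T HaplotypeCaller` with `-o` for the output. All file names and the 500 bp padding are hypothetical placeholders.

```python
import shlex


def bamout_command(reference, input_bam, output_vcf, bamout, interval):
    """Build (but do not run) a HaplotypeCaller command that writes a bamout file."""
    return [
        "gatk", "HaplotypeCaller",   # assumes the GATK4 launcher is on PATH
        "-R", reference,             # reference FASTA
        "-I", input_bam,             # pre-processed input BAM
        "-O", output_vcf,            # calls for the restricted interval
        "-L", interval,              # limit the run to the region of interest
        "-bamout", bamout,           # reassembled/realigned reads for inspection
    ]


# Hypothetical file names; the interval pads the position by 500 bp on each side.
position = 75680995
interval = f"3:{position - 500}-{position + 500}"
cmd = bamout_command(
    "reference.fasta", "input.bam", "debug.vcf.gz", "debug.bamout.bam", interval
)

# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Loading both the input BAM and the bamout file in the genome browser at the same position then shows the reads before and after realignment.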


  • mayaab, Israel (Member)

    Hi Sheila,
    I used the -bamout parameter, and I see that contigs were created. What's strange is that there are positions where the coverage was 0 in the input BAM file, and now they are covered. I can't understand where this coverage came from.

    Attached is a screenshot from a genome viewer. The upper track is the input BAM given to HaplotypeCaller. The lower track is the output of HaplotypeCaller. There are contigs on the left.

  • mayaab, Israel (Member)

    Thanks a lot!
    I was looking for this kind of document, but couldn't find it.
    That was very informative and helpful.

  • mayaab, Israel (Member)

    One more question: where do the DP and AD values come from?
    For example, there is a position where the DP is 147 in the VCF file, 195 according to the input BAM file, and 0 according to the BAM file that HaplotypeCaller created.
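
For reference when comparing these numbers, here is a small Python sketch that pairs the FORMAT keys with the sample values in the VCF line quoted at the top of the thread. It only restates what that record contains: there, 147 is the MIN_DP value (the minimum DP observed within the GVCF block) and DP is 155, and both are lower than the 195 reads seen in the browser. The sketch takes the last two whitespace-separated tokens as FORMAT and sample, which is an assumption that holds for the line as quoted.

```python
# The record as quoted in the original question.
record = "3 75680992 . C . . END=75680995 GT:DP:GQ:MIN_DP:PL 0/0:155:0:147:0,0,0"
tokens = record.split()

# In a single-sample record the last two columns are FORMAT and the sample.
format_keys = tokens[-2].split(":")
sample_values = tokens[-1].split(":")
fields = dict(zip(format_keys, sample_values))

dp = int(fields["DP"])
min_dp = int(fields["MIN_DP"])
browser_depth = 195  # coverage reported from the genome browser

for key, value in fields.items():
    print(f"{key:>6} = {value}")
print(f"browser depth minus block DP:     {browser_depth - dp}")
print(f"browser depth minus block MIN_DP: {browser_depth - min_dp}")
```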

