Variant seen in a genome browser doesn't appear in the VCF file when using HaplotypeCaller

mayaabmayaab IsraelMember ✭✭

Hi GATK team,

I ran HaplotypeCaller on a BAM file that I pre-processed according to your Best Practices.
When viewing this BAM in a genome browser, I found a position that appears to have a variant, but the variant doesn't appear in the VCF file.

This is the line from the vcf file (the variant I'm talking about is at position 75680995):
3 75680992 . C . . END=75680995 GT:DP:GQ:MIN_DP:PL 0/0:155:0:147:0,0,0

Attached is the picture from the genome browser. At the same position (75680995) there is a variant with the allele 'A', while the reference is 'C'. The coverage at this position is 195, and 171 of those reads support the 'A' allele.
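
For readers following along, here is a minimal Python sketch that unpacks the VCF line quoted above. It assumes the line is a single-sample reference block whose last three columns are INFO, FORMAT and the sample values; the pasted line has nine whitespace-separated columns rather than the ten of a full VCF record, so the sketch reads those columns from the end. The names `RECORD` and `parse_block` are illustrative only.

```python
# The record exactly as pasted in the post.
RECORD = "3 75680992 . C . . END=75680995 GT:DP:GQ:MIN_DP:PL 0/0:155:0:147:0,0,0"


def parse_block(line):
    """Split one reference-block line into named fields (assumed layout)."""
    cols = line.split()
    chrom, start = cols[0], int(cols[1])
    # Assumption: INFO, FORMAT and the sample column are the last three columns.
    info, fmt, sample = cols[-3], cols[-2], cols[-1]
    info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    sample_map = dict(zip(fmt.split(":"), sample.split(":")))
    return {
        "chrom": chrom,
        "start": start,
        "end": int(info_map.get("END", start)),
        "gt": sample_map["GT"],
        "dp": int(sample_map["DP"]),
        "min_dp": int(sample_map["MIN_DP"]),
        "gq": int(sample_map["GQ"]),
        "pl": [int(x) for x in sample_map["PL"].split(",")],
    }


block = parse_block(RECORD)
# Check whether the block spans the position seen in the genome browser.
covers_site = block["start"] <= 75680995 <= block["end"]
print(block)
print("covers 75680995:", covers_site)
```

The block does span 75680995, and its GQ of 0 with PL 0,0,0 means the three genotype likelihoods are equal, so the 0/0 here reads as "no confident genotype" rather than a confident homozygous-reference call.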

Why does this happen?

Thank you,
Maya

Answers

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @mayaab‌

    Hi Maya,

    This is probably because HaplotypeCaller reassembles and realigns the reads in active regions. Local realignment improves accuracy because it is easier to align reads within a small region than against the whole genome.

    You can try using the -bamout argument to see what the realigned reads look like. I think you will find that the variation you see in the original BAM file is no longer present after realignment.

    -Sheila
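
As a rough illustration of the -bamout suggestion above, the Python sketch below assembles a HaplotypeCaller command line restricted to a small window around the position in question. It only builds and prints the command; it does not run GATK. The invocation style (`java -jar GenomeAnalysisTK.jar -T ... -o ...`) assumes GATK 3; GATK 4 uses `gatk HaplotypeCaller` with `-O` instead. All file names, the interval, and the function name `haplotypecaller_bamout_cmd` are hypothetical.

```python
import shlex


def haplotypecaller_bamout_cmd(reference, input_bam, interval, out_vcf, out_bam):
    """Assemble a GATK 3-style HaplotypeCaller command that also writes -bamout."""
    return [
        "java", "-jar", "GenomeAnalysisTK.jar",
        "-T", "HaplotypeCaller",
        "-R", reference,      # reference FASTA
        "-I", input_bam,      # pre-processed input BAM
        "-L", interval,       # limit the run to a small region
        "-o", out_vcf,        # variant calls for that region
        "-bamout", out_bam,   # realigned reads as HaplotypeCaller saw them
    ]


# Hypothetical inputs: a window around 3:75680995.
cmd = haplotypecaller_bamout_cmd(
    "reference.fasta",
    "sample.bam",
    "3:75680500-75681500",
    "sample.region.vcf",
    "sample.region.bamout.bam",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Restricting the run with `-L` keeps the extra BAM small, which makes it easier to load next to the original BAM in a genome browser and compare the two tracks at the same position.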

  • mayaabmayaab IsraelMember ✭✭

    Hi Sheila,
    I used the -bamout parameter, and I see that contigs were created. What's strange is that some positions that had zero coverage in the input BAM file are now covered. I don't understand: where did this coverage come from?

    Attached is a screenshot from a genome viewer. The upper track is the input BAM to HaplotypeCaller; the lower track is the HaplotypeCaller output. There are contigs on the left.

  • mayaabmayaab IsraelMember ✭✭

    Thanks a lot!
    I was looking for this kind of document, but couldn't find it.
    That was very informative and helpful.

  • mayaabmayaab IsraelMember ✭✭

    One more question: where do the DP and AD values come from?
    For example, there is a position where the DP is 147 in the VCF file, 195 in the input BAM file, and 0 in the BAM file that HaplotypeCaller created.

    Maya
