GATK 3.2-2 N+1 HaplotypeCaller missing Alt alleles
I've been looking at variant calling with the 3.2-2 version of the GATK HaplotypeCaller, and it is failing to record high-quality alternative alleles in the GVCF file.
In the attached IGV screenshot I'm looking at a site in 3 individuals: (from the top) grandsire, dam and child. As you can see, all three clearly show a heterozygous site (GS: 5 alt, BQ 26-29; dam: 7 alt, BQ 27-31; child: 10 alt, BQ 26-33), yet GATK has only called a variant for the child and dam. The GS, though it has 5 alt and 9 ref reads, is called Ref/Ref with an AD of 14,0 when it should be 9,5.
When I extract the site from the GVCF file I see this:
chr1 9590826 . T <NON_REF> . . END=9590826 GT:DP:GQ:MIN_DP:PL 0/0:14:0:14:0,0,142
For some reason the HC GVCF has failed to recognise the 5 alt alleles and instead reported 14 Ref, with a PL of 0,0,142 (so it knows something is wrong). If I look at the GVCF records for the dam and child, they are correct:
chr1 9590826 rs380224633 T A,<NON_REF> 215.18 . BaseQRankSum=-0.996;ClippingRankSum=-1.630;DB;DP=18;MLEAC=1,0;MLEAF=0.500,0.00;MQ=60.00;MQ0=0;MQRankSum=-1.721;ReadPosRankSum=2.174 GT:AD:DP:GQ:PL:SB 0/1:11,7,0:18:99:235,0,428,268,449,717:5,6,5,2
chr1 9590826 rs380224633 T A,<NON_REF> 334.18 . BaseQRankSum=-0.033;ClippingRankSum=-0.429;DB;DP=23;MLEAC=1,0;MLEAF=0.500,0.00;MQ=60.00;MQ0=0;MQRankSum=-1.154;ReadPosRankSum=-0.692 GT:AD:DP:GQ:PL:SB 0/1:12,10,0:22:99:354,0,421,390,451,841:8,4,2,8
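To make the comparison concrete, here is a minimal Python sketch that pairs the FORMAT keys with the sample values from the three records above and recomputes GQ from the PLs. It assumes GQ is the gap between the two smallest PL values, capped at 99, which matches the values in these records. The helper names `parse_sample` and `gq_from_pl` are hypothetical, not part of GATK.

```python
def parse_sample(format_str, sample_str):
    """Pair colon-separated FORMAT keys with the sample's values."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))


def gq_from_pl(pl_values, cap=99):
    """Assumed GQ rule: second-smallest PL minus smallest PL, capped at `cap`."""
    lowest, second = sorted(pl_values)[:2]
    return min(second - lowest, cap)


# FORMAT and sample columns copied from the records quoted in this post.
records = {
    "grandsire": ("GT:DP:GQ:MIN_DP:PL", "0/0:14:0:14:0,0,142"),
    "dam": ("GT:AD:DP:GQ:PL:SB", "0/1:11,7,0:18:99:235,0,428,268,449,717:5,6,5,2"),
    "child": ("GT:AD:DP:GQ:PL:SB", "0/1:12,10,0:22:99:354,0,421,390,451,841:8,4,2,8"),
}

summary = {}
for name, (fmt, sample) in records.items():
    fields = parse_sample(fmt, sample)
    pls = [int(x) for x in fields["PL"].split(",")]
    summary[name] = {
        "GT": fields["GT"],
        "DP": int(fields["DP"]),
        "GQ_reported": int(fields["GQ"]),
        "GQ_from_PL": gq_from_pl(pls),
        # A hom-ref call whose 0/0 and 0/1 PLs tie carries no confidence.
        "ambiguous_ref": fields["GT"] == "0/0" and pls[0] == pls[1],
    }

for name, row in summary.items():
    print(name, row)
```

On these records, only the grandsire is flagged: its PLs for 0/0 and 0/1 are both 0, so the hom-ref genotype is no better supported than the heterozygous one, even though the record is written as a reference block.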
I've seen several dozen cases like this in the last 20 minutes. Is this a known bug, or do you need data to replicate it?