Attention:
The frontline support team will be unavailable to answer questions on April 15th and 17th 2019. We will be back soon after. Thank you for your patience and we apologize for any inconvenience!

per-sample DP is missing in called genotypes

dmytrodmytro SwedenMember
edited April 2016 in Ask the GATK team

I try to filter both variant non-variant sites together. I see the only reasonable way to do it is to filter by the per-sample DP.
However, I noticed that substantial fraction of called sites (~10%) have missing value in DP field ( . instead of 0 or other value). Although many of sites with missing DP values indeed have low coverage with bad mapping and called ./., some sites have in my view good coverage. When I check the bam file (-bamout result) in IGV, I see many reads mapped to those sites with good quality (MQ=60). The genotype is usually called correctly, but for some unclear for me reason GATK doesn't report DP values. The AD values are usually 0,0 in such sites. Interestingly, the nearby sites have DP values reported.
When I check gVCF files, these sites usually have DP=0. Assuming coverage of 0 I could discard these sites, but I see in a bam file that it is not 0. So, I do not want to throw away 10% of my data.
I notice a tendency that such site is usually homozygot for ALT allele. I provide an example of such a site below. See the 1st sample in the position 13388742. (1/1:0,0:.:3:1|1:13388738_T_C:45,3,0)
I generated the data using HaplotypeCaller in GVCF mode and then GenotypeGVCFs.
Could you please tell me the reason why this is so?

VCF:
scaffold_1 13388742 . A G 8529.41 . AC=35;AF=0.833;AN=42;BaseQRankSum=-1.730e-01;ClippingRankSum=-6.940e-01;DP=247;ExcessHet=0.4083;FS=9.162;InbreedingCoeff=0.1415;MLEAC=37;MLEAF=0.881;MQ=38.83;MQRankSum=2.51;QD=32.26;ReadPosRankSum=1.56;SOR=0.043 GT:AD:DP:GQ:PGT:PID:PL 1/1:0,0:.:3:1|1:13388738_T_C:45,3,0 ./.:0,0:0 ./.:2,0:2 ./.:4,0:4 ./.:0,0:0 ./.:3,0:3 0/0:20,0:20:0:.:.:0,0,577 0/1:0,1:1:11:0|1:13388692_G_T:81,0,11 1/1:0,10:10:33:1|1:13388724_G_A:495,33,0 ./.:0,0:0 1/1:1,19:20:27:1|1:13388742_A_G:979,27,0 1/1:0,20:20:35:1|1:13388724_G_A:944,35,0 1/1:0,1:1:6:.:.:90,6,0 1/1:0,7:7:24:1|1:13388724_G_A:360,24,0 0/1:0,2:2:30:0|1:13388692_G_T:165,0,30 1/1:0,5:5:15:1|1:13388742_A_G:225,15,0 1/1:0,20:20:60:1|1:13388724_G_A:900,60,0 1/1:0,8:8:30:1|1:13388724_G_A:450,30,0 1/1:0,15:15:48:.:.:720,48,0 0/1:3,28:31:42:0|1:13388742_A_G:1167,0,42 ./.:3,0:3 1/1:0,1:1:3:1|1:13388687_C_T:45,3,0 1/1:0,2:2:12:1|1:13388692_G_T:180,12,0 ./.:4,0:4 ./.:6,0:6 1/1:0,6:6:18:1|1:13388724_G_A:270,18,0 ./.:4,0:4 1/1:0,14:14:45:1|1:13388724_G_A:675,45,0 1/1:0,3:3:9:1|1:13388692_G_T:135,9,0 0/0:21,0:21:0:.:.:0,0,533 1/1:0,14:14:45:1|1:13388724_G_A:655,45,0

gVCF:
scaffold_1 13388740 . C <NON_REF> . . END=13388741 GT:DP:GQ:MIN_DP:PL 0/0:4:12:4:0,12,162 scaffold_1 13388742 . A G,<NON_REF> 31.82 . DP=0;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;RAW_MQ=0.00 GT:GQ:PGT:PID:PL:SB 1/1:3:0|1:13388738_T_C:45,3,0,45,3,45:0,0,0,0 scaffold_1 13388743 . A <NON_REF> . . END=13388743 GT:DP:GQ:MIN_DP:PL 0/0:5:15:5:0,15,214

Screenshot of a BAM:

Answers

Sign In or Register to comment.