Interpreting '.' in AD field of
I'm doing a variant analysis of genomic DNA from 2 related samples. I followed the current Best Practices, running HaplotypeCaller in GVCF mode on each sample and then GenotypeGVCFs to produce a joint VCF of variant loci.
I'm looking for variants that are sample2-specific (present in sample2 but not in sample1).
Here is a line from this file:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 Sample2
chrIII 91124 . A AATAAGAGGAATTAGGCT 1132.42 . AC=2;AF=0.500;AN=4;DP=47;FS=0.000;MLEAC=2;MLEAF=0.500;MQ=58.85;MQ0=0;QD=7.99 GT:AD:DP:GQ:PL 1/1:0,25:25:55:1167,55,0 0/0:.:22:33:0,33,495
In the genotype field, sample2's AD is a '.' (dot), which I understand to mean that no reads passed the quality filters. However, sample2's DP is 22, meaning that 22 reads covered this position.
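To make the two genotype fields explicit, here is a small Python sketch that splits the sample columns of the line above according to the FORMAT string. The helper name `parse_sample` is purely illustrative and is not part of GATK or any library.

```python
def parse_sample(format_str, sample_str):
    """Map FORMAT keys to one sample's values, e.g. {'GT': '0/0', 'AD': '.', ...}."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))

# FORMAT and sample columns copied from the VCF line shown above
fmt = "GT:AD:DP:GQ:PL"
sample1 = parse_sample(fmt, "1/1:0,25:25:55:1167,55,0")
sample2 = parse_sample(fmt, "0/0:.:22:33:0,33,495")

for name, s in (("Sample1", sample1), ("Sample2", sample2)):
    # A bare '.' is the VCF convention for a missing value
    ad = "AD missing" if s["AD"] == "." else "AD=" + s["AD"]
    print(name, "GT=" + s["GT"], ad, "DP=" + s["DP"])
```

This only restates what is in the line: sample2 has a missing AD value while still reporting a DP of 22.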
This line suggests that the variant is specific to sample1 (genotype HomVar 1/1) and absent from sample2 (HomRef 0/0). But given the biological relationship between sample1 and sample2 (the way they were generated), I doubt that this call is correct: the variant is very likely to be present in sample2 as well. It's a false
I have 416 loci like this. For the vast majority of them, sample1 and sample2 likely share the same variant. But since it is not impossible that a few of them are truly sample1=HomVar and sample2=HomRef, could you suggest a way to detect those?
What about comparing sample1.PL(1/1) and sample2.PL(0/0)? For example, could you suggest a rule of thumb for evaluating their ratio?
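To make the idea concrete, here is a rough Python sketch of the kind of PL-based check I have in mind. PL values are Phred-scaled and normalized so that the called genotype has PL=0, so the sketch looks at the PL of the opposite genotype in each sample: sample1's PL(0/0) and sample2's PL(1/1). The function names and the `MIN_PL` cutoff are my own placeholders, and the value 30 is arbitrary, not a recommended threshold.

```python
def parse_fields(sample_str, fmt="GT:AD:DP:GQ:PL"):
    """Split one sample column into a dict keyed by FORMAT tag."""
    return dict(zip(fmt.split(":"), sample_str.split(":")))

def parse_pl(pl_str):
    """Return [PL(0/0), PL(0/1), PL(1/1)] as ints for a biallelic site."""
    return [int(x) for x in pl_str.split(",")]

def summarize(sample1_str, sample2_str):
    """Collect the numbers relevant to a sample1=HomVar / sample2=HomRef call."""
    s1, s2 = parse_fields(sample1_str), parse_fields(sample2_str)
    pl1, pl2 = parse_pl(s1["PL"]), parse_pl(s2["PL"])
    return {
        "s1_pl_homref": pl1[0],           # evidence against sample1 being 0/0
        "s2_pl_homvar": pl2[2],           # evidence against sample2 being 1/1
        "s2_ad_missing": s2["AD"] == ".",
    }

MIN_PL = 30  # hypothetical cutoff, chosen only for illustration

result = summarize("1/1:0,25:25:55:1167,55,0", "0/0:.:22:33:0,33,495")
# The discordance is "supported" only if both opposite-genotype PLs clear the cutoff
result["pl_supported"] = min(result["s1_pl_homref"], result["s2_pl_homvar"]) >= MIN_PL
print(result)
```

For the line above, both PLs are well over this placeholder cutoff even though sample2's AD is missing, which is exactly the kind of locus I don't know how to interpret.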