Help understanding genotype in spanning deletion notation
I am working with a collaborator who had sequencing and variant calling done at the Broad.
The resulting multi-sample VCF has spanning deletion notation.
I am trying to understand what the genotype call is for the patient at the following site, where I have omitted some columns (shown as ...):
chr2 178535858 . GA G,GAA ... GT:AD:DP:GQ:PL 0/1:38,8,5:51:30:30,0,822,74,770,1497
chr2 178535859 rs202214630 A *,G ... GT:AD:DP:GQ:PL 0/2:19,8,24:51:99:752,342,731,0,301,685
This looks to me like a case that isn't covered in the spanning deletions tutorial, one in which a sample with a spanning deletion has more than two alleles at the site: https://software.broadinstitute.org/gatk/documentation/article?id=6926
The sample has reads supporting the reference (38), a 1 bp deletion (8), and a 1 bp insertion (5) at 178535858, and the reference (19), the spanning deletion (8), and the G variant (24) at 178535859. How do I tell which genotype GATK determined at 178535859? I don't understand how the 0/1 at 178535858 and the 0/2 at 178535859 combine into a single diploid genotype.
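To make the two records easier to compare, here is a minimal Python sketch of how I am reading the sample fields. It assumes the standard VCF conventions: AD is ordered REF first and then the ALT alleles in order, and the PL values for a diploid sample follow the genotype order 0/0, 0/1, 1/1, 0/2, 1/2, 2/2. The helper names `genotype_order` and `decode_sample` are my own for illustration and are not part of GATK.

```python
def genotype_order(n_alleles):
    """Diploid genotypes in VCF order: index of j/k (j <= k) is k*(k+1)/2 + j."""
    return [(j, k) for k in range(n_alleles) for j in range(k + 1)]


def decode_sample(ref, alts, fmt, sample):
    """Pair each allele with its AD count and each genotype with its PL value."""
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    alleles = [ref] + alts.split(",")
    ad = [int(x) for x in fields["AD"].split(",")]
    pl = [int(x) for x in fields["PL"].split(",")]
    pl_by_genotype = {
        f"{j}/{k}": value
        for (j, k), value in zip(genotype_order(len(alleles)), pl)
    }
    return {
        "depth_by_allele": dict(zip(alleles, ad)),
        "pl_by_genotype": pl_by_genotype,
        # PL is normalized so that the most likely genotype has PL = 0.
        "most_likely": min(pl_by_genotype, key=pl_by_genotype.get),
        # Translate the GT indices back into allele strings (unphased "/" only).
        "called_alleles": [alleles[int(i)] for i in fields["GT"].split("/")],
    }


FMT = "GT:AD:DP:GQ:PL"
site1 = decode_sample("GA", "G,GAA", FMT, "0/1:38,8,5:51:30:30,0,822,74,770,1497")
site2 = decode_sample("A", "*,G", FMT, "0/2:19,8,24:51:99:752,342,731,0,301,685")

print(site1["most_likely"], site1["called_alleles"])  # 0/1 ['GA', 'G']
print(site2["most_likely"], site2["called_alleles"])  # 0/2 ['A', 'G']
print(site2["depth_by_allele"])  # {'A': 19, '*': 8, 'G': 24}
```

Read this way, the PL = 0 entry agrees with the reported GT in both records, and the 0/2 at 178535859 corresponds to A/G, so the `*` allele (8 reads) is not part of the call there. The sketch only decodes each record on its own; it does not explain how the two calls fit together, which is what I am asking about.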
From the VCF header, I see that the HaplotypeCaller Version used is Version=18.104.22.168.
Thanks for your help,