
Help understanding genotype in spanning deletion notation


I am working with a collaborator who had sequencing and variant calling done at the Broad.
The resulting multi-sample VCF has spanning deletion notation.
I am trying to understand what the genotype call is for the patient at the following two adjacent sites (some columns omitted):

chr2 178535858 . GA G,GAA ... GT:AD:DP:GQ:PL 0/1:38,8,5:51:30:30,0,822,74,770,1497
chr2 178535859 rs202214630 A *,G ... GT:AD:DP:GQ:PL 0/2:19,8,24:51:99:752,342,731,0,301,685

This seems to be a case that the spanning deletions tutorial does not cover, namely a sample with a spanning deletion in which more than two alleles appear: https://software.broadinstitute.org/gatk/documentation/article?id=6926

The sample has reads supporting the reference (38), a 1 bp deletion (8) and a 1 bp insertion (5) at 178535858, and the reference (19), the spanning deletion (8) and the variant (24) at 178535859. How do I tell which genotype GATK has determined at 178535859? I don't understand how the 0/1 at 178535858 and the 0/2 at 178535859 combine into a diploid genotype.
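To make my reading of the two records explicit, here is a minimal Python sketch of how I am decoding the sample columns. It assumes the standard VCF ordering of diploid PL values (genotype j/k with j <= k sits at index k*(k+1)/2 + j); the function name `decode_sample` and the dictionary keys are my own, not anything from GATK.

```python
def decode_sample(ref, alts, fmt, sample):
    """Map GT, AD and PL of one sample column onto the site's alleles."""
    alleles = [ref] + alts.split(",")
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    gt = [int(i) for i in fields["GT"].replace("|", "/").split("/")]
    ad = [int(x) for x in fields["AD"].split(",")]
    pl = [int(x) for x in fields["PL"].split(",")]
    # Assumed VCF ordering of diploid genotypes: 0/0, 0/1, 1/1, 0/2, 1/2, 2/2, ...
    pairs = [(j, k) for k in range(len(alleles)) for j in range(k + 1)]
    return {
        "called": "/".join(alleles[i] for i in gt),
        "depth_by_allele": dict(zip(alleles, ad)),
        "pl_by_genotype": {
            f"{alleles[j]}/{alleles[k]}": p for (j, k), p in zip(pairs, pl)
        },
    }


FORMAT = "GT:AD:DP:GQ:PL"
site1 = decode_sample("GA", "G,GAA", FORMAT, "0/1:38,8,5:51:30:30,0,822,74,770,1497")
site2 = decode_sample("A", "*,G", FORMAT, "0/2:19,8,24:51:99:752,342,731,0,301,685")

for pos, site in (("178535858", site1), ("178535859", site2)):
    best = min(site["pl_by_genotype"], key=site["pl_by_genotype"].get)
    print(pos, "called:", site["called"], "lowest PL:", best, site["depth_by_allele"])
```

Decoded this way, the lowest PL at each site agrees with the GT field: GA/G (0/1) at 178535858 and A/G (0/2) at 178535859, with the `*` allele carrying 8 reads but not appearing in the call.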

From the VCF header, I see that the HaplotypeCaller Version used is Version=

Thanks for your help,

