Differentiating between uncalled reference alleles and sites with insufficient read depth
We are working with both WES and Genome Scan data on large families. Our genome scan has helped us narrow down our search to a 2 Mb region that is shared IBD among all the Affecteds in one of our families. However, we've used both the UnifiedGenotyper and HaplotypeCaller to call variants in this region, and the search fails to find any variants that segregate with the Affecteds we sequenced. Unfortunately, I can't tell whether a locus is missing from the VCF because the samples were called as the reference allele there, or because the samples didn't have enough reads aligned to that site. Is there any way to produce a VCF on a base-by-base level, so that we can check Genotype, Allele Depth, Genotype Quality, etc. for each of our samples within this range, whether the call at a site is the reference or an alternate allele?
The DepthOfCoverage tool seems to have some of this functionality, but only provides a summary of the read depth at a base-by-base resolution. DiagnoseTargets only allows for analysis over aggregate intervals.
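To make the distinction concrete, here is a minimal Python sketch of the check being asked for. It assumes a VCF that already contains a record for every position in the region, which is exactly the output this question is looking for. The function name `classify_sites`, the status labels, and the `min_depth` cutoff of 10 are hypothetical choices for illustration, not part of GATK.

```python
def classify_sites(vcf_lines, chrom, start, end, min_depth=10):
    """Label every sample at every position in [start, end] (1-based, inclusive).

    Labels (hypothetical, for illustration only):
      hom_ref           - confident reference call with enough depth
      variant           - at least one non-reference allele
      insufficient_data - no-call genotype or DP below min_depth
      not_emitted       - position has no record in the VCF at all
    If several records share a position, the last one wins.
    """
    samples = []
    seen = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        if fields[0] != chrom:
            continue
        pos = int(fields[1])
        if not start <= pos <= end:
            continue
        keys = fields[8].split(":")  # FORMAT keys, e.g. GT:AD:DP:GQ
        calls = {}
        for sample, raw in zip(samples, fields[9:]):
            values = dict(zip(keys, raw.split(":")))
            gt = values.get("GT", "./.")
            dp = values.get("DP", ".")
            depth = int(dp) if dp.isdigit() else 0
            if "." in gt or depth < min_depth:
                calls[sample] = "insufficient_data"
            elif set(gt.replace("|", "/").split("/")) == {"0"}:
                calls[sample] = "hom_ref"
            else:
                calls[sample] = "variant"
        seen[pos] = calls
    # Positions never seen in the file are reported explicitly.
    missing = {s: "not_emitted" for s in samples}
    return {pos: seen.get(pos, dict(missing)) for pos in range(start, end + 1)}
```

With a variants-only VCF, most positions would come back as `not_emitted`, which is the ambiguity described above: the file alone cannot say whether those sites were reference or simply uncovered.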
I apologize if this is a stupid question. I've been working with GATK for about a month now, so I am still very new.