This site is now read-only. You can find our new documentation site and support forum for posting questions here.
Be sure to read our welcome blog!
unified genotyper vs. haplotype caller w/ pedigree study, major discrepancies
I've been exploring de novo mutation identification in the context of a pedigree of trios. I've run the UnfiedGenotyper (UG) given all the bam files for ~25 sets of trios and it appears to identify a set of de novo mutations. When I run the HaplotypeCaller (HC) pipeline, first generating gVCF files for each individual, and then using the merged gVCF files along with the pedigree for genotype refinement and de novo mutation calling, it also finds a number of de novo mutations annotated as hi-confidence de novo mutations. When I compare the UG de novo mutations to the high confidence HC list, there's very little overlap. Many of the UG hi-confidence de novo variants are called by HC, but listed as low-confidence de novo variants, and from looking at a few examples, it would appear that the HC calls have assigned lower genotype confidence levels for the parental (non-mutated, reference) genotypes. Could it be that because the gVCF files aren't storing position-specific information for the reference (non-mutated) positions in the genome, the pedigree-type de novo mutation calling is not as accurate as it could be? Should I be generating gVCFs that include position-specific information?
Many thanks for any insights. If it would help, I could post some examples.