HaplotypeCaller and GenotypeGVCFs sensitivity on heterozygous variants
Hello, I recently compared results from the GATK Best Practices pipeline (bwa, Picard, HaplotypeCaller, GenotypeGVCFs) with a SNP array (a high-confidence method for detecting known variants) for 6 samples (data from an Illumina HiSeq 2500) and got a really interesting confusion matrix.
This means that GATK (like any other caller) has trouble calling heterozygous variants. We are discussing the causes of this phenomenon and how HC+GG deal with it.
At first we thought it was a depth (DP) problem, and it is: after filtering for variants with DP>20, the het column changed to:
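For readers who want to reproduce this kind of comparison, here is a minimal sketch of how a genotype confusion matrix might be tallied. It assumes both the array and the call set have already been reduced to dictionaries mapping a site key to a VCF-style GT string; the function names, the dictionary layout, and the choice to count sites absent from the call set as no-calls are illustrative assumptions, not part of the pipeline described here.

```python
from collections import Counter


def classify(gt):
    """Map a VCF-style GT string such as '0/1' to a genotype class."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "no_call"
    if len(set(alleles)) > 1:
        return "het"
    return "hom_ref" if alleles[0] == "0" else "hom_alt"


def confusion_matrix(array_gts, called_gts):
    """Count (array class, called class) pairs over all array sites.

    Assumption: a site missing from the call set is counted as a no-call.
    """
    counts = Counter()
    for site, truth in array_gts.items():
        called = called_gts.get(site, "./.")
        counts[(classify(truth), classify(called))] += 1
    return counts
```

A het site on the array that the caller reports as 0/0 lands in the `("het", "hom_ref")` cell, which is the kind of miss discussed in this post.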
This means that the proportion of ref/alt bases is critical when calling heterozygous variants.
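To make the ref/alt proportion concrete, the sketch below computes the alt-allele fraction from a per-sample AD value (the "ref,alt" read counts that HaplotypeCaller writes to the VCF) and applies the DP>20 filter mentioned above. The helper names are hypothetical, and only the first two AD entries are used, so this assumes biallelic sites.

```python
def allele_balance(ad):
    """Return the fraction of reads supporting the alt allele.

    `ad` is a VCF AD string such as '18,4' (ref count, alt count).
    Returns None when no informative reads are present.
    """
    ref, alt = (int(x) for x in ad.split(",")[:2])
    total = ref + alt
    return alt / total if total else None


def passes_depth(dp, min_dp=20):
    """Apply the DP>20 filter used in this post (threshold is adjustable)."""
    return dp > min_dp
```

A true het site is expected to sit near 0.5; at low depth, sampling noise can push the observed fraction far from that value.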
We hope you can give us more ideas on the causes of this problem and on how we can turn those het variants that were called as wild type into called variants, even at the cost of getting more false positives.
We used bwa 0.7.10-r789 and GATK 3.7-0-gcfedb67.