Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
InbreedingCoeff in VCF not matching my calculation
Initially I was going to ask about why I might be seeing some InbreedingCoeff values less than -1 (based on the calculation as explained at https://software.broadinstitute.org/gatk/documentation/article.php?id=8032, I believe it should always be between -1 and 1). But then I checked a random sample of 10,000 InbreedingCoeff values from my VCF against the values I calculate myself, and I see a strange mismatch generally, not only in the IC values given by GATK as less than -1. Attached is my code and a plot, with the y = x line in blue and y = -1 in red. I see the same pattern in another dataset which was produced by the same GATK-based pipeline.
The VCF referred to in the attached document is generated from 157 exome-capture samples (unrelated individuals) using GATK 3.6 with JDK 1.8.0. We use HaplotypeCaller on each sample, then GenotypeGVCFs on the collected results, then VariantRecalibrator/ApplyRecalibration with the recommended parameters/resources. I can provide the full commands if helpful.
Any ideas about why there is this mismatch? Is there something I'm misunderstanding about the InbreedingCoeff values?