InbreedingCoeff in VCF not matching my calculation
Initially I was going to ask why I am seeing some InbreedingCoeff values less than -1 (based on the calculation explained at https://software.broadinstitute.org/gatk/documentation/article.php?id=8032, I believe the value should always be between -1 and 1). But then I checked a random sample of 10,000 InbreedingCoeff values from my VCF against values I calculated myself, and I see a strange mismatch across the board, not only for the values that GATK reports as less than -1. Attached are my code and a plot, with the y = x line in blue and y = -1 in red. I see the same pattern in another dataset that was produced by the same GATK-based pipeline.
The VCF referred to in the attached document is generated from 157 exome-capture samples (unrelated individuals) using GATK 3.6 with JDK 1.8.0. We use HaplotypeCaller on each sample, then GenotypeGVCFs on the collected results, then VariantRecalibrator/ApplyRecalibration with the recommended parameters/resources. I can provide the full commands if helpful.
Any ideas about why there is this mismatch? Is there something I'm misunderstanding about the InbreedingCoeff values?
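For reference, here is a minimal sketch of the kind of calculation I have in mind: the textbook estimate F = 1 - (observed hets / expected hets) for a single biallelic site, computed from hard genotype counts. This is an illustration under my own assumptions, not necessarily identical to the attached code, and it is not GATK's implementation (which may work from genotype likelihoods rather than called genotypes). The function name and arguments are hypothetical.

```python
def inbreeding_coeff(n_hom_ref, n_het, n_hom_var):
    """Textbook inbreeding coefficient for one biallelic site.

    Assumption: hard genotype counts are used, which may differ from
    what GATK does internally when it computes InbreedingCoeff.
    Returns None when the value is undefined (no samples, or a
    monomorphic site where the expected het count is zero).
    """
    n = n_hom_ref + n_het + n_hom_var
    if n == 0:
        return None

    # Reference allele frequency estimated from the genotype counts.
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p

    # Expected number of heterozygotes under Hardy-Weinberg equilibrium.
    expected_hets = 2 * p * q * n
    if expected_hets == 0:
        return None

    return 1.0 - n_het / expected_hets


# All samples heterozygous: the most negative value this formula can give.
print(inbreeding_coeff(0, 10, 0))
# Genotype counts exactly at Hardy-Weinberg proportions.
print(inbreeding_coeff(25, 50, 25))
# No heterozygotes at a polymorphic site.
print(inbreeding_coeff(50, 0, 50))
```

With hard counts like these, the observed het count can never exceed twice the expected het count, so this estimate cannot fall below -1, which is why the values below -1 in my VCF surprised me.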