MQ and Multisample calling from GVCFs
Dear GATK team,
I'm puzzled by the MQ distribution coming out of our multisample calling.
Our procedure is:
- We start from a set of GVCF files created with GATK 3.5 HaplotypeCaller in BP_RESOLUTION mode for ~70 samples
- We combine them with CombineGVCFs (GATK 3.5)
- We call them with GenotypeGVCFs (GATK 3.5 first, GATK 3.8 now)
With GATK 3.5 we had an odd MQ distribution (values strongly underestimated), but apparently that had already been reported as a known bug.
Then we updated to GATK 3.8. Now the MQ distribution for MQ<60 looks normal, but ~10% of the positions have MQ>60, with values up to ~700.
If it helps, I noticed that some of these ultra-high scores come from positions where RAW_MQ is not specified in any of the samples' GVCFs. In general, though, they correspond to variants with high MQ (~60).
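For anyone who wants to check their own callset for the same pattern, here is a minimal sketch of how the MQ>60 positions can be tallied. It assumes an uncompressed, tab-delimited VCF with MQ in the INFO column. The function name `tally_mq` and the file name in the usage note are hypothetical, and this is not part of GATK.

```python
import re
from collections import Counter

# Match the INFO key "MQ" exactly, so that RAW_MQ and MQRankSum are not picked up.
MQ_RE = re.compile(r"(?:^|;)MQ=([^;]+)")

def tally_mq(vcf_lines, threshold=60.0):
    """Count records with MQ above / at or below `threshold`, and collect the outliers."""
    counts = Counter()
    outliers = []
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        match = MQ_RE.search(fields[7])
        if match is None:
            counts["missing"] += 1
            continue
        try:
            mq = float(match.group(1))
        except ValueError:
            # e.g. MQ=. or another non-numeric value
            counts["missing"] += 1
            continue
        if mq > threshold:
            counts["above"] += 1
            outliers.append((fields[0], int(fields[1]), mq))
        else:
            counts["at_or_below"] += 1
    return counts, outliers
```

It can be run on the GenotypeGVCFs output with something like `counts, outliers = tally_mq(open("joint_calls.vcf"))`. The outlier coordinates can then be looked up in the per-sample GVCFs to see whether RAW_MQ is present there.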
Any explanation? How should I treat these MQ>60 values?
Thanks a lot for your support!