How to interpret a very broad distribution of QUAL
Dear GATK team,
We applied the best practice DNA-seq pipeline and generated SNP/indel variant calls from targeted sequencing of 700Kb in 500 samples.
According to this GATK post, the QUAL score typically ranges from 2 to 63.
We looked at "PASS" variants/sites and found that the distribution of QUAL in our data is very broad, with min=30, median=6195, max=10M. Is this distribution normal, or does it indicate a problem with this callset? If it is abnormal, what should we look at to troubleshoot?
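For anyone who wants to reproduce this kind of summary, here is a minimal sketch of how the min/median/max of QUAL over "PASS" records could be computed from a VCF. It is not necessarily the exact script we used. It assumes the standard tab-delimited VCF column order (QUAL in column 6, FILTER in column 7), and the function names `qual_summary` and `open_vcf` are hypothetical.

```python
import gzip
import statistics


def qual_summary(lines):
    """Return (min, median, max) of QUAL over records whose FILTER is PASS.

    `lines` is any iterable of VCF text lines (e.g. an open file handle).
    """
    quals = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # skip malformed or empty lines
        qual, filt = fields[5], fields[6]
        if filt != "PASS" or qual == ".":
            continue  # keep only PASS records with a QUAL value
        quals.append(float(qual))
    if not quals:
        raise ValueError("no PASS records with a QUAL value")
    return min(quals), statistics.median(quals), max(quals)


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (assumed helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")
```

With a (hypothetical) file name, this could be called as `with open_vcf("calls.vcf.gz") as fh: print(qual_summary(fh))`.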
We note that the concordance between genotypes from our targeted sequencing and known genotypes in the same samples is >99%, which seems satisfactory.
Thanks in advance.