Too-high QUAL scores in HaplotypeCaller GVCF

Hello @Geraldine,
I'm getting some suspiciously high QUAL scores in my VCF. The whole file is full of odd scores in the tens of thousands:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
chrM 73 . G A 27120.46 . BaseQRankSum=-8.220e-01;ClippingRankSum=1.02;DP=1063;MLEAC=6;MLEAF=0.600;MQ=59.91;MQ0=0;MQRankSum=0.120;QD=32.10;ReadPosRankSum=-3.840e-01 GT:AD:DP:GQ:PL 1/1:1,332:333:99:10644,962,0 1/1:1,315:316:99:10159,913,0 0/0:118,0:118:99:0,120,1800 1/1:0,196:196:99:6366,586,0 0/0:95,0:95:99:0,120,1800
chrM 146 rs72619361 T C 16226.15 . DB;DP=1315;MLEAC=2;MLEAF=0.200;MQ=57.18;MQ0=0;QD=34.24 GT:AD:DP:GQ:PL 0/0:338,0:338:99:0,120,1800 0/0:324,0:324:99:0,120,1800 0/0:118,0:118:99:0,120,1800 0/0:175,0:175:99:0,120,1800 1/1:0,359:359:99:16268,1102,0
chrM 150 . T C 52405.15 . BaseQRankSum=0.310;ClippingRankSum=0.695;DP=1523;MLEAC=8;MLEAF=0.800;MQ=60.00;MQ0=0;MQRankSum=-4.940e-01;QD=30.63;ReadPosRankSum=-1.364e+00 GT:AD:DP:GQ:PL 1/1:0,421:421:99:14544,1264,0 1/1:0,415:415:99:14206,1244,0 0/0:118,0:118:99:0,120,1800 1/1:1,206:207:99:7027,586,0 1/1:0,356:356:99:16670,1132,0
chrM 152 rs117135796 T C 16299.15 . DB;DP=1495;MLEAC=2;MLEAF=0.200;MQ=57.18;MQ0=0;QD=29.09 GT:AD:DP:GQ:PL 0/0:409,0:409:99:0,120,1800 0/0:411,0:411:99:0,120,1800 0/0:118,0:118:99:0,120,1800 0/0:204,0:204:99:0,120,1800 1/1:0,352:352:99:16341,1102,0
chrM 194 . C T 12039.15 . BaseQRankSum=-1.325e+00;ClippingRankSum=-4.920e-01;DP=1754;MLEAC=2;MLEAF=0.200;MQ=60.00;MQ0=0;MQRankSum=-1.597e+00;QD=32.89;ReadPosRankSum=1.11 GT:AD:DP:GQ:PL 0/0:409,0:409:99:0,120,1800 0/0:411,0:411:99:0,120,1800 1/1:6,360:366:99:12081,883,0 0/0:204,0:204:99:0,120,1800 0/0:361,0:361:99:0,120,1800
It's just a 4x WGS file, nothing fancy.
Any idea of why this might be?
Thanks,
Answers
Because you are looking at MT SNPs. MT coverage is going to be much higher than the rest of the genome (your DP values there are > 1000).
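To put numbers on the depth point, here is a minimal Python sketch that sums the per-sample DP values for the chrM 194 record posted in the question and divides QUAL by the depth of the non-reference samples. The helper name `depth_summary` is made up for illustration, and the parsing assumes whitespace-separated columns exactly as pasted in this thread, not a general VCF parser.

```python
# Record copied from the question (chrM 194); columns are whitespace-separated here.
record = (
    "chrM 194 . C T 12039.15 . "
    "BaseQRankSum=-1.325e+00;ClippingRankSum=-4.920e-01;DP=1754;MLEAC=2;"
    "MLEAF=0.200;MQ=60.00;MQ0=0;MQRankSum=-1.597e+00;QD=32.89;ReadPosRankSum=1.11 "
    "GT:AD:DP:GQ:PL "
    "0/0:409,0:409:99:0,120,1800 0/0:411,0:411:99:0,120,1800 "
    "1/1:6,360:366:99:12081,883,0 0/0:204,0:204:99:0,120,1800 "
    "0/0:361,0:361:99:0,120,1800"
)


def depth_summary(line):
    """Return (QUAL, summed sample DP, summed DP of non-hom-ref samples)."""
    fields = line.split()
    qual = float(fields[5])
    format_keys = fields[8].split(":")
    # Pair each FORMAT key with the matching value for every sample column.
    samples = [dict(zip(format_keys, s.split(":"))) for s in fields[9:]]
    total_depth = sum(int(s["DP"]) for s in samples)
    variant_depth = sum(
        int(s["DP"]) for s in samples if s["GT"] not in ("0/0", "./.")
    )
    return qual, total_depth, variant_depth


qual, total_depth, variant_depth = depth_summary(record)
print("summed sample depth:", total_depth)
print("QUAL per read in variant samples:", round(qual / variant_depth, 2))
```

For this record the summed sample depth is 1751, against roughly 20 reads expected from 4x coverage times 5 samples elsewhere in the genome. QUAL divided by the variant-sample depth comes out at 32.89, which matches the QD reported for this particular record, so the large QUAL is in line with the large depth. That is an observation about this one record, not a claim about how QD is computed in general.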
@Kurt,
That's right, the hg19 genomes have chrM at the front. My chr1 records, with QUALs in the 100s, look pretty normal. Do you agree?
I ask because I'm doing some detective work on this dataset. It seems to have a false-positive bias across the whole SNP set, and I'm trying to pinpoint what might be causing it.
EDIT: I should qualify that the bias is post-filtration. The actual file looks like this, and I apply the usual best-practices filter:
Well, I'm certainly going to defer on this one (definitely not the best person to ask), but with 4x coverage times 5 samples and the given allele frequencies, QUALs in the 100s seem reasonable to me.
Yeah, looks reasonable to me. You can try plotting the distributions of the various annotations, see if anything looks suspicious. If this data went through VQSR, see if the recalibration plots look reasonable as well.
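As a starting point for looking at annotation distributions, here is a rough Python sketch that collects the numeric INFO annotations per key and reports min, median and max. The function names `collect_annotations` and `summarize` are hypothetical, the three records are trimmed copies of lines from the question (first eight columns only), and the parsing assumes whitespace-separated columns as pasted in this thread.

```python
import statistics
from collections import defaultdict


def collect_annotations(vcf_lines):
    """Gather numeric INFO values per annotation key across records."""
    values = defaultdict(list)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        info = line.split()[7]
        for entry in info.split(";"):
            key, sep, raw = entry.partition("=")
            if not sep:
                continue  # flag annotation such as DB, no value to collect
            try:
                values[key].append(float(raw))
            except ValueError:
                continue  # non-numeric or multi-valued annotation
    return values


def summarize(values):
    """Return {key: (min, median, max)} for each annotation."""
    return {k: (min(v), statistics.median(v), max(v)) for k, v in values.items()}


records = [
    "chrM 146 rs72619361 T C 16226.15 . "
    "DB;DP=1315;MLEAC=2;MLEAF=0.200;MQ=57.18;MQ0=0;QD=34.24",
    "chrM 152 rs117135796 T C 16299.15 . "
    "DB;DP=1495;MLEAC=2;MLEAF=0.200;MQ=57.18;MQ0=0;QD=29.09",
    "chrM 194 . C T 12039.15 . "
    "BaseQRankSum=-1.325e+00;ClippingRankSum=-4.920e-01;DP=1754;MLEAC=2;"
    "MLEAF=0.200;MQ=60.00;MQ0=0;MQRankSum=-1.597e+00;QD=32.89;ReadPosRankSum=1.11",
]

summary = summarize(collect_annotations(records))
for key in sorted(summary):
    print(key, summary[key])
```

The per-key lists returned by `collect_annotations` can be passed to whatever plotting tool you prefer to draw histograms. Running it separately on chrM and on the autosomes should make the depth difference obvious.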
@Geraldine_VdAuwera said:
Is it possible that this VariantFiltration warning flag might be a useful hint? I don't think I have any FS annotation anywhere in the set. VariantAnnotator doesn't seem to know what the FS field is, even though it's recommended in the Best Practices.
Do you mean VariantAnnotator is erroring out when you try to get it to annotate FS? If so, that's because you have to give it the full name, i.e. FisherStrand.
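To confirm whether FS (the INFO key written by the FisherStrand annotation) really is absent from the set, something like the following Python sketch can count the records that lack a given INFO key. `count_missing` is a hypothetical helper, the two records are trimmed copies of lines from the question, and the parsing again assumes whitespace-separated columns as pasted here.

```python
def count_missing(vcf_lines, key):
    """Return (records lacking `key` in INFO, total records examined)."""
    missing = 0
    total = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        total += 1
        info = line.split()[7]
        # Keep only the part before "=" so flags and key=value pairs both work.
        present = {entry.partition("=")[0] for entry in info.split(";")}
        if key not in present:
            missing += 1
    return missing, total


records = [
    "chrM 146 rs72619361 T C 16226.15 . "
    "DB;DP=1315;MLEAC=2;MLEAF=0.200;MQ=57.18;MQ0=0;QD=34.24",
    "chrM 152 rs117135796 T C 16299.15 . "
    "DB;DP=1495;MLEAC=2;MLEAF=0.200;MQ=57.18;MQ0=0;QD=29.09",
]

missing, total = count_missing(records, "FS")
print(f"{missing} of {total} records have no FS annotation")
```

If every record comes back without FS, a filter expression that refers to FS has nothing to evaluate, which would fit the kind of VariantFiltration warning described above.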