
Why did VariantFiltration not filter my SNPs according to the JEXL expression I gave?

nkobmoo (Paris, Member)

Hi again, GATK team!

I have been trying to hard-filter my SNPs. I want to filter out all SNPs with depth below 300 or above 2000, as well as all SNPs with any of the following features: QD < 10, MQ < 20, FS > 20, SOR > 4, MQRankSum < -5 or ReadPosRankSum < -5.
Here is my command:

java -Xmx4g -jar ~/path/to/GenomeAnalysisTK.jar \
-T VariantFiltration \
-R ~/path/to/reference \
-V NK_jointSNP.vcf \
--filterExpression "DP < 300 || DP > 2000 || QD < 10 || MQ < 20 || FS < 20 || MQRankSum < -5 || ReadPosRankSum < -5 || SOR > 4" \
--filterName "snp_filter_1st_round" \
-o filtered_snp.vcf &

Besides the "undefined variable QD" error messages, SNPs that should have been filtered out seem to get through: they are tagged "PASS" in the FILTER field.

For example,

PASS AC=3;AF=0.188;AN=16;BaseQRankSum=-2.800e-01;ClippingRankSum=0.742;DP=372;ExcessHet=3.0103;FS=5.152;MLEAC=3;MLEAF=0.188;MQ=18.78;MQRankSum=0.742;QD=33.91;ReadPosRankSum=1.40;SOR=3.321
PASS AC=4,4;AF=0.200,0.200;AN=20;BaseQRankSum=0.536;ClippingRankSum=0.742;DP=364;ExcessHet=3.0103;FS=0.000;MLEAC=4,4;MLEAF=0.200,0.200;MQ=19.80;MQRankSum=0.949;QD=30.96;ReadPosRankSum=0.124;SOR=1.493
PASS AC=1;AF=0.036;AN=28;DP=697;ExcessHet=3.4242;FS=0.000;MLEAC=1;MLEAF=0.036;MQ=12.76;QD=33.28;SOR=3.258
PASS AC=2;AF=0.080;AN=25;DP=727;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=0.080;MQ=7.42;QD=30.73;SOR=3.056
PASS AC=2;AF=0.077;AN=26;DP=727;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=0.077;MQ=7.44;QD=24.88;SOR=3.056
PASS AC=1;AF=0.032;AN=31;DP=768;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.032;MQ=11.33;QD=32.28;SOR=1.179
PASS AC=1;AF=0.040;AN=25;BaseQRankSum=-1.068e+00;ClippingRankSum=1.46;DP=606;ExcessHet=3.0103;FS=3.074;MLEAC=1;MLEAF=0.040;MQ=46.00;MQRankSum=-2.892e+00;QD=0.57;ReadPosRankSum=-1.824e+00;SOR=0.670
PASS AC=4;AF=0.250;AN=16;BaseQRankSum=0.464;ClippingRankSum=1.96;DP=441;ExcessHet=3.0103;FS=4.223;MLEAC=4;MLEAF=0.250;MQ=34.18;MQRankSum=-3.640e+00;QD=9.86;ReadPosRankSum=2.25;SOR=0.359
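
To make it easier to see which threshold each record violates, here is a minimal Python sketch that checks one INFO string against the criteria listed at the top of this post, one criterion at a time, and also reports which annotations are absent from the record. It is only an illustration, not how GATK evaluates JEXL; the helper names `parse_info` and `check` are hypothetical, and the FS threshold follows the prose (FS > 20) rather than the command line.

```python
import operator

# (label, annotation, comparison, threshold), copied from the prose description.
# Assumption: FS > 20 follows the prose, not the "FS < 20" in the command.
CRITERIA = [
    ("DP < 300", "DP", operator.lt, 300),
    ("DP > 2000", "DP", operator.gt, 2000),
    ("QD < 10", "QD", operator.lt, 10),
    ("MQ < 20", "MQ", operator.lt, 20),
    ("FS > 20", "FS", operator.gt, 20),
    ("SOR > 4", "SOR", operator.gt, 4),
    ("MQRankSum < -5", "MQRankSum", operator.lt, -5),
    ("ReadPosRankSum < -5", "ReadPosRankSum", operator.lt, -5),
]


def parse_info(info):
    """Turn 'KEY=VALUE;KEY=VALUE' into a dict of floats (single-valued keys only)."""
    annotations = {}
    for field in info.split(";"):
        key, sep, value = field.partition("=")
        if not sep:
            continue  # flag-style entry without a value
        try:
            annotations[key] = float(value)
        except ValueError:
            pass  # multi-valued entries such as AC=4,4 are skipped
    return annotations


def check(info):
    """Return (violated criteria, annotations missing from the record)."""
    ann = parse_info(info)
    failed, missing = [], []
    for label, name, compare, threshold in CRITERIA:
        if name not in ann:
            if name not in missing:
                missing.append(name)
        elif compare(ann[name], threshold):
            failed.append(label)
    return failed, missing


# Third example record from the list above.
record = ("AC=1;AF=0.036;AN=28;DP=697;ExcessHet=3.4242;FS=0.000;"
          "MLEAC=1;MLEAF=0.036;MQ=12.76;QD=33.28;SOR=3.258")
failed, missing = check(record)
print("violates:", failed)
print("missing:", missing)
```

For this record the sketch reports a violation of MQ < 20 and shows that MQRankSum and ReadPosRankSum are not present at all, which may be related to the "undefined variable" messages.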

As you can see, SNPs whose values violate the thresholds are still there. It also seems that, when this happens, SNPs that satisfy the MQ threshold violate the QD threshold and vice versa. The other annotations do not seem to be affected.

Am I making a mistake in my JEXL expression? I used "&&" because I want my SNPs to meet all the criteria, not just at least one of them, and I used "||" between the two DP expressions because a SNP cannot have a depth of both <300 and >2000 at the same time.
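
Since VariantFiltration tags the records that match the expression, the choice between "||" and "&&" changes which records get tagged. The small Python sketch below illustrates the difference on the annotation values of the seventh example record (QD=0.57). It mirrors only the boolean logic, not GATK's own JEXL evaluation, and again assumes FS > 20 as in the prose.

```python
# Annotation values of the seventh example record (QD=0.57).
ann = {
    "DP": 606, "QD": 0.57, "MQ": 46.00, "FS": 3.074,
    "SOR": 0.670, "MQRankSum": -2.892, "ReadPosRankSum": -1.824,
}

# One boolean per "bad record" test; the two DP bounds are grouped with "or".
tests = [
    ann["DP"] < 300 or ann["DP"] > 2000,
    ann["QD"] < 10,
    ann["MQ"] < 20,
    ann["FS"] > 20,  # assumption: threshold taken from the prose
    ann["SOR"] > 4,
    ann["MQRankSum"] < -5,
    ann["ReadPosRankSum"] < -5,
]

tagged_with_or = any(tests)   # tests joined with "||": one violation is enough
tagged_with_and = all(tests)  # tests joined with "&&": every test must be true

print("tagged when joined with ||:", tagged_with_or)
print("tagged when joined with &&:", tagged_with_and)
```

In this sketch the record is tagged only in the "||" case, because low QD is its sole violation; joining the tests with "&&" would tag it only if every one of them were true at once.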

I would like to have your input.

Thank you very much in advance. You guys are doing a great job supporting our community!

Best regards,
Noppol.
