VariantFiltration bug with filtering QualByDepth

Hi folks:
My version info:
/home/sn/software/java/bin/java -jar /home/sn/software/GATK/GenomeAnalysisTK.jar --version
3.8-0-ge9d806836

I've been using the GATK HaplotypeCaller -> GenotypeGVCFs -> VariantAnnotator -> SelectVariants -> VariantFiltration workflow for variant calling and annotation, as the Best Practices suggest. At GenotypeGVCFs and VariantAnnotator I tried to keep all the annotations that could be used for filtering, such as QD, FS, etc., and at SelectVariants I separated the SNPs and InDels. Finally I ran VariantFiltration on both, and that is where something strange happens.

For SNPs:
/home/sn/software/java/bin/java -jar /home/sn/software/GATK/GenomeAnalysisTK.jar \
    -T VariantFiltration \
    -R /home/yangjy/16T/GEN0ME/test/db/GCF_000006825.1_ASM682v1_genomic.fna \
    -V /home/yangjy/16T/GEN0ME/test/_tmp_combined_calling/raw_snp.vcf \
    --filterName filter_snp \
    --filterExpression "QD < 2.0" \
    -jdk_deflater -jdk_inflater \
    -o /home/yangjy/16T/GEN0ME/test/_tmp_combined_calling/filtered_snp.vcf \
    2>> /home/yangjy/16T/GEN0ME/test/_tmp_combined_calling/_log_call_30091258.txt
And I got a warning like this:
WARN 09:13:43,288 Interpreter - ![0,2]: 'QD < 2.0;' undefined variable QD

Yet when I check the output VCF file, I can see that the filter has still been applied correctly:
##FILTER=<ID=filter_snp,Description="QD < 2.0">

One-line example:
NC_002663.1 86042 . C T 417.87 filter_snp ABHom=0.667;AC=2;AF=0.667;AN=3;BaseQRankSum=-3.530e-01;ClippingRankSum=0.00;DP=469;FS=3.973;GC=44.55;GQ_MEAN=81.00;GQ_STDDEV=31.18;HRun=0;MLEAC=2;MLEAF=0.667;MQ=35.84;MQRankSum=-1.087e+01;NCC=0;OND=0.333;QD=1.34;ReadPosRankSum=-1.397e+00;SNPEFF_AMINO_ACID_CHANGE=Y3757;SNPEFF_CODON_CHANGE=taC/taT;SNPEFF_EFFECT=SYNONYMOUS_CODING;SNPEFF_EXON_ID=1;SNPEFF_FUNCTIONAL_CLASS=SILENT;SNPEFF_GENE_BIOTYPE=protein_coding;SNPEFF_GENE_NAME=PM_RS00295;SNPEFF_IMPACT=LOW;SNPEFF_TRANSCRIPT_ID=TRANSCRIPT_gene61;SOR=0.505;Samples=GX-PM-1,GX-PM-3;VariantType=SNP GT:AD:DP:GQ:PL 1:79,85:164:99:405,0 0:158,0:158:99:0,294 1:77,70:147:45:45,0

Notice that the FILTER field is "filter_snp" and QD = 1.34, so the record was filtered as expected. It also seems that for InDels the QD filter works correctly with no warnings. So I think it's a harmless bug.

I didn't check whether this has already been reported or fixed; if so, please let me know. Thanks for reading my post, and for the great, useful tool set you have been making. I've learned a lot.

Best,
Su Na

Answers

  • @Sheila said:
    @Su_Na
    Hi Su,

    Sorry for the delay. Yes, this warning is just to let you know some sites do not have the QD annotation. You can ignore it safely.

    -Sheila

    Yeah, just this morning I found that only one out of my 12k SNP calls had a QD of NA, and that was the key to my misunderstanding. Thanks for your answer, one less problem to go~
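
For anyone who wants to confirm which sites trigger the "undefined variable QD" warning, here is a minimal Python sketch that lists the records whose INFO column has no QD entry. It is not part of GATK: the function name `records_missing_info_key` is hypothetical, and it assumes a plain-text, tab-delimited VCF in which a missing annotation is simply absent from INFO (it does not treat a value such as `QD=.` as missing).

```python
from typing import Iterable, List, Tuple


def records_missing_info_key(
    vcf_lines: Iterable[str], key: str = "QD"
) -> List[Tuple[str, str]]:
    """Return (CHROM, POS) for each data record whose INFO column lacks `key`.

    Hypothetical helper, assuming tab-delimited VCF text lines.
    """
    missing = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # fewer than 8 columns: not a complete VCF data line
        # INFO is the 8th column: semicolon-separated KEY=VALUE pairs or bare flags.
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        if key not in info_keys:
            missing.append((fields[0], fields[1]))
    return missing
```

To run it on a file such as raw_snp.vcf, pass an open file handle, for example `records_missing_info_key(open("raw_snp.vcf"))`. If the returned list is short relative to the total number of calls, that matches the explanation above: the warning comes from the few sites without QD, and the filter is still applied to every site that has it.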
