JEXL queries in VariantFiltration can fail silently
The following JEXL filterExpression is "wrong", but VariantFiltration will process a VCF with it and produce a VCF identical to the input, i.e. it fails silently:
java -jar /pub16/davidw/GenomeAnalysisTK.jar \
    -T VariantFiltration \
    -R genome_sequences/GCF_000172575.2.fna \
    -V test_SNPs.vcf \
    --filterExpression "\"QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0\"" \
    --filterName "standard_hard_filter" \
    -o test_SNPs_bash_doublequote.vcf
I realise the filterExpression is not valid JEXL, but my concern is that GATK does not report it as such. In this case the extra quotes come from an upstream bug in a wrapper around GATK, so that part is clearly not GATK-related.
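To make the quoting problem concrete, here is a small sketch of what the shell hands over in each case. The expressions are shortened for readability, and has_wrapping_quotes is a hypothetical helper a wrapper could use as a guard; it is not part of GATK. My guess (an assumption, not checked against the GATK source) is that the quoted form reaches JEXL as a single string literal rather than as a comparison, which would explain why nothing is rejected.

```shell
# What the shell passes on for each quoting style (expressions shortened).
expr_wrapped="\"QD < 2.0 || FS > 60.0\""   # escaped inner quotes survive as literal characters
expr_plain="QD < 2.0 || FS > 60.0"         # the intended form

printf '%s\n' "$expr_wrapped"   # first and last characters are double quotes
printf '%s\n' "$expr_plain"

# Hypothetical wrapper-side guard: succeeds if the whole argument is
# enclosed in literal double quotes.
has_wrapping_quotes() {
  case "$1" in
    \"*\") return 0 ;;
    *)     return 1 ;;
  esac
}

if has_wrapping_quotes "$expr_wrapped"; then
  echo "warning: filter expression is wrapped in literal quotes" >&2
fi
```

A guard like this only catches this one failure mode; it does not validate the JEXL itself.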
My expectation would be for GATK (or the Java library it uses for JEXL) to tell the user that no filtering will be done on any VCF because the JEXL makes no sense. I think silently not filtering is not ideal behaviour from GATK, although I realise a responsible user should do some manual checks and fix their expression accordingly.
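For what it's worth, the kind of manual check I have in mind could be sketched like this. It only assumes the standard VCF layout (tab-separated, FILTER in column 7, header lines starting with #); the function names are my own and not part of any tool.

```shell
# Tally the values in the FILTER column (column 7), skipping header lines.
count_filters() {
  grep -v '^#' "$1" | cut -f7 | sort | uniq -c
}

# Succeed only if at least one record carries the given filter name.
# Multiple filters on one record are separated by ';' in VCF.
filter_was_applied() {
  grep -v '^#' "$1" | cut -f7 | tr ';' '\n' | grep -qxF "$2"
}

# Example usage against the output of the command above:
#   count_filters test_SNPs_bash_doublequote.vcf
#   filter_was_applied test_SNPs_bash_doublequote.vcf standard_hard_filter \
#     || echo "no record was tagged standard_hard_filter" >&2
```

If the filter name never shows up in the output, either no variant met the criteria or the expression was not evaluated as intended, so it is worth comparing against a run with the corrected expression.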
It could be that the inner three statements are properly parsed and some variants would be filtered. If that's the case, I'm curious how GATK would interpret the outer two (the ones with the rogue quotation marks).