JEXL queries in VariantFiltration can fail silently
The following JEXL filterExpression is "wrong", but VariantFiltration will process a VCF with it and produce a VCF identical to the input, i.e., it fails silently:
java -jar /pub16/davidw/GenomeAnalysisTK.jar \
    -T VariantFiltration \
    -R genome_sequences/GCF_000172575.2.fna \
    -V test_SNPs.vcf \
    --filterExpression "\"QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0\"" \
    --filterName "standard_hard_filter" \
    -o test_SNPs_bash_doublequote.vcf
I realise the filterExpression is not valid JEXL, but my concern is that GATK does not report it as such. In this case the extra quotes come from an upstream bug in a wrapper around GATK, so that part is clearly not GATK-related.
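To make the effect of the extra quotes concrete, here is a small sketch that uses Python's `shlex` module to approximate how a POSIX shell would split the `--filterExpression` part of the command. It only illustrates the quoting; it does not call GATK, and the variable names are made up for the example. The value that comes out still starts and ends with a literal `"` character, so a JEXL parser would presumably read the whole thing as one quoted string rather than five comparisons. That reading is an assumption and has not been checked against GATK itself.

```python
import shlex

# The --filterExpression portion of the command, exactly as typed in the shell.
typed = (
    r'--filterExpression "\"QD < 2.0 || FS > 60.0 || MQ < 40.0 || '
    r'MQRankSum < -12.5 || ReadPosRankSum < -8.0\""'
)

# shlex.split applies POSIX shell quoting rules, so the result approximates
# the two arguments the java process would receive.
flag, expression = shlex.split(typed)

# The escaped quotes survive as literal characters inside the argument.
has_literal_quotes = expression.startswith('"') and expression.endswith('"')

print(flag)
print(expression)
print(has_literal_quotes)
```

Dropping the two `\"` sequences from the typed command gives an argument without the surrounding quote characters, which is the form the wrapper should have produced.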
My expectation would be for GATK (or the Java library it uses for JEXL) to tell the user that no filtering will be done on any VCF because the JEXL makes no sense. I think silently not filtering is not ideal behaviour from GATK, although I realise a responsible user should do some manual checks and fix their expression accordingly.
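As one possible manual check, the sketch below compares how often each FILTER value occurs in the input and output VCFs. If the tallies are identical, the filter may not have been applied at all. It assumes uncompressed, tab-delimited VCF text with FILTER as the seventh column; the helper names `filter_counts` and `looks_unfiltered` and the example records are hypothetical, not part of GATK.

```python
from collections import Counter


def filter_counts(vcf_lines):
    """Tally the FILTER column (7th field) over all data lines of a VCF."""
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # FILTER can hold several names separated by ';'.
        for name in fields[6].split(";"):
            counts[name] += 1
    return counts


def looks_unfiltered(input_lines, output_lines):
    """True when the output has exactly the same FILTER tallies as the input."""
    return filter_counts(input_lines) == filter_counts(output_lines)


# Made-up records, for illustration only.
before = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\t.\tQD=1.5\n",
    "chr1\t200\t.\tC\tT\t60\t.\tQD=20.0\n",
]
after_filtered = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tstandard_hard_filter\tQD=1.5\n",
    "chr1\t200\t.\tC\tT\t60\tPASS\tQD=20.0\n",
]
after_silent = list(before)  # output identical to input

print(looks_unfiltered(before, after_filtered))
print(looks_unfiltered(before, after_silent))
```

With real files, the same functions can be given open file handles, for example the handles for test_SNPs.vcf and test_SNPs_bash_doublequote.vcf. Identical tallies do not prove the expression was ignored, since a valid expression can legitimately match nothing, but they are a cheap signal that the run deserves a closer look.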
It could be that the inner three statements are parsed properly and some variants would be filtered. If that's the case, I'm curious how GATK would interpret the outer two (with the rogue quotation marks).