SelectVariants from FILTER column GATK4
I have a VCF in which each site is annotated with one of several VQSR tranches, labelled in the FILTER column. I'm trying to use SelectVariants to pick sites from particular tranches, but I can't get it to work. I'm using GATKv184.108.40.206.
Example VCF before filtering:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT ANN0801 ANN0802 ANN0803 ANN0804 ANN0805 ANN0
HanXRQChr17 774 . G A 65.46 VQSRTrancheSNP99.00to100.00 AC=1;AF=3.704e-03;AN=270;Bas
HanXRQChr17 812 . G A 5839.94 VQSRTrancheSNP50.00to70.00 AC=53;AF=0.212;AN=250;BaseQR
HanXRQChr17 823 . A G 5378.42 VQSRTrancheSNP90.00to99.00 AC=49;AF=0.201;AN=244;BaseQR
HanXRQChr17 830 . C T 648.72 VQSRTrancheSNP99.00to100.00 AC=5;AF=0.021;AN=240;BaseQRa
HanXRQChr17 845 . T C 345.33 VQSRTrancheSNP99.00to100.00 AC=4;AF=0.020;AN=204;DP=428;
HanXRQChr17 852 . CA C 3030.40 PASS AC=38;AF=0.173;AN=220;BaseQRankSum=1.01;ClippingRank
HanXRQChr17 866 . T G 89.60 VQSRTrancheSNP99.00to100.00
Code I've tried:
gatk SelectVariants --variant tmp.vcf.gz -O tmp.tranche100.vcf.gz -select "FILTER == VQSRTrancheSNP99.00to100.00"
This crashes gatk with a JEXL error.
gatk SelectVariants --variant tmp.vcf.gz -O tmp.tranche100.vcf.gz -select "FILTER == 'VQSRTrancheSNP99.00to100.00'"
This runs, but produces a VCF file with zero sites.
Ultimately I'd like to select several different values from the FILTER column using '||', but for now I'm just trying to get it to select one.
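For reference, the selection being asked for can be sketched outside of GATK with plain awk. This is only a rough illustration, not a GATK feature: `select_by_filter` and the output name `tmp.selected.vcf` are hypothetical, and the sketch assumes a tab-delimited VCF in which each record carries a single FILTER value (records with several semicolon-separated filters would not match).

```shell
# Hypothetical helper (not part of GATK): keep header lines plus records whose
# FILTER column (field 7) exactly matches one of the requested labels.
select_by_filter() {
    # $1: comma-separated list of FILTER values to keep; VCF text is read from stdin
    awk -F'\t' -v keep="$1" '
        BEGIN {
            # Build a lookup table of the wanted FILTER values
            n = split(keep, labels, ",")
            for (i = 1; i <= n; i++) want[labels[i]] = 1
        }
        /^#/ { print; next }      # pass header lines through unchanged
        ($7 in want) { print }    # exact match on the whole FILTER field
    '
}

# Usage (assumed file names): select two tranches at once
# gzip -dc tmp.vcf.gz | select_by_filter "VQSRTrancheSNP99.00to100.00,VQSRTrancheSNP90.00to99.00" > tmp.selected.vcf
```

Passing several labels in the comma-separated list plays the role of the '||' combination. Within GATK itself, JEXL expressions may be able to call methods on the record object, so something along the lines of `vc.getFilters().contains('VQSRTrancheSNP99.00to100.00')` might be worth trying; that is an assumption and has not been checked against this GATK version.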