Bug in output VariantFiltration with --genotypeFilterExpression

When using VariantFiltration with genotype filters, the FORMAT header line written for the genotype filter is wrong:
##FORMAT=<ID=FT,Number=1,Type=String,Description="Genotype-level filter">
It should have "Number=." like so:
##FORMAT=<ID=FT,Number=.,Type=String,Description="Genotype-level filter">
This is because any genotype can have zero or more filters associated with it. From the VCFv4.2 doc: "If the number of possible values varies, is unknown, or is unbounded, then this value should be ‘.’"
It's a small issue; I noticed it when pyvcf didn't parse the genotype filters into a list. On the plus side, it should be a one-character bugfix, which is always nice.
To replicate, I used this command to trigger two filters on one genotype. This occurs in both gatk3.6 and gatk3.5.
gatk -T VariantFiltration \
-R $ref \
--variant $vcf_file \
--filterExpression "QUAL<1000.0" \
--filterName "Q1000" \
--genotypeFilterExpression "DP<20" \
--genotypeFilterName "DP20" \
--genotypeFilterExpression "DP<20" \
--genotypeFilterName "DP20_v2" \
-o $output \
--setFilteredGtToNocall
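For anyone who wants to check their own output, here is a minimal shell sketch. It assumes $output is an uncompressed VCF; the function names ft_number and ft_values are made up for illustration and are not part of GATK. The first prints the Number field of the FT header line, the second prints the raw FT value of each sample so you can see which genotypes carry more than one filter.

```shell
# Print the Number field of the FT FORMAT header line (for example 1 or .).
# Hypothetical helper; expects an uncompressed VCF path as the first argument.
ft_number() {
    grep -m1 '^##FORMAT=<ID=FT,' "$1" | sed -E 's/.*Number=([^,]+),.*/\1/'
}

# Print CHROM, POS, sample position and the raw FT value for every sample.
# Hypothetical helper; samples are labelled by column order, not by name.
ft_values() {
    awk -F'\t' '
        /^#/ { next }                        # skip header lines
        {
            n = split($9, keys, ":")         # FORMAT column lists the per-sample keys
            idx = 0
            for (i = 1; i <= n; i++) if (keys[i] == "FT") idx = i
            if (idx == 0) next               # record has no FT key
            for (s = 10; s <= NF; s++) {
                split($s, vals, ":")         # per-sample values, same order as keys
                print $1 "\t" $2 "\tsample" (s - 9) "\t" vals[idx]
            }
        }
    ' "$1"
}
```

Running ft_number "$output" on a file showing the problem described above should print 1; on a build with the fix it should print a single dot.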
Best Answer
Geraldine_VdAuwera Cambridge, MA admin
Oh, good catch -- I think you're right. We'll put in a ticket to get this fixed.
Answers
Awesome, thanks a lot!
Hi @Redmar_van_den_Berg, this has been fixed in htsjdk and the version of htsjdk was revved in GATK, so the latest nightlies have the fix (since Sept 20 actually).
Hi @Geraldine_VdAuwera, thanks for getting back to me.
I've tested the latest nightly build, but I think something went wrong: the FT annotation still shows "Number=1".
Nightly build:
Snippet from the filtered VCF file; the ##GATKCommandLine.VariantFiltration line shows the nightly build (I used a VCF file generated with stable gatk).
@Geraldine_VdAuwera
The unfiltered vcf file did not have an FT annotation. I can see that the nightly build leaves existing modified annotations alone though, while gatk3.6 overwrites them, so that change has landed in the nightly build.
@Redmar_van_den_Berg
Hi,
Hmm. I just tested the latest nightly build, and I get the correct output. However, when I test the latest stable version (3.6), I get the "buggy" output. Can you post the exact command you ran? Also, can you please try the latest nightly build from last night? That is the one I am using and I can see a difference in the outputs.
Thanks,
Sheila
@Sheila I don't know what's going on...
Excerpt from .bash_history that shows the exact commands I used
First, I checked that the $vcf_file did not contain an FT annotation
Then I show the nightly build:
Then I run the filter command, and check the $output vcf file using vim:
@Redmar_van_den_Berg
Hi,
Can you please post the before and after VCF header line?
Thanks
Sheila
@Sheila
Hi,
What do you mean by the before and after VCF header line? The FT header gets added during the variant filtration step, so it is not present in the unfiltered VCF file.
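However, when it gets added it shows
##FORMAT=<ID=FT,Number=1,Type=String,Description="Genotype-level filter">
Which should be
##FORMAT=<ID=FT,Number=.,Type=String,Description="Genotype-level filter">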
We just wanted to be extra sure that the FT line wasn't already in there from a previous operation. Meanwhile I tried this myself on an unfiltered vcf and get the same result you do. There's clearly something unexpected going on so we'll check everything and get back to you.
OK, we've determined that there is a problem with our build system that is causing a stale version to be packaged as a nightly build. We didn't catch this because we do our fix-testing against a post-merge build, which normally should be equivalent to using the latest nightly -- except right now it's not. Engineers are looking into the problem now; in the meantime if you need the fix you can compile from source to get the correct latest build.