We've moved!
This site is now read-only. You can find our new documentation site and support forum for posting questions here.
Be sure to read our welcome blog!

Can't SelectVariants based on sample's GQ

averydavisaverydavis BostonMember, Broadie

I have consistently failed to SelectVariants from a gvcf containing only one sample with -select "GQ > 60.0" (or other variations, such as -select "GQ > 60"): the output vcf always ends up with a header but no entries, though there are many, many entries that meet this threshold. I've come up with a workaround (VariantsToTable, filtering with awk, then SelectVariands --keepIDs ), but it's memory inefficient and longterm I would like to be able to select sites (especially homozygous reference sites) based on GQ. This article about JEXL queries suggests what I've done should work fine (see section 4), but this forum post shows that others have had this problem a couple years ago.

I have been able to get SelectVariants to pull out homozygous reference calls by including -sn samplename -select 'vc.getGenotype("samplename").isHomRef()', but whenever I include any -select query based on GQ, the resulting file is blank (when this is included alone or in conjunction with other calls, including .isHomRef()).

Is there another trick I should try? Has anyone had success filtering on a sample's GQ? Other suggestions for pulling out high-quality homozygous reference sites from a gvcf containing data from just one sample?

Other information/detail that may or may not be relevant:
My gvcf is BP_RESOLUTION, not in block format.
I get no error messages from SelectVariants, just blank output.
No standard SelectVariants options give me trouble, just genotype-level filters (and not all of them - as I mentioned .isHomRef() works fine).

All thoughts welcome - thanks!



Sign In or Register to comment.