
Filtering VCF passed to --knownSites or --known on-the-fly?

PeteHaitch Posts: 19 Member

My current workflow for analysing mouse exome-sequencing data (based on v4 of the Best Practices) can require me to use slightly different VCFs as the --knownSites or --known arguments in BQSR, indel realignment, etc. Basically, I have a "master" VCF that I subset using SelectVariants. The choice of subset depends largely on the strain of the mice being sequenced, but also on other things such as AF. It'd be great to be able to do this on-the-fly in conjunction with --known or --knownSites in the tools that require known sites, rather than having to create project-specific (or even tool-specific) VCFs.

Is there a way to do this that I've overlooked? Is this a feature that might be added to GATK?
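For readers following along, the two-step workaround described above (subset the master VCF with SelectVariants, then pass the subset as known sites) might look roughly like the sketch below. It only builds and prints the command lines; it does not run GATK. The argument names follow GATK 2.x/3.x-style usage (-T, -R, -V, -sn, -env, -select, -knownSites, -o) and should be checked against your GATK version. The jar path, all file names, the strain name and the "AF > 0.1" threshold are hypothetical placeholders, not values from this thread.

```python
import shlex

# Hypothetical path to the GATK jar; adjust for your installation.
GATK_JAR = "GenomeAnalysisTK.jar"


def select_variants_cmd(reference, master_vcf, out_vcf,
                        select_expr=None, sample=None):
    """Build a SelectVariants command that subsets the master VCF."""
    cmd = ["java", "-jar", GATK_JAR, "-T", "SelectVariants",
           "-R", reference, "-V", master_vcf, "-o", out_vcf]
    if sample is not None:
        # Keep one strain (sample) and drop sites that are
        # non-variant in that strain after subsetting.
        cmd += ["-sn", sample, "-env"]
    if select_expr is not None:
        # JEXL expression evaluated against each record.
        cmd += ["-select", select_expr]
    return cmd


def bqsr_cmd(reference, bam, known_sites_vcf, out_table):
    """Build a BaseRecalibrator command that uses the subset VCF."""
    return ["java", "-jar", GATK_JAR, "-T", "BaseRecalibrator",
            "-R", reference, "-I", bam,
            "-knownSites", known_sites_vcf, "-o", out_table]


if __name__ == "__main__":
    subset = "known_sites.subset.vcf"  # hypothetical project-specific VCF
    commands = [
        select_variants_cmd("mm10.fasta", "master.vcf", subset,
                            select_expr="AF > 0.1", sample="STRAIN_A"),
        bqsr_cmd("mm10.fasta", "sample.bam", subset, "recal.table"),
    ]
    for cmd in commands:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The intermediate file (known_sites.subset.vcf here) is exactly the project-specific VCF that on-the-fly filtering would make unnecessary.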


Best Answer

  • Geraldine_VdAuwera Posts: 2,239 admin
    Answer ✓

    Hi Pete,

    If you mean something like the -L option for intervals, but one that would select a subset of variants from within a VCF instead, then no, that's not currently a feature. If there were significant demand for such a feature we might consider it, but right now you shouldn't count on it, sorry. If you or someone else wants to implement the feature, we'd certainly be happy to look at a patch.

    Good luck!

Answers

  • PeteHaitch Posts: 19 Member

    Hi Geraldine,

    I'm thinking of options like those in VariantFiltration, e.g. --filterExpression "AB < 0.2 || MQ0 > 50". I doubt I'll have time to implement such a thing myself, but I thought I'd raise it here in case it was of interest to other GATK users. For now I'll continue to just work around it.

    Thanks.
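To make the requested behaviour concrete, here is a minimal standalone sketch of filtering VCF records by an INFO-field expression as they are read. It is not GATK code and does not use JEXL: the expression from the post above ("AB < 0.2 || MQ0 > 50") is hand-translated into a Python predicate and used here to select records, purely as an illustration. The function names (parse_info, info_number, filter_vcf_lines, example_keep) are hypothetical, and treating a missing annotation as "condition not met" is an assumption of this sketch.

```python
def parse_info(info_field):
    """Parse a VCF INFO column into a dict; flag keys map to True."""
    info = {}
    if info_field == ".":
        return info
    for item in info_field.split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def info_number(info, key):
    """Return the first value of an INFO key as a float, or None."""
    value = info.get(key)
    if value is None or value is True:
        return None
    try:
        return float(value.split(",")[0])
    except ValueError:
        return None


def filter_vcf_lines(lines, keep):
    """Yield header lines unchanged and records whose INFO passes keep()."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if keep(parse_info(fields[7])):  # column 8 of a VCF record is INFO
            yield line


def example_keep(info):
    """Python translation of: AB < 0.2 || MQ0 > 50."""
    ab = info_number(info, "AB")
    mq0 = info_number(info, "MQ0")
    return (ab is not None and ab < 0.2) or (mq0 is not None and mq0 > 50)


if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.1\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "1\t100\t.\tA\tG\t50\tPASS\tAB=0.1;MQ0=3\n",
        "1\t200\t.\tC\tT\t50\tPASS\tAB=0.5;MQ0=10\n",
        "1\t300\t.\tG\tA\t50\tPASS\tAB=0.6;MQ0=80\n",
    ]
    for kept in filter_vcf_lines(demo, example_keep):
        print(kept, end="")
```

Because filter_vcf_lines is a generator, records are filtered as they stream past, which is the "on-the-fly" behaviour being asked for; a real implementation inside GATK would sit at the point where the known-sites VCF is read.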
