SelectVariants and discordance

nikmal Posts: 23 Member

Greetings GATK team!

I hope I'm not posting a duplicate question here, but I couldn't find anything about this in the forum.

Basically, I want to use SelectVariants to filter one call set against another, but I do not want to be as strict as --discordance (i.e. a 100% discordance rate between the two call sets). For example, I want to say: "filter call set A against variants that occur in >90% of call set B".

Is there a way to do this, maybe with JEXL expressions?

Kind regards

Answers

  • nikmal Posts: 23 Member
    edited May 2013

    @Geraldine_VdAuwera said: Hi there, sorry to get to your question so late.

    I want to say that this should be possible to do with JEXL but off the top of my head I can't think of a straightforward way to do this in a single step. Maybe instead, do a first round of selecting variants that occur above your desired threshold in call set B, then filter call set A using discordance vs. that subset.

    If you come up with a good way to do this (or maybe already have), please share your solution with the community, as I'm sure others will be interested. Eventually we'd like to put together a "cookbook" of good variant selection and filtering solutions.

    Hi Geraldine,

    No problem. In fact, I already solved it the way you described. I wrote a simple Python script that counts the occurrences of each variant and keeps it according to a specified cutoff. After that, I ran SelectVariants with --discordance against the output of the script (i.e. the subset of variants that occur in at most X% of the population).

    I will make the script available as soon as I can, in case someone is interested.

    EDIT:

    Here is a link to the script, feel free to modify it as you wish: https://gist.github.com/keyoke1337/5676846

  • Geraldine_VdAuwera Posts: 5,235 Administrator, GSA Member

    Thanks for sharing your solution!

    Geraldine Van der Auwera, PhD
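
For readers who want a starting point, the counting step described in the thread could look roughly like the sketch below. This is not nikmal's script (that one is at the gist linked above); it is a minimal illustration under several assumptions: the input is an uncompressed, multi-sample VCF, GT is the first FORMAT key, and "occurrence" means the fraction of called samples that carry at least one non-reference allele. The function names carrier_fraction and select_frequent are made up for this example.

```python
def carrier_fraction(genotype_fields):
    """Fraction of called samples carrying at least one non-reference allele.

    Assumes GT is the first colon-separated subfield of each sample column.
    """
    called = 0
    carriers = 0
    for field in genotype_fields:
        gt = field.split(":", 1)[0]
        alleles = gt.replace("|", "/").split("/")
        if all(a == "." for a in alleles):
            continue  # no-call: leave this sample out of the denominator
        called += 1
        if any(a not in ("0", ".") for a in alleles):
            carriers += 1
    return carriers / called if called else 0.0


def select_frequent(vcf_lines, cutoff):
    """Yield header lines plus records whose carrier fraction exceeds cutoff."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep the header so the output is still a VCF
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # sites-only record: no genotypes to count
        if carrier_fraction(fields[9:]) > cutoff:
            yield line
```

Calling select_frequent on the lines of call set B with a cutoff of 0.9 and writing the result to a file would give the subset of B that is then passed to SelectVariants with --discordance, as described in the thread. Flipping the comparison would give the "at most X%" variant instead.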
