
Query regarding passing of VCF files from MuTect to snpEff for annotation

ParthavJailwala, Maryland, Member

I would like to use snpEff to annotate only those somatic mutations that were flagged as 'KEEP' in the judgement call column of the *callstats file generated by MuTect. I can see that these 'KEEP' calls (in the callstats file) are flagged as 'PASS' in the 'Filter' column of the corresponding VCF file.

I am not sure whether I should filter the VCF output files from MuTect (keeping only the 'PASS' calls) before snpEff annotation. snpEff reports Ti/Tv ratios in addition to functional annotation, so should I give snpEff the unfiltered VCF for an accurate Ti/Tv calculation, or is it OK to provide a filtered VCF containing only the PASS calls?
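To make the two options concrete, here is a minimal sketch, not an official MuTect or snpEff procedure. It keeps only the records whose FILTER column is PASS and counts transitions (Ts) and transversions (Tv) on both the unfiltered and the filtered call sets, so the two ratios can be compared side by side. The helper names (`keep_pass`, `ts_tv`, `_rec`) and the toy records are made up for illustration; it assumes an uncompressed, tab-delimited VCF and counts only single-base, single-allele substitutions.

```python
# Hypothetical helpers for illustration; the toy records below are not real data.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = {"A", "C", "G", "T"}


def keep_pass(vcf_lines):
    """Return header lines plus data lines whose FILTER column (7th) is PASS."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
        elif line.strip() and line.split("\t")[6] == "PASS":
            kept.append(line)
    return kept


def ts_tv(vcf_lines):
    """Count (transitions, transversions) over single-base substitutions."""
    ts = tv = 0
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alt = fields[3].upper(), fields[4].upper()
        if ref not in BASES or alt not in BASES:
            continue  # skip indels and multi-allelic records
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    return ts, tv


def _rec(chrom, pos, ref, alt, filt):
    """Build one toy VCF data line (8 fixed columns)."""
    return "\t".join([chrom, str(pos), ".", ref, alt, ".", filt, "."]) + "\n"


TOY_VCF = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    _rec("chr1", 100, "A", "G", "PASS"),
    _rec("chr1", 200, "C", "T", "PASS"),
    _rec("chr1", 300, "G", "T", "REJECT"),
    _rec("chr1", 400, "A", "C", "PASS"),
    _rec("chr1", 500, "T", "A", "REJECT"),
    _rec("chr1", 600, "AT", "A", "PASS"),  # indel, ignored by ts_tv
]

for label, lines in (("all calls", TOY_VCF), ("PASS only", keep_pass(TOY_VCF))):
    ts, tv = ts_tv(lines)
    ratio = ts / tv if tv else float("nan")
    print(f"{label}: Ts={ts} Tv={tv} Ts/Tv={ratio:.2f}")
```

On real data you would read the lines from the MuTect VCF and write the output of `keep_pass` to a new file before handing it to snpEff. Comparing the two ratios shows directly how much the filtering step shifts Ts/Tv.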



Best Answer


  • ParthavJailwala, Maryland, Member

    Thanks Sheila for your response. I will use the filtered VCF with PASSED calls as input to snpEff.


  • vivekdas_1987, Milan, Member

    @ParthavJailwala For your tumor/normal pairs, if you passed the filtered MuTect output (keeping only the KEEP, or PASS, calls) to snpEff for annotation, what Ti/Tv ratios did you get for human samples? I am getting a very low ratio, around 0.35, when I annotate only the high-confidence somatic mutations from MuTect. I would appreciate your views if you have run such an analysis.

  • ParthavJailwala, Maryland, Member

    @vivekdas_1987 As per the response from @Sheila, I filtered the raw MuTect output to keep only the PASS/KEEP calls and then ran snpEff annotation. I have run this on several tumor-normal pairs, and the Ts/Tv ratio ranges from 3.5 to 7 across them. I consider this higher than expected (and it is surprising that you report a lower-than-expected ratio). I am still not sure whether checking Ts/Tv ratios on filtered calls is a good idea, as the ratio of Ts to Tv events in the filtered high-confidence PASS calls may be biased relative to that in the set of all calls reported by MuTect.

  • vivekdas_1987, Milan, Member

    @ParthavJailwala Thanks for the quick response; that makes sense. Can I ask another, somewhat off-topic question about annotation? For exome data with about 70% of reads in the target region, have you seen that, after annotation, most mutations are non-exonic and only around 30-40% fall in exonic regions? Have you ever come across such a scenario in your tumor/normal somatic calls?

  • ParthavJailwala, Maryland, Member

    @vivekdas_1987 Just so that I understand your query better: do you filter the 'off-target' calls in one of these two ways before annotating the MuTect calls? a) Using the -L (--intervals) parameter with MuTect to make calls only in the 'on-target' regions, or b) running MuTect without any intervals and then manually filtering the 'off-target' calls out of the MuTect output. In my experience, if you filter out off-target calls before annotation, you see largely exonic mutations (45% to 50%), and if your target probes cover exons+UTR, you will see about 10-15% UPSTREAM, DOWNSTREAM and UTR calls.

  • vivekdas_1987, Milan, Member
    edited September 2014

    @ParthavJailwala Yes, I usually filter the "off-target" calls before annotating. I do it at the BQSR stage, where the -L option can be used to keep only the on-target regions for downstream mutation calling, so the BAM files I use for MuTect calls are restricted to the target sites. The target enrichment kit also covers exons + UTR, not just the exonic regions. OK, if 45-50% of the mutations lie in exonic regions, then that is fine for me. But do you think I should use the -L option again with MuTect, so that SNPs are also called only in the 'on-target' regions?
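For reference, option (b) from the discussion above (manually dropping off-target calls from the MuTect output) could be sketched roughly as follows. This is an illustration only, not what -L does internally: it assumes the capture targets are available as a BED file (0-based, half-open intervals), while VCF positions are 1-based. The function names and the toy coordinates are hypothetical.

```python
from bisect import bisect_right


def load_bed(bed_lines):
    """Parse BED lines into {chrom: sorted, merged list of (start, end)}."""
    targets = {}
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        targets.setdefault(chrom, []).append((int(start), int(end)))
    for chrom, ivs in targets.items():
        ivs.sort()
        merged = [ivs[0]]
        for s, e in ivs[1:]:
            if s <= merged[-1][1]:  # overlapping or adjacent: extend
                merged[-1] = (merged[-1][0], max(merged[-1][1], e))
            else:
                merged.append((s, e))
        targets[chrom] = merged
    return targets


def on_target(chrom, pos, targets):
    """True if the 1-based VCF position lies inside a target interval."""
    ivs = targets.get(chrom, [])
    pos0 = pos - 1  # convert to the 0-based BED coordinate system
    i = bisect_right(ivs, (pos0, float("inf"))) - 1
    return i >= 0 and ivs[i][0] <= pos0 < ivs[i][1]


def filter_on_target(vcf_lines, targets):
    """Keep header lines and data lines whose position is on target."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        if not line.strip():
            continue
        chrom, pos = line.split("\t")[:2]
        if on_target(chrom, int(pos), targets):
            kept.append(line)
    return kept


# Toy targets and calls, made up for illustration.
TOY_BED = ["chr1\t99\t150\n", "chr1\t140\t200\n", "chr2\t0\t10\n"]
TOY_CALLS = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\n",
    "chr1\t201\t.\tC\tT\t.\tPASS\t.\n",
    "chr3\t5\t.\tG\tT\t.\tPASS\t.\n",
]

targets = load_bed(TOY_BED)
for line in filter_on_target(TOY_CALLS, targets):
    print(line, end="")
```

Whether the intervals are applied during calling or afterwards, the point is the same: only calls inside the capture targets reach annotation, which is what determines the exonic fraction discussed above.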
