
Using Mutect2 on FFPE samples to call germline variants

Hello everyone,
I've followed the topics on Mutect2 and how it reduces FFPE artifacts in tumor tissue.
However, my task is a little different: I would like to use Mutect2 and its FFPE artifact removal on "normal" tissue, to assess germline variants. Is it possible to remove the artifacts using Mutect2 and subsequently extract the possible germline variants? And how can I do it?
Thanks for all the help


  • manolis Member ✭✭✭

    Hi, I'm also interested in this field!


  • Tiffany_at_Broad (Cambridge, MA) Member, Administrator, Broadie, Moderator

    Hi @manolis & @chelsif - I will follow up with the developers and get back to you.

  • chelsif Member

    Thanks Tiffany! waiting for your feedback!

  • chelsif Member

    @Tiffany_at_Broad just for info: we ran a test using the "classical" WES pipeline with HaplotypeCaller on two samples: (1) from blood and (2) from FFPE tissue. The raw results show around 30K variants in sample 1 (fresh) and around 60K in sample 2 (FFPE). OK, these are two different samples, but I do believe there are still a lot of false variants due to FFPE artifacts...
    Many thanks for any possible help

  • Tiffany_at_Broad (Cambridge, MA) Member, Administrator, Broadie, Moderator

    Hi @manolis & @chelsif

    The recommended way to use Mutect2 to filter FFPE artifacts would be to follow this tutorial section "A step-by-step guide to the new Mutect2 Read Orientation Artifacts Workflow."
    Pass your germline normal in as an input as well so that Mutect2 only calls somatic variants. You can do that with these arguments:
    -I tumor.bam \
    -tumor tumor_sample_name \
    -I normal.bam \
    -normal normal_sample_name \

    Let me know if this answers your question.

  • manolis Member ✭✭✭
    edited July 2019

    Hi @Tiffany_at_Broad and @chelsif,

    Maybe I misunderstood your question and answer, so I will try to give you more information and some pipeline inputs about my case.

    In my case I have WGS data from an FFPE healthy sample (no tumor sample or tumor/normal samples) and I'm looking for novel or very rare germline variants.

    If I use "Mutect2 tumor-only mode coupled with the Read Orientation Artifacts Workflow", without any PoN file and without a germline source, I will end up with a VCF file that includes the germline variants and the somatic variants (the somatic variants are background, considering that it isn't a tumor sample) and excludes the FFPE artifacts.

    After that I can annotate my VCF with the gnomAD database to mark all known variants (rare and common). In a third step I can exclude the gnomAD-annotated variants, so that my VCF contains only the somatic variants (background) and the novel variants.
    Finally, if I assume that a germline heterozygous variant has an AF of 40-60% and a homozygous variant an AF of >90% of the reads, I could filter out the somatic variants (background) and keep all possible germline variants.

    Could this be a way to find the germline variants in a FFPE-healthy sample?

    Do you have any suggestions about this workflow and filtering steps?
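
The allele-fraction step proposed in this post can be sketched as follows. This is only a minimal sketch, not a validated filter: it assumes a single-sample Mutect2 VCF (FORMAT in column 9, the sample in column 10) that carries a per-sample AF field, and it assumes the known gnomAD variants have already been removed. The function name keep_germline_like is hypothetical, and the 0.40-0.60 and >0.90 windows are simply the ones suggested above.

```shell
# Hypothetical helper: read a single-sample VCF on stdin and print the header
# plus records whose FORMAT/AF lies in 0.40-0.60 (het-like) or above 0.90
# (hom-like). Thresholds are the ones proposed in the post, not GATK defaults.
keep_germline_like() {
  awk -F'\t' '
    /^#/ { print; next }
    {
      n = split($9, key, ":")
      split($10, val, ":")
      for (i = 1; i <= n; i++) if (key[i] == "AF") {
        af = val[i] + 0   # multiallelic AF is comma-separated; this uses the first value
        if ((af >= 0.40 && af <= 0.60) || af > 0.90) print
        next
      }
    }'
}
```

It would be used as `keep_germline_like < novel_variants.vcf > germline_candidates.vcf`, where both file names are placeholders. Records without an AF field are dropped.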


  • chelsif Member

    Thanks @Tiffany_at_Broad , but I'm a little confused, you said:

    Pass your germline normal in as an input as well so that Mutect2 only calls somatic variants.

    but I need the germline variants..

    You can do that with these arguments:
    -I tumor.bam \
    -tumor tumor_sample_name \
    -I normal.bam \
    -normal normal_sample_name \

    I do not have a tumor sample. Do you mean I can use the second argument to pass my "normal" sample to Mutect2?
    Moreover, as @manolis suggested, if I use Mutect2 without any PoN and without a normal sample, together with the Read Orientation Artifacts Workflow, I will end up with a VCF without FFPE artifacts. But how can I keep only the germline variants from this file?
    @manolis suggested using the gnomAD database as a reference. Is that correct? Also, how do I exclude the somatic variants?
    Thanks for all the help

  • chelsif Member
    edited August 2019

    Hi @Tiffany_at_Broad,
    I was wondering if I can combine the filtered.vcf from Mutect2 with the output of HaplotypeCaller, using bcftools isec, to obtain a single VCF containing only the sites that do not come from FFPE artifacts.
    The only problem is that this has to be done for each sample separately, and it would be impossible to run HaplotypeCaller in GVCF mode. Is there any way around this problem?
    Also, would this be a correct approach?
    Another possibility that came to my mind is to use --bamout to produce a .bam file that would then be fed to HaplotypeCaller, similar to having a filtered BAM... Can this be used?
    Thanks for all the help.

  • chelsif Member

    I did some research on the internet and found a tool, SOBDetector, that seems to operate on the same basis as M2. So, could it be a good workaround?

  • davidben (Boston) Member, Broadie, Dev ✭✭✭

    @chelsif @manolis

    I would avoid SOBDetector. You can compare our documentation: https://github.com/broadinstitute/gatk/blob/master/docs/mutect/mutect.pdf with their paper to see why.

    A reasonable, albeit somewhat CPU-wasteful, approach is:

    • Run the Mutect2 pipeline with the extra argument --genotype-germline-sites.
    • Obtain (with grep or maybe SelectVariants) a vcf of all M2 outputs with the orientation filter.
    • Run the HaplotypeCaller pipeline.
    • Filter (with SelectVariants --discordance) out the M2 pipeline's orientation bias artifacts from the HaplotypeCaller output.
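
The four steps above could look roughly like this on the command line. It is a sketch under assumptions rather than a tested pipeline: it assumes a GATK 4.1-style release (where Mutect2 accepts --f1r2-tar-gz and FilterMutectCalls accepts --ob-priors), that the read-orientation filter appears as "orientation" in the FILTER column (check the ##FILTER header lines of your VCF), and that IndexFeatureFile takes -I (older 4.x releases used -F). The function names and all file names are hypothetical.

```shell
# Hypothetical helper: keep the VCF header plus every record whose FILTER
# column (column 7) lists "orientation" (assumed filter name, see above).
select_orientation_filtered() {
  awk -F'\t' '
    /^#/ { print; next }
    {
      n = split($7, f, ";")
      for (i = 1; i <= n; i++) if (f[i] == "orientation") { print; next }
    }'
}

# Hypothetical wrapper. Arguments: reference FASTA, FFPE "normal" BAM, output prefix.
ffpe_germline_workflow() {
  local ref="$1" bam="$2" out="$3"

  # Step 1: Mutect2 pipeline with read-orientation filtering.
  gatk Mutect2 -R "$ref" -I "$bam" --genotype-germline-sites true \
      --f1r2-tar-gz "$out.f1r2.tar.gz" -O "$out.m2.unfiltered.vcf" &&
  gatk LearnReadOrientationModel -I "$out.f1r2.tar.gz" -O "$out.ob-priors.tar.gz" &&
  gatk FilterMutectCalls -R "$ref" -V "$out.m2.unfiltered.vcf" \
      --ob-priors "$out.ob-priors.tar.gz" -O "$out.m2.filtered.vcf" &&

  # Step 2: collect the calls that carry the orientation filter, then index them.
  select_orientation_filtered < "$out.m2.filtered.vcf" > "$out.artifacts.vcf" &&
  gatk IndexFeatureFile -I "$out.artifacts.vcf" &&

  # Step 3: ordinary germline calling on the same BAM.
  gatk HaplotypeCaller -R "$ref" -I "$bam" -O "$out.hc.vcf.gz" &&

  # Step 4: keep HaplotypeCaller calls that are not in the artifact set.
  gatk SelectVariants -R "$ref" -V "$out.hc.vcf.gz" \
      --discordance "$out.artifacts.vcf" -O "$out.hc.no_ffpe.vcf.gz"
}
```

A call such as `ffpe_germline_workflow ref.fasta ffpe_normal.bam sample1` would leave the cleaned calls in sample1.hc.no_ffpe.vcf.gz. The steps are chained with && so a failure stops the run before later steps consume incomplete files.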

    You could also run the Mutect2 pipeline and extract any variants that either PASS or are filtered only with the germline filter. I hesitate to suggest that approach because Mutect2 is designed to maximize somatic precision, not germline sensitivity, and therefore it is likely to be quite liberal in what it calls germline. Perhaps if you used this approach along with a simple filter for sufficient allele fraction it would work.
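
This second approach can be sketched in the same way. This is an illustration under assumptions, not a recommended setting: it assumes a single-sample filtered Mutect2 VCF with a per-sample AF field, and that the germline filter is written as "germline" in the FILTER column. The function name pass_or_germline_only is hypothetical, and the minimum allele fraction is left as an argument because no value is given above.

```shell
# Hypothetical helper: read a filtered single-sample Mutect2 VCF on stdin and
# print the header plus records that are PASS or carry only the "germline"
# filter (assumed filter name), and whose FORMAT/AF is at least $1.
pass_or_germline_only() {
  awk -F'\t' -v min_af="$1" '
    /^#/ { print; next }
    $7 == "PASS" || $7 == "germline" {
      n = split($9, key, ":")
      split($10, val, ":")
      for (i = 1; i <= n; i++)
        if (key[i] == "AF" && val[i] + 0 >= min_af + 0) { print; next }
    }'
}
```

For example, `pass_or_germline_only 0.30 < m2.filtered.vcf > germline_candidates.vcf`, where 0.30 and the file names are placeholders. Records with any other filter, including "germline" combined with "orientation", are dropped.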
