We've moved!
This site is now read-only. You can find our new documentation site and support forum for posting questions here.
Be sure to read our welcome blog!

Does GATK4 Mutect2 support Force Calling?

cbaocbao Member, Broadie ✭✭

Does GATK4 Mutect2 support Force Calling? If yes, could you let me know how to do it? Thanks a lot!



Issue · Github
by shlee

Issue Number
Last Updated
Closed By

Best Answers


  • cbaocbao Member, Broadie ✭✭

    Thanks @shlee,

    Say that we have some known variants (e.g. VCF file from GATK4 mutect2) in sample A and want to force them to call mutations in sample B. The key is to determine the genotype of sample B by the intervals of these variants. In the research of cancer evolution, it is important to figure out whether the mutations in late grade samples are "true negative" or just filtered out in early grade samples. That is the main reason why I want to run force calling. I really appreciate your help!


  • rehkerrehker Member

    I think, cbao has made a very strong point for his case.
    I would use a function like that to gain an idea if variants that have been called in tumors can be seen in unrelated normals.
    To clarify, I want to get a better idea what kind of artifacts can be found in my data before I develop and apply strategies to remove them. I will definitely go ahead and tinker with the tumor-lod parameters.

    Have there been additional developments for this issue that I might have been missing lately?

  • davidbendavidben BostonMember, Broadie, Dev ✭✭✭

    @rehker Mutect2 has supported force-calling for about a year (the answer above was correct at the time it was written). Just add the argument -alleles force-called-variants.vcf

  • cbaocbao Member, Broadie ✭✭

    Thanks for your [email protected]
    From v4.0.6.0, the force-calling mode is supported by GATK4.

    Is it possible to improve this a little bit in the future version of Mutect2/X?
    I found some "redundant" variants in the force called results and want to remove them. Say I detected a DEL (chr1:2157510) in sample CA1 by Mutect2 and want to know what's going on in sample CA3 in the same site. So, I used a VCF file containing this site to run Mutect2 in force-calling mode. However, I eventrually got two differnet variants in this site across differnet samples. I know the second one is just an extented DEL in CA3 (it happens in real world). But, you know, sometimes, we just need a file in a special format to run downstream analysis, like, a MAF file only containing sites in force-called-variants.vcf. In this case, I need the sites in CA1.maf...CA8.maf are exactly same to infer the phylogenetic relationships in this cohort. It is hard to find all of the variant groups like this and then aggregate each one. It would be very helpfull if you can keep the current force-calling mode and add a new mode to fix this tiny issue when you have time.

    Hugo_Symbol Chromosome Start_position End_position Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 CA1 CA2 CA3 CA4 CA5 CA6 CA7 CA8
    SKI 1 2157510 2157510 5'Flank DEL T T - 2 4 NA 5 4 1 3 NA
    SKI 1 2157510 2157511 5'Flank DEL TT TT - NA NA 3 NA NA NA NA 2

    Not urgent. (I got the MAF files I wanted by a python program using pysam.pileup)


  • rehkerrehker Member

    @davidben said:
    @rehker Mutect2 has supported force-calling for about a year (the answer above was correct at the time it was written). Just add the argument -alleles force-called-variants.vcf

    As even listed in the most recent mutect documentation :* (I've been living under a rock..)
    Sorry, just got confused by shlees last sentence.

  • davidbendavidben BostonMember, Broadie, Dev ✭✭✭

    @cbao I see your point. I will have to think about how easily that could be implemented.

  • davidbendavidben BostonMember, Broadie, Dev ✭✭✭

    As even listed in the most recent mutect documentation :* (I've been living under a rock..)

    @rehker I have never seen Game of Thrones. That's living under a rock.

  • rehkerrehker Member

    it's me again.

    Unfortunately sometimes the forced calling does not work.

    Here is my command line:
    java -Xmx15G -jar '/media/jan/dk1/data/programs/var-call/GATK/gatk-' Mutect2 -R '/media/jan/disk1/data/resources/Agilent_HG19/hg19.fasta' -I '/media/jan/disk1/data/TMB/2018_06_01_TMB_Qia/2018_06_01_Qia_TMB/X_17_3818_S3_realigned_sorted.bam' -tumor X_17_3818_S3 -O X_17_3818_test_forced_C_15_26550.vcf --native-pair-hmm-threads 8 --initial-tumor-lod 0 --tumor-lod-to-emit 0 -alleles '/media/jan/disk1/data/TMB/2018_06_01_TMB_Qia/2018_06_01_Qia_TMB/C_15_26550_S3_filter.vcf'

    These are two variants I cannot find.

    chr1 1115945 . G A . clustered_events;panel_of_normals CONTQ=93;DP=33;ECNT=3;GERMQ=34;MBQ=0,20;MFRL=0,83;MMQ=60,60;MPOS=22;PON;POPAF=4.49;ROQ=25;SEQQ=93;STRANDQ=93;TLOD=80.70 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:0,33:0.971:33:0,0:0,31:0,0,15,18
    chr1 1115958 . A C . clustered_events;haplotype;panel_of_normals CONTQ=93;DP=33;ECNT=3;GERMQ=34;MBQ=0,20;MFRL=0,83;MMQ=60,60;MPOS=9;PON;POPAF=4.49;ROQ=30;SEQQ=93;STRANDQ=93;TLOD=114.19 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:0,33:0.971:33:0,0:0,32:0|1:1115958_A_C:1115958:0,0,15,18

    Mapping and base quality look fine. I am actually surprised that the normal 'non forced' calling does not work either on these. But with the alleles option I would have expected at least an entry for that variant in my output.

    So basically I got calls for the alignment at the bottom of the screenshot but not for the one on top. Am I missing something here? I attached the part of the alignment padded by ~200bp in case someone wants to take a closer look.

  • davidbendavidben BostonMember, Broadie, Dev ✭✭✭

    @rehker These alleles seem to be filtered in the force-calling vcf. In order to force-call both PASS and filtered alleles you need to use the --genotype-filtered-alleles option in Mutect2. Please let me know if this does not resolve the issue!

  • rehkerrehker Member

    There are still positions missing, but it might be that those just had no coverage (or only reads that did not pass quality filters) at all in the bam-file. At least this was the case for the ones I checked.

    Similar to Chunyang I am employing a python script for my issue right now, but it lacks in precision and speed, while recalling the variants in multiple alignments. So switching to GATK only could actually help me a lot.


  • davidbendavidben BostonMember, Broadie, Dev ✭✭✭

    @rehker If there is no coverage left after read filtering Mutect2 is also unable to force call. Given the architecture of the GATK engine this could only be avoided inelegantly. We hope, however, that we have closed all other loopholes so that if a force-calling allele is absent it means there was no coverage at that locus. If you ever find this not to be the case, we wish to correct it.

Sign In or Register to comment.