Holiday Notice:
The Frontline Support team will be slow to respond December 17-18 due to an institute-wide retreat and offline December 22- January 1, while the institute is closed. Thank you for your patience during these next few weeks. Happy Holidays!

Mutect2 missed variant called by HaplotypeCaller

jamalsjamals Member
edited August 1 in Ask the GATK team

Hi,

I am running GATK 3.5.0 with java version 1.8.0. I have two cell line samples that I paired with a promega baseline reference (its essentially a mixed germline sample) to run Mutect2 (which I am aware of is not a part of the Best Practices). I also ran the tumour sample a lone using the HaplotypeCaller and noticed a very clear ALK variant that was missed by Mutect2 but called by the HaplotypeCaller in both samples. Due to the nature of the cell line we also expected to see an ALK variant which is why it was detected.

What I find odd is that the local reassembly of Mutect2 seems to have discarded the variant as the bamout does not contain the variant (C > T) at loci chr2:29443695 whereas the HaplotypeCaller call does for both samples. I have read through the documentation and the specifics of the local reassembly and would be very interested in knowing at what stage this occurs and your suggestions on what can be done.

I will be trying GATK v.4.0 as well as some of the things mentioned here https://software.broadinstitute.org/gatk/documentation/article?id=1235 in the meantime I would be very greatful if someone could look into this. I will be posting the updates on my new tests as well. See details below on various metrics and IGV screenshots.

The chemistry is a DNA capture Kapa hyperplus kit, 75 paired end reads.

Sample 945

  • Entire ALK covered up to 80X
  • Mean/min coverage 1013/378
  • BWA bam shows 50% allele frequency

HaplotypeCaller line Sample 945

  • chr2 29443695 . G T 8496.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=5.863;ClippingRankSum=-0.368;DP=601;ExcessHet=3.0103;FS=0.536;MLEAC=1;MLEAF=0.500;MQ=62.46;MQRankSum=1.113;QD=14.21;ReadPosRankSum=0.502;SOR=0.76GT:AD:DP:GQ:PL 0/1:300,298:598:99:8525,0,8240

Sample 946

  • Entire ALK covered up to 80x
  • Mean/min coverage 523/204
  • BWA bam shows 49% allele frequency

HaplotypeCaller line Sample 946

  • chr2 29443695 . G T 5056.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=3.569;ClippingRankSum=-0.212;DP=397;ExcessHet=3.0103;FS=2.133;MLEAC=1;MLEAF=0.500;MQ=63.61;MQRankSum=-1.274;QD=13.00;ReadPosRankSum=0.063;SOR=0.595 GT:AD:DP:GQ:PL 0/1:199,190:389:99:5085,0,5319

Promega control sample

  • Same control sample used as pair for both 945 and 946 using Mutect
  • Coverage around ALK region ~200+

Please see IGV images of the various cases below. The --bamout (run together with disabling optimization and forcing output) command was run with a 500bp padding downstream and upstream of the target location that contains the variant (i.e the actual padding upstream and downstream the actual variant at loci 29443695 will be slighly more than 500bp). I also ran mutect with the adjust 500bp but included all the targets in chr2 without adding any padding on any other targets other than the one that contains the variant.

Sample945_bwaBAM - Bam output from BWA

Sample946_bwaBAM - Bam output from BWA

Sample945_GATKForcedBamOut

Sample946_GATKForcedBamOut

Sample945_MutectForcedBamOutChr2

Sample946_MutectForcedBamOutChr2

Sample945_MutectForcedBamOutALKOnly

Sample946_MutectForcedBamOutALKOnly

Thank you very much and I look forward hearing your thoughts on this
Sabri

Answers

  • jamalsjamals Member

    Hi,

    I just had the chance to run Mutect2 on GATK v.4.0.5.1 on one of the cell lines and still no luck. So far I just tried to pair the sample as was done before with the promega control sample just to see if anything would have changed.

    I will be creating a PON to try to run Mutect2 in line with the Best Practices but as that would take a bit of time to generate, I decided that it was worth giving it a shot pairing it with the promega sample.

    I do have one question regarding GATK4. How does one force the calculations to take place in active regions with no variants in GATK4? In GATK3 it was the following commands...

    • -forceActive
    • --disableOptimizations
    • --bamout

    I haven't had time to try tweaking various parameters neither but will be reading through the documentation to figure out what parameters could possibly be involved. If there are any suggestions I would be happy to try them out.

    If you are interested in the specific parameters used or the workflow I would be happy to share them as well.

    Lastly, is it possible that the variant gets classed as a possible germline variant early on in the variant call and if yes, is there anything that can be done against that or is it apart of the core of the caller?

    Thank you very much for your time
    Sabri

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @jamals
    Hi Sabri,

    Can you test this with the very latest version of GATK? Can you also post the exact commands you are running (HaplotypeCaller and Mutect2)?

    Thanks,
    Sheila

  • davidbendavidben BostonMember, Broadie, Dev ✭✭

    @jamals There are a few things you can try out on the GATK4 Mutect2 command line:

    • Adding the flag --genotype-germline-sites forces Mutect2 to assemble and genotype all variants, even if it thinks they are germline.
    • Setting --initial-tumor-lod lower than the default causes Mutect2 to trigger local assembly more readily. Since you're debugging a single site you could set it to 0.0.--
    • --genotyping-mode GENOTYPE_GIVEN_ALLELES --alleles <force-call.vcf> forces Mutect2 to assemble and genotype all variants in <force-call.vcf>.

    These only apply to recent releases of GATK 4 Mutect2.

Sign In or Register to comment.