MuTect2 tumorOnly vs paired loses true variants

Hi GATK team !

I have an issue with MuTect2. I'm using the latest GATK version (nightly build from 16 March) in a somatic context on an amplicon design.

I have a variant that I know is a true one (although the depth of coverage at this position is quite low in the somatic context).
MuTect2 does call the variant in tumor-only mode; it is the first of the 3 records here:
chr13 32900222 . C T . clustered_events;homologous_mapping_event ECNT=3;HCNT=45;MAX_ED=45;MIN_ED=41;NLOD=0.00;TLOD=39.33 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/1:85,36:0.149:36:0:.:2087,972:83:0
chr13 32900263 . G A . clustered_events;homologous_mapping_event ECNT=3;HCNT=16;MAX_ED=45;MIN_ED=41;NLOD=0.00;TLOD=8.56 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/1:1020,18:8.696e-03:18:0:1.00:26997,444:1020:0
chr13 32900267 . C T . clustered_events;homologous_mapping_event ECNT=3;HCNT=5;MAX_ED=45;MIN_ED=41;NLOD=0.00;TLOD=6.81 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/1:1024,16:0.015:16:0:0.00:27735,411:1024:0
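
For readers who want to compare such records between the tumor-only and paired runs, here is a minimal Python sketch that pulls the filters, TLOD/NLOD and allele depths out of one line. It assumes the line looks exactly like the ones pasted above (10 whitespace-separated columns, a single sample); `parse_record` is a hypothetical helper name, not part of GATK.

```python
# Minimal sketch: parse one VCF data line as pasted in this thread.
# Assumption: exactly 10 whitespace-separated columns (one sample) and no
# spaces inside any field. A real VCF file is tab-delimited.

def parse_record(line):
    (chrom, pos, _id, ref, alt, _qual,
     filters, info, fmt, sample) = line.split()
    # INFO is a ';'-separated list of KEY=VALUE pairs (flags without '=' are skipped)
    info_fields = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
    # FORMAT keys and sample values are both ':'-separated, in the same order
    sample_fields = dict(zip(fmt.split(":"), sample.split(":")))
    ref_depth, alt_depth = (int(x) for x in sample_fields["AD"].split(","))
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "filters": filters.split(";"),
        "tlod": float(info_fields["TLOD"]),
        "nlod": float(info_fields["NLOD"]),
        "ref_depth": ref_depth,
        "alt_depth": alt_depth,
        "af": float(sample_fields["AF"]),
    }

record = parse_record(
    "chr13 32900222 . C T . clustered_events;homologous_mapping_event "
    "ECNT=3;HCNT=45;MAX_ED=45;MIN_ED=41;NLOD=0.00;TLOD=39.33 "
    "GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 "
    "0/1:85,36:0.149:36:0:.:2087,972:83:0"
)
print(record["pos"], record["filters"], record["tlod"], record["alt_depth"])
```

Running the same function over both VCFs and diffing on `(chrom, pos, ref, alt)` would show which calls disappear in paired mode.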

It seems to consider the 3 variants to be clustered and on the same haplotype. Could that be relevant to my issue? Position 32900222 is already quite far from 32900263, though.

When calling in paired mode, with the recalibrated germline BAM file supplied as the normal, I have no variants left, even though none of these 3 variants is germline.

Could you please tell me why those variants are filtered out? Is there a parameter I should adjust?

Thanks a lot
Manon

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @manon_sourdeix
    Hi Manon,

    Can you please post the original BAM and bamout BAM for both the tumor and normal in the region?

    Thanks,
    Sheila

  • manon_sourdeix, France (Member)
    edited April 2016

    Sure ! Here you go.

    The first picture shows the original BAMs: tumor on top, normal at the bottom.
    The second picture shows the bamout files: the paired run on top, the tumor-only run below.

    As you can see, no reads left in the region when doing a paired analysis.

    The 2 commands use the same parameters and the same interval; the only difference is the input_file:normal argument.
    In the GATK log output I have:

    paired
    2 reads were filtered out during the traversal out of approximately 11349 total reads (0.02%)

    tumorOnly
    0 reads were filtered out during the traversal out of approximately 1752 total reads (0.00%)

    Thanks a lot for your help !

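    [image 1: original BAMs; tumor on top, normal below]

    [image 2: bamout BAMs; paired run on top, tumor-only run below]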

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @manon_sourdeix
    Hi Manon,

    I have a few suggestions. Can you try adding -forceActive and -disableOptimizations to your commands? I think the normal sample might be showing evidence for the SNP as well, so it is considered germline in the paired analysis.

    -Sheila
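
As an illustration of where these two arguments go, here is a hedged Python sketch that assembles a GATK 3-style MuTect2 command line for either mode. All file names are placeholders, and `build_mutect2_cmd` is a hypothetical helper; the sketch only builds and prints the argument list and does not run anything.

```python
# Minimal sketch, assuming a GATK 3.x-style command line.
# All paths below are placeholders, not the files used in this thread.

def build_mutect2_cmd(reference, tumor_bam, intervals, out_vcf,
                      normal_bam=None, bamout=None, troubleshoot=False,
                      jar="GenomeAnalysisTK.jar"):
    cmd = ["java", "-jar", jar, "-T", "MuTect2",
           "-R", reference,
           "-I:tumor", tumor_bam]
    if normal_bam is not None:
        # Paired mode: add the matched normal; omit it for tumor-only mode
        cmd += ["-I:normal", normal_bam]
    cmd += ["-L", intervals, "-o", out_vcf]
    if bamout is not None:
        # Write the reassembled reads so they can be inspected in IGV
        cmd += ["-bamout", bamout]
    if troubleshoot:
        # The two troubleshooting arguments suggested above
        cmd += ["-forceActive", "-disableOptimizations"]
    return cmd

paired_cmd = build_mutect2_cmd(
    reference="reference.fasta",
    tumor_bam="tumor.bam",
    normal_bam="normal.bam",
    intervals="amplicons.intervals",
    out_vcf="paired.vcf",
    bamout="paired.bamout.bam",
    troubleshoot=True,
)
print(" ".join(paired_cmd))
```

Keeping every other argument identical between the two runs makes it easier to attribute any difference in calls to the presence of the normal.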

  • manon_sourdeix, France (Member)
    edited May 2016

    Hi @Sheila
    Sorry for the delay in my tests; I was on another project for the past month.
    I tried your 2 parameters and yes, they do recover the variant in paired mode. Attached is the bamout file from the paired run using -forceActive and -disableOptimizations.

    Do you think I should use both routinely when analyzing in paired mode? Do you think this may be due to the amplicon design?

    [image: bamout from the paired run with -forceActive and -disableOptimizations]

    Thanks a lot.
    Manon

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @manon_sourdeix
    Hi Manon,

    Okay. So, in the above post, the reads are from tumor, normal and artificial haplotypes. Can you please post the IGV screenshot with the reads colored by sample? I would like to see if the normal has evidence for the variant at the site.

    Thanks,
    Sheila

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)

    To be clear, those two arguments are just troubleshooting parameters and should not affect results. We certainly don't recommend using them systematically because they will significantly slow down execution.

  • manon_sourdeix, France (Member)

    @Geraldine_VdAuwera Slower execution does not matter here as long as no true positives are missed (diagnostic context). I prefer having more false positives to missing true somatic variants. So is this just a matter of slower execution?

  • manon_sourdeix, France (Member)

    @Sheila The reads are already colored by sample, but the coverage is high, so I cannot show them all in the screenshot.
    Could I send you the BAM privately?

    Otherwise, from what I see there are 3 colours: the blue reads are from the tumor and harbour the variant, the green reads are from the normal and do not harbour the variant at all, and the red reads are the artificial haplotypes, very few of which have the variant.

    What do you think ?
    Thanks !

    Manon
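
When coverage is too high to judge by eye, the same per-sample comparison can be done by counting. Below is a minimal Python sketch that tallies bases by sample at one position and reports the alt fraction for each. It assumes the (sample, base) pairs have already been extracted from the bamout, for example from a pileup at the site; the observations in the example are made up for illustration and are not the data from this thread, and both function names are hypothetical.

```python
from collections import Counter, defaultdict

def tally_bases_by_sample(observations):
    """Count observed bases per sample.

    observations: iterable of (sample_label, base) pairs at one position.
    """
    counts = defaultdict(Counter)
    for sample, base in observations:
        counts[sample][base.upper()] += 1
    return counts

def alt_fraction(counts, sample, alt):
    """Fraction of a sample's reads at the position that carry the alt base."""
    total = sum(counts[sample].values())
    return counts[sample][alt] / total if total else 0.0

# Made-up observations for illustration only
observations = ([("tumor", "T")] * 3 + [("tumor", "C")] * 7
                + [("normal", "C")] * 10)

counts = tally_bases_by_sample(observations)
for sample in ("tumor", "normal"):
    print(sample, dict(counts[sample]), alt_fraction(counts, sample, "T"))
```

A normal with an alt fraction of zero at the site, as described in the post above, would argue against the variant being treated as germline.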

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @manon_sourdeix
    Hi Manon,

    Okay, so you are saying that only the tumor reads show the variant, and none of the normal reads do? In that case perhaps it is best if you submit a bug report. Instructions are here.

    Thanks,
    Sheila

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)

    @manon_sourdeix Using these arguments should not make any difference to which calls are made. They just provide more information about what the variant caller sees.

  • manon_sourdeix, France (Member)

    Hey @Sheila
    sorry again for the delay !
    I did submit a bug report under ManonSourdeix_Mutect2_tumorOnly_vs_paired

    Hope this can be fixed !
    Thanks again
    Manon

    Issue · GitHub (by Sheila): Issue Number 1006, State: closed, Closed By: chandrans

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @manon_sourdeix
    Hi Manon,

    Sorry for the delay. I am having a look now. Can you tell us how you know the variant at position 32900222 is true?

    -Sheila

    Issue · GitHub (by Sheila): Issue Number 1449, State: closed, Closed By: chandrans

  • manon_sourdeix, France (Member)

    Hi
    Sorry for the delay.
    We analyzed this data with another tool, which identified the variant. We then went back to the tumor sample, designed primers and checked it by Sanger sequencing: it was there.

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @manon_sourdeix
    Hi Manon,

    Thanks for letting us know. The developers are focused on improving Mutect2 now, so you can expect some nice changes soon :smile:

    As for your issue, I put in a feature request for users to be able to change the thresholds for determining active regions.

    -Sheila

  • manon_sourdeix, France (Member)

    OK, thank you. For now I will use MuTect2 in tumor-only mode to be sure I do not miss anything.