Why is a variant in the germline resource not filtered?

Hi,
I have paired samples and I called them separately using the following command. I got a variant call in normal sample, not in tumor sample.

chr8:139165272 should be pre-filtered because it is in the germline-resource database. Why was it not filtered in the Mutect2 result?

normal sample:

chr8    139165272       .       TG      CA      .       .       DP=115;ECNT=2;MBQ=36,40;MFRL=218,208;MMQ=60,60;MPOS=43;POPAF=7.30;SAAF=0.323,0.333,0.354;SAPP=0.026,0.011,0.963;TLOD=147.15    GT:AD:AF:DP:F1R2:F2R1   0/1:73,40:0.356:113:30,21:43,19

command line:

~/gatktools/gatk-4.1.2.0/gatk Mutect2 \
-R ~/hg19/gatk_bundle/ucsc.hg19.fasta \
-I tumor.recalibrated.bam -tumor tumor \
-L chr8:139165272 --interval-padding 250 \
--germline-resource ~/af-only-gnomad.raw.sites.hg19.vcf.gz \
--tmp-dir tmpdir -O tmp.vcf

Thank you.

Answers

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)
    edited September 21

    Hi @xiucz

    If I understand correctly, you ran Mutect2 on the "normal" sample in tumor-only mode with a germline resource, and Mutect2 did not filter that variant. Is that correct?

    Also please post the gnomad vcf record of this variant.

  • xiucz (Member)

    @bhanuGandham,

    Yes, I think it should be filtered in both the tumor and the normal. Please note that the germline resource is the raw one, not the small one.

    zcat ~/af-only-gnomad.raw.sites.hg19.vcf.gz | grep -w 139165272
    chr8    139165272   rs7835712   T   C   16355619.74 PASS    AC=28043;AF=0.69
    

    Thank you.
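
Note that `grep -w 139165272` matches that number anywhere on a line, not only in the POS column. A minimal sketch of a stricter lookup, assuming a tab-delimited VCF on standard input; the helper name `lookup_site` is hypothetical, and the second demo record is invented purely to show a false hit:

```shell
# lookup_site is a hypothetical helper: print VCF records whose CHROM
# (column 1) and POS (column 2) match exactly. Reads VCF text on stdin.
lookup_site() {
  awk -F'\t' -v chrom="$1" -v pos="$2" '!/^#/ && $1 == chrom && $2 == pos'
}

# Demo input. Line 1 is the gnomAD record quoted above. Line 2 is invented:
# it carries the number only in its ID column, so grep -w would match it,
# but lookup_site does not.
demo=$(printf 'chr8\t139165272\trs7835712\tT\tC\t16355619.74\tPASS\tAC=28043;AF=0.69\nchr1\t100\t139165272\tA\tG\t50\tPASS\tAF=0.01')

printf '%s\n' "$demo" | lookup_site chr8 139165272

# Against the real resource, the equivalent call would be:
#   zcat ~/af-only-gnomad.raw.sites.hg19.vcf.gz | lookup_site chr8 139165272
```

Only the chr8 record is printed for the demo input.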

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)
    edited September 24

    Hi @xiucz

    I am a little confused. Would you please answer the following questions:
    1) What is the purpose of running Mutect2 on normal samples in tumor-only mode with a germline resource? Also, you shared the command line for Mutect2 on the tumor sample, not the normal sample.
    2)

    got a variant call in normal sample, not in tumor sample.

    What do you mean by this?
    3) Did you perform any filtration on your data? Take a look at this doc: https://software.broadinstitute.org/gatk/documentation/article?id=24057

  • xiucz (Member)

    Hi, @bhanuGandham
    1) Purpose: Sometimes we don't have a matched normal sample, so I ran some tests in tumor-only mode. There were apparently many false-positive variants in tumor-only mode, so I added the germline-resource parameter to get germline variants pre-filtered. I used the following command lines:

    tumor sample:
    ~/gatktools/gatk-4.1.2.0/gatk Mutect2 \
    -R ~/hg19/gatk_bundle/ucsc.hg19.fasta \
    -I tumor.recalibrated.bam -tumor tumor \
    -L chr8:139165272 --interval-padding 250 \
    --germline-resource ~/af-only-gnomad.raw.sites.hg19.vcf.gz \
    --tmp-dir tmpdir -O tumor.tmp.vcf
    
    normal sample:
    ~/gatktools/gatk-4.1.2.0/gatk Mutect2 \
    -R ~/hg19/gatk_bundle/ucsc.hg19.fasta \
    -I normal.recalibrated.bam -tumor normal \
    -L chr8:139165272 --interval-padding 250 \
    --germline-resource ~/af-only-gnomad.raw.sites.hg19.vcf.gz \
    --tmp-dir tmpdir -O normal.tmp.vcf
    

    2)
    As a consequence,

    $ grep -w 139165272 normal.tmp.vcf
    chr8    139165272       .       TG      CA      .       .       DP=115;ECNT=2;MBQ=36,40;MFRL=218,208;MMQ=60,60;MPOS=43;POPAF=7.30;SAAF=0.323,0.333,0.354;SAPP=0.026,0.011,0.963;TLOD=147.15    GT:AD:AF:DP:F1R2:F2R1   0/1:73,40:0.356:113:30,21:43,19
    
    $ grep -w 139165272 tumor.tmp.vcf
    (no output)
    

    3) I have not performed any filtering as a next step yet, because I first wanted to know why the site was not called in my tumor sample. Should I use GGA mode to call the site?

    Sorry for troubling you so much, and thank you very much.

    Xiucz.

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @xiucz

    Sorry I wasn't clear. What is the purpose of running tumor-only mode on normal samples?

  • xiucz (Member)

    Hi @bhanuGandham,
    Let me put it more simply and set my purpose aside.
    Using the same command line on different samples, why is the site not pre-filtered even though it is recorded in the germline resource?

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @xiucz

    Sorry I got distracted by what it is that you were trying to achieve. But I see your point. I will look into it and get back to you.

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @xiucz

    We suspect this is happening because the variant record you showed us in the normal sample, chr8 139165272 . TG CA, is an MNP, while the gnomAD germline resource contains only SNPs. This might be why it does not get filtered.
    Try setting --max-mnp-distance to 0 and running Mutect2 again. Let us know if that resolves the issue.
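
To make the suspicion concrete: if alleles are compared exactly (same CHROM, POS, REF and ALT), the MNP TG>CA cannot match the resource's T>C record, whereas a split SNV can. The sketch below only illustrates that comparison; the helper name `in_resource` is hypothetical, and whether Mutect2 performs the lookup this way is the assumption being tested, not something the sketch confirms.

```shell
# in_resource is a hypothetical helper: report whether a called allele
# (CHROM POS REF ALT) has an identical record in sites-only VCF text on stdin.
# Multi-allelic ALT fields are not handled in this sketch.
in_resource() {
  awk -F'\t' -v c="$1" -v p="$2" -v r="$3" -v a="$4" '
    !/^#/ && $1 == c && $2 == p && $4 == r && $5 == a { found = 1 }
    END { print (found ? "match" : "no match") }'
}

# The gnomAD record for this site, as quoted earlier in the thread.
resource=$(printf 'chr8\t139165272\trs7835712\tT\tC\t16355619.74\tPASS\tAC=28043;AF=0.69')

printf '%s\n' "$resource" | in_resource chr8 139165272 TG CA   # MNP as called
printf '%s\n' "$resource" | in_resource chr8 139165272 T C     # SNV after splitting
```

The first call prints "no match" and the second prints "match".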

  • xiucz (Member)

    @bhanuGandham,

    After setting --max-mnp-distance to 0, I still got this site:

    ~/gatktools/gatk-4.1.3.0/gatk Mutect2 \
    -R ~/hg19/gatk_bundle/ucsc.hg19.fasta \
    -I normal.recalibrated.bam -tumor normal \
    -L chr8:139165272 --interval-padding 250 \
    --germline-resource ~/af-only-gnomad.raw.sites.hg19.vcf.gz \
    --tmp-dir tmpdir -O normal.tmp2.vcf --max-mnp-distance 0
    

    The MNV has been split into two SNVs:

    chr8    139165272   .   T   C   .   .   DP=169;ECNT=3;MBQ=20,20;MFRL=213,201;MMQ=60,60;MPOS=45;POPAF=0.161;TLOD=221.95  GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:106,61:0.367:167:43,33:60,28:0|1:139165272_T_C:139165272:48,58,28,33
    chr8    139165273   .   G   A   .   .   DP=169;ECNT=3;MBQ=20,20;MFRL=213,201;MMQ=60,60;MPOS=44;POPAF=0.451;TLOD=221.95  GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:106,61:0.367:167:40,33:63,27:0|1:139165272_T_C:139165272:48,58,28,33
    

    This gnomAD germline resource site still does not seem to be pre-filtered.
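
One way to check whether the germline resource was consulted at all is the POPAF annotation, which the Mutect2 VCF header describes as the negative log10 population allele frequency of the alt allele. A small sketch of the conversion back to a frequency; the helper name `popaf_to_af` is hypothetical:

```shell
# popaf_to_af is a hypothetical helper: AF = 10^(-POPAF),
# computed as exp(-POPAF * ln 10) for portability across awk versions.
popaf_to_af() {
  awk -v x="$1" 'BEGIN { printf "%.3g\n", exp(-x * log(10)) }'
}

popaf_to_af 0.161   # chr8:139165272 T>C after splitting
popaf_to_af 7.30    # the unsplit TG>CA MNP
```

The first value comes out at about 0.69, which agrees with AF=0.69 in the gnomAD record, so the split SNV does appear to have picked up its population frequency. The second is about 5e-8, a tiny value of the kind one would expect for an allele that was not found in the resource.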

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @xiucz

    That is weird. After splitting into two SNVs the chr8 139165272 . T C variant should have been filtered. I will check with the dev team and get back to you.

  • 2904359495 (Member)

    I have another question: if we use --max-mnp-distance 0 in the Mutect2 command, will all variants be reported as SNVs, with no MNPs? That does not seem right to me.
    @xiucz, have you checked this? Thanks a lot.

  • xiucz (Member)

    @2904359495 yes, there will be no MNVs.

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @xiucz

    Have you run FilterMutectCalls on the results of Mutect2? If not, you should do that. That will resolve this issue.
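
A minimal sketch of that step, reusing the file names from the commands above. The output name `normal.filtered.vcf` and the helper `show_filter` are assumptions for illustration. The gatk call is shown as a comment because it needs the real Mutect2 output; the helper simply prints the FILTER column so the before and after can be compared.

```shell
# Assumed invocation (paths as used earlier in the thread; output name hypothetical):
#   ~/gatktools/gatk-4.1.3.0/gatk FilterMutectCalls \
#     -R ~/hg19/gatk_bundle/ucsc.hg19.fasta \
#     -V normal.tmp2.vcf -O normal.filtered.vcf

# show_filter is a hypothetical helper: print CHROM, POS, REF, ALT and FILTER
# for one position from VCF text on stdin.
show_filter() {
  awk -F'\t' -v p="$1" 'BEGIN { OFS = "\t" } !/^#/ && $2 == p { print $1, $2, $4, $5, $7 }'
}

# Raw Mutect2 output carries "." in FILTER (record abbreviated from the thread).
printf 'chr8\t139165272\t.\tT\tC\t.\t.\tDP=169;POPAF=0.161;TLOD=221.95\n' | show_filter 139165272
```

Running `show_filter 139165272 < normal.filtered.vcf` after filtering would show either PASS or the names of the filters that the site failed.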

  • xiucz (Member)
    edited November 4

    :D @bhanuGandham,

    Following your advice, I still got a PASS result.

    Today I got results for another pair of samples:

    Using MuTect:

    #normal sample tumoronly-mode's mutect result:
    CAGxACG T       C       normal none    0       DBSNP+COSMIC  COVERED  1       1       0       0       0       202     0       -41.248107      10.104493       7.209222        10.383325       0.020101      0.02     10.078254       199     195     4       6319    167     60      60      0       0       TT      -0.693147       0       0       0     00       0       0.777877        0.575503        (95,166,3,4)    56.5    24.5    92.5    24.5    0       possible_contamination  REJECT
    
    #tumor sample tumoronly-mode's mutect result:
    CAGxACG T       C       tumor none    0       DBSNP+COSMIC  COVERED  1       1       0       0       0       146     0       257.594437      257.765154      118.715727      185.997276      0.462069      0.02     4.895241        144     78      67      2468    2692    60      60      0       0       TT      -0.693147       0       0       0     00       0       1       1       (32,64,32,48)   98      23      51      23      0               KEEP
    
    #tumor-normal-mode's mutect result:
    chr9    133738357       rs121913461     T       C       .       REJECT  DB      GT:AD:BQ:DP:FA  0:198,4:.:203:0.020     0/1:78,67:40:145:0.462
    

    Using GATK4.1.3.0 Mutect2:

    #tumor sample tumoronly-mode's result:
    chr9    133738357       rs121913461     T       C       .       PASS  AC=2;AF=0.400;AN=5;CONTQ=93;DB;DP=146;ECNT=1;GERMQ=12;MBQ=32,41;MFRL=234,244;MMQ=60,60;MPOS=45;POPAF=7.30;SAAF=0.424,0.444,0.461;SAPP=0.097,7.159e-03,0.896;SOMATIC;TLOD=243.06;VT=SNP;set=Intersection GT:AD:AF:BQ:DP:F1R2:F2R1:FA:OBAM:OBAMRC:OBQ:OBQRC:SS    0/1:76,65:0.464:.:141:35,31:41,34:.:false:false:60.26:100.00   0/1:78,67:.:40:145:.,.:.,.:0.462:.:.:.:.:2      0:0,0:.:.:0:.,.:.,.:0.00:.:.:.:.:0
    
    #no result in normal
    
    #tumor-normal-mode's result:
    chr9    133738357       .       T       C       .       PASS    CONTQ=93;DP=350;ECNT=1;GERMQ=417;MBQ=32,41;MFRL=238,243;MMQ=60,60;MPOS=45;NALOD=-7.776e+00;NLOD=41.82;POPAF=6.00;SAAF=0.424,0.444,0.461;SAPP=0.097,7.159e-03,0.896;TLOD=243.06        GT:AD:AF:DP:F1R2:F2R1:OBAM:OBAMRC:OBF:OBP:OBQ:OBQRC     0/0:195,4:0.025:199:91,2:104,2:false:false      0/1:76,65:0.464:141:35,31:41,34:false:false:.:.:60.26:100.00
    

    It seems that in tumor-only mode, a site is pre-filtered by the germline resource only if it has a low VAF. If the site has a high VAF, it is not pre-filtered even when it is in the germline-resource database.

    So, is this a bug, or is there a threshold in the pre-filter step?
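
For reference, the allele fractions being compared here can be recomputed from the AD fields (ref,alt read counts) in the records above. A small sketch; the helper name `vaf_from_ad` is hypothetical:

```shell
# vaf_from_ad is a hypothetical helper: alt / (ref + alt) from a "ref,alt" AD string.
vaf_from_ad() {
  awk -v ad="$1" 'BEGIN { split(ad, n, ","); printf "%.3f\n", n[2] / (n[1] + n[2]) }'
}

vaf_from_ad 195,4    # normal sample in the tumor-normal call
vaf_from_ad 76,65    # tumor sample in the tumor-normal call
```

These raw ratios come out at 0.020 and 0.461. The AF values Mutect2 reports for the same genotypes (0.025 and 0.464) differ slightly, presumably because AF is an estimate rather than a plain read-count ratio.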

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)
    edited November 4

    Hi @xiucz

    Can you please post the exact FilterMutectCalls command you used, and highlight which of the examples above come from the output of FilterMutectCalls?
    Mutect2 does some pre-filtering, but that is not sufficient, so we recommend always evaluating the variant calls after running the FilterMutectCalls tool.
