How does MuTect2 assign the germline_risk filter?

Zhenyu_Zhang, University of Chicago, Member

I have read the MuTect1 paper and the MuTect2 code, and it seems the germline_risk filter is assigned this way:
1. If the variant is in dbSNP but not in COSMIC, the NLOD cutoff is 5.5.
2. Otherwise, the cutoff is 2.2.
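
To make that reading concrete, here is a minimal Python sketch of the two rules. It is not MuTect2 source code: the function names are hypothetical, and it assumes a site is flagged germline_risk when its NLOD falls below the selected cutoff.

```python
# Sketch of the cutoff selection described above; NOT MuTect2 source code.
# Function names are hypothetical, chosen only for illustration.
DBSNP_NORMAL_LOD = 5.5  # cutoff for sites in dbSNP but not in COSMIC
NORMAL_LOD = 2.2        # cutoff for every other site


def nlod_cutoff(in_dbsnp: bool, in_cosmic: bool) -> float:
    """Return the NLOD cutoff for a site, following the two rules above."""
    if in_dbsnp and not in_cosmic:
        return DBSNP_NORMAL_LOD
    return NORMAL_LOD


def is_germline_risk(nlod: float, in_dbsnp: bool, in_cosmic: bool) -> bool:
    """Assumption: a site is flagged when its NLOD is below the cutoff."""
    return nlod < nlod_cutoff(in_dbsnp, in_cosmic)
```

Under this reading, a site with NLOD 3.61 would be flagged only if it were in dbSNP and absent from COSMIC.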

Some variants that look somatic are labelled germline_risk. One example is:
chr11 123456789 . C A . germline_risk ECNT=1;HCNT=4;MAX_ED=.;MIN_ED=.;NLOD=3.61;TLOD=12.81 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/0:13,0:0.00:0:0:.:440,0:10:3 0/1:10,6:0.357:2:4:0.667:336,173:4:6

I used the MuTect2 default settings: initial_tumor_lod=4.0 initial_normal_lod=0.5 tumor_lod=6.3 normal_lod=2.2 dbsnp_normal_lod=5.5
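
As a quick sanity check, the Python sketch below pulls NLOD and TLOD out of the INFO column of the example record and compares them with the default cutoffs quoted above. It is only an illustration: the helper `parse_info` is hypothetical, and the comparison assumes that a value at or above a cutoff passes it.

```python
# Illustration only: compares the example record's LOD scores against the
# default cutoffs quoted in this post. This is not how MuTect2 is implemented.
record_info = "ECNT=1;HCNT=4;MAX_ED=.;MIN_ED=.;NLOD=3.61;TLOD=12.81"


def parse_info(info: str) -> dict:
    """Split a VCF INFO string of KEY=VALUE pairs into a dictionary."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


info = parse_info(record_info)
nlod = float(info["NLOD"])
tlod = float(info["TLOD"])

# Assumption: a score at or above the cutoff passes that cutoff.
checks = {
    "tumor_lod=6.3": tlod >= 6.3,
    "normal_lod=2.2": nlod >= 2.2,
    "dbsnp_normal_lod=5.5": nlod >= 5.5,
}

for name, passed in checks.items():
    print(f"{name}: {'pass' if passed else 'fail'}")
```

If that assumption holds, the only cutoff this record fails is the dbSNP one.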

Answers

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie

    You are correct in your interpretation of the code. Was that your question or is there something that troubles you?

  • Zhenyu_Zhang, University of Chicago, Member

    My question is: since this is a non-dbSNP variant with NLOD = 3.61 (which passes the 2.2 cutoff), why did MuTect2 label it "germline_risk"?

    Issue · Github, by shlee: Issue Number 1761, State: closed, Closed By: sooheelee
  • shlee, Cambridge, Member, Administrator, Broadie, Moderator

    Hi @Zhenyu_Zhang,

    Would you mind providing us with a subset of your data so that we can recapitulate your observation? We'd like to investigate further to figure out what is going on. You can follow the instructions for filling in a bug report in Article#1894 to submit your data, then tell us the file name in this thread. Please be sure to provide snippets of all the resource files you use (dbSNP, COSMIC, PoN). Here are two command templates for making the data small:

    For VCFs

    java -jar $GATK \
       -T SelectVariants \
       -R ~/Documents/ref/hg38/Homo_sapiens_assembly38.fasta \
       -V dbSNP142_GRCh38_subset50k.vcf.gz \
       -o ../odd_germlinerisk_filtering/dbSNP142.vcf.gz \
       -L chr11:66295655 \
       -ip 1000
    

    For alignments

    java -jar $GATK -T PrintReads \
        -R ~/Documents/ref/hg38/Homo_sapiens_assembly38.fasta \
        -I hcc1143_T_subset50K.bam \
        -o ../odd_germlinerisk_filtering/tumor.bam \
        -L chr11:66295655 \
        -ip 1000
    

    Thanks.

  • shlee, Cambridge, Member, Administrator, Broadie, Moderator

    Hi @Zhenyu_Zhang,

    I'm just following up on your strange case. We tested the --normal_lod cutoff on some of our own test data, using a site that has no records in dbSNP/COSMIC/PoN but carries a germline_risk filter. We find that lowering the --normal_lod cutoff allows the site to pass filters, so we cannot recapitulate your observation. Can you think of anything else about your analysis that could be a factor in your site not passing? If you could share your exact command and a snippet of the data, we would be able to investigate this further. Thanks.
