MPOS is -2147483648 in Mutect2 GATK

amjadd, Finland, Member

I am seeing that for many variants detected by Mutect2, the MPOS (median distance from end of read) is negative, and sometimes the negative value is huge. What does that mean?




  • bhanuGandham, Cambridge MA, Member, Administrator, Broadie, Moderator admin

    Hi @amjadd

    Can you please post a few example VCF records with these negative MPOS values?

  • SkyWarrior, Turkey, Member ✭✭✭

    Looks like an integer overflow problem.
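
As a quick check on the overflow suggestion: the value in the thread title, -2147483648, is exactly the smallest value a signed 32-bit integer can hold (-2^31, which is Integer.MIN_VALUE in Java). The sketch below only illustrates that arithmetic. The helper `to_int32` is a hypothetical name introduced here, and whether Mutect2 actually overflows or simply emits this value as a placeholder is not established in this thread.

```python
INT32_MIN = -2**31       # smallest signed 32-bit value
INT32_MAX = 2**31 - 1    # largest signed 32-bit value


def to_int32(value: int) -> int:
    """Wrap an arbitrary Python int to signed 32-bit two's complement.

    Python ints never overflow, so this emulates what a 32-bit
    integer type would store for the same value.
    """
    return (value + 2**31) % 2**32 - 2**31


# The title's value is exactly the 32-bit minimum.
print(INT32_MIN == -2147483648)

# Going one past the 32-bit maximum wraps around to the minimum.
print(to_int32(INT32_MAX + 1) == INT32_MIN)
```

Both comparisons hold, so the number at least has the signature of a 32-bit integer limit rather than of a real read position.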

  • amjadd, Finland, Member

    @bhanuGandham Here are some records:

    chr12   49415383        .       T       A       .       PASS    CONTQ=93;DP=177;ECNT=5;GERMQ=93;MBQ=32,31;MFRL=318,194;MMQ=60,60;MPOS=-2147483648;NALOD=1.37;NLOD=13.55;POPAF=6.00;SEQQ=81;STRANDQ=3;TLOD=12.18 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB     0|1:6,5:0.462:11:2,1:3,4:0|1:49415383_T_A:49415383:5,1,5,0      0|0:45,0:0.021:45:20,0:23,0:0|1:49415383_T_A:49415383:41,4,0,0  0|1:28,0:0.033:28:14,0:14,0:0|1:49415383_T_A:49415383:28,0,0,0  0|1:54,0:0.018:54:26,0:24,0:0|1:49415383_T_A:49415383:54,0,0,0  0|1:38,0:0.025:38:16,0:19,0:0|1:49415383_T_A:49415383:37,1,0,0
    chr3    50289416    .   T   TGAGATGGAG  .   PASS    CONTQ=93;DP=1271;ECNT=3;GERMQ=93;MBQ=36,35;MFRL=179,286;MMQ=60,60;MPOS=-2147483648;NALOD=2.14;NLOD=41.24;POPAF=6.00;SEQQ=41;STRANDQ=93;TLOD=8.53    GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|0:137,0:7.172e-03:137:70,0:66,0:0|1:50289401_GCCTCCATCT_G:50289401:134,3,0,0  0|1:545,4:9.034e-03:549:268,1:268,3:0|1:50289401_GCCTCCATCT_G:50289401:539,6,4,0    0|1:284,1:6.937e-03:285:139,0:145,1:0|1:50289401_GCCTCCATCT_G:50289401:281,3,1,0    0|1:285,1:6.907e-03:286:134,0:150,1:0|1:50289401_GCCTCCATCT_G:50289401:283,2,1,0
    chr6    134494626       .       C       CTTCTTGAAAGTGATCGGAAAGGGCAGTTTTGGAAAGGTAA       .       PASS    CONTQ=93;DP=2425;ECNT=1;GERMQ=93;MBQ=20,28;MFRL=119,183;MMQ=60,60;MPOS=-1073741824;NALOD=2.50;NLOD=185.19;POPAF=6.00;SEQQ=14;STRANDQ=93;TLOD=7.56       GT:AD:AF:DP:F1R2:F2R1:SB        0/1:1685,6:4.069e-03:1691:818,1:857,5:872,813,6,0       0/0:613,0:1.593e-03:613:304,0:306,0:312,301,0,0
    chr12   113515951       .       G       A       .       PASS    CONTQ=93;DP=1368;ECNT=2;GERMQ=93;MBQ=20,29;MFRL=138,93;MMQ=60,60;MPOS=-1;NALOD=2.42;NLOD=78.10;POPAF=6.00;SEQQ=29;STRANDQ=3;TLOD=5.69   GT:AD:AF:DP:F1R2:F2R1:SB        0/1:1040,15:0.013:1055:529,8:462,7:380,660,9,6  0/0:260,0:3.771e-03:260:138,0:111,0:67,193,0,0
    chr11   108163661       .       T       C       .       PASS    CONTQ=93;DP=293;ECNT=4;GERMQ=93;MBQ=33,36;MFRL=317,262;MMQ=60,60;MPOS=-6;NALOD=1.91;NLOD=23.48;POPAF=6.00;SEQQ=73;STRANDQ=3;TLOD=11.62  GT:AD:AF:DP:F1R2:F2R1:SB        0/1:25,6:0.211:31:14,1:11,5:0,25,0,6    0/0:78,0:0.012:78:37,0:39,0:4,74,0,0    0/1:44,0:0.022:44:23,0:21,0:2,42,0,0    0/1:69,0:0.014:69:32,0:37,0:0,69,0,0    0/1:52,0:0.018:52:30,0:22,0:3,49,0,0
    chr14   106241897       .       C       T       .       PASS    CONTQ=93;DP=5348;ECNT=4;GERMQ=93;MBQ=34,20;MFRL=160,139;MMQ=60,60;MPOS=-2147483648;NALOD=2.86;NLOD=215.34;POPAF=6.00;SEQQ=93;STRANDQ=93;TLOD=4515.57    GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB     0|1:1323,1157:0.467:2480:594,549:633,544:0|1:106241884_GTGGGGGTC_G:106241884:602,721,565,592    0|0:716,0:1.379e-03:716:344,0:333,0:0|1:106241884_GTGGGGGTC_G:106241884:322,394,0,0     0|1:745,136:0.155:881:362,78:375,56:0|1:106241884_GTGGGGGTC_G:106241884:475,270,61,75   0|1:920,1:1.081e-03:921:429,0:482,0:0|1:106241884_GTGGGGGTC_G:106241884:546,374,0,1     0|1:169,0:5.752e-03:169:90,0:79,0:0|1:106241884_GTGGGGGTC_G:106241884:106,63,0,0
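
To find records like these in a larger VCF, a small script can pull MPOS out of the INFO column and flag negative values. This is a minimal sketch, not a GATK utility: `parse_mpos` and `classify_mpos` are hypothetical helper names, the example lines are the records above with INFO trimmed to a few keys, and the parser assumes whitespace-separated columns with no whitespace inside any field.

```python
from typing import List, Optional

INT32_MIN = -2**31  # -2147483648


def parse_mpos(vcf_line: str) -> Optional[List[int]]:
    """Return the MPOS values (one per ALT allele) from a VCF data line.

    Returns None for header lines, short lines, or records without MPOS.
    """
    if vcf_line.startswith("#"):
        return None
    fields = vcf_line.split()  # tolerate tabs or runs of spaces
    if len(fields) < 8:
        return None
    for entry in fields[7].split(";"):  # column 8 is INFO
        key, _, value = entry.partition("=")
        if key == "MPOS":
            return [int(v) for v in value.split(",")]
    return None


def classify_mpos(mpos: int) -> str:
    """Label an MPOS value so suspicious ones stand out."""
    if mpos >= 0:
        return "ok"
    if mpos == INT32_MIN:
        return "int32-min"
    if mpos == INT32_MIN // 2:  # -1073741824, also seen above
        return "half-int32-min"
    return "negative"


# Records from this thread, with INFO trimmed to a few keys.
records = [
    "chr12\t49415383\t.\tT\tA\t.\tPASS\tDP=177;MPOS=-2147483648;TLOD=12.18",
    "chr6\t134494626\t.\tC\tCTTCTTGAAAGTGATCGGAAAGGGCAGTTTTGGAAAGGTAA\t.\tPASS\tDP=2425;MPOS=-1073741824;TLOD=7.56",
    "chr12\t113515951\t.\tG\tA\t.\tPASS\tDP=1368;MPOS=-1;TLOD=5.69",
    "chr11\t108163661\t.\tT\tC\t.\tPASS\tDP=293;MPOS=-6;TLOD=11.62",
]

for line in records:
    chrom, pos = line.split()[:2]
    for mpos in parse_mpos(line) or []:
        print(chrom, pos, mpos, classify_mpos(mpos))
```

Separating the two special values from the small negatives (-1, -6) is a design choice for triage only: the first group matches 32-bit limits, while the thread does not say what produces the second group.
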
  • bhanuGandham, Cambridge MA, Member, Administrator, Broadie, Moderator admin

    Hi @amjadd

    That does look odd; MPOS values should not be negative. Can you please post the GATK version and the exact Mutect2 command you are using? This will help us narrow down the possible causes.

  • amjadd, Finland, Member

    Hi @bhanuGandham
    It was GATK Here is the command:

    gatk4 Mutect2 -R $ref -I $bamNormal -I $bamTumor1 -I $bamTumor2 -O $vcfOut \
    --max-reads-per-alignment-start 0 --pcr-indel-model HOSTILE \
    --bam-output $bamOut  --germline-resource $gnomad --panel-of-normals $pon \
     -L $intervals -ip 300 --f1r2-tar-gz $f1r2Out -normal $normalName
  • bhanuGandham, Cambridge MA, Member, Administrator, Broadie, Moderator admin
    edited September 4

    Hi @amjadd

    As mentioned in the tool docs for Mutect2,

    As of v4.1 Mutect2 supports joint calling of multiple tumor and normal samples from the same individual. The only difference is that -I and -normal must be specified for the extra samples.

    In your command line, it looks like you are providing a tumor-normal pair for only one sample and not for the other. This might be the source of the issue. Can you please fix this and try again?

  • amjadd, Finland, Member

    Hi @bhanuGandham

    Sorry, I don't think you are right here. You specify as many input BAMs as you need with -I $bam, and then you specify which ones are normal with -normal $normalSampleName. In my case I have only one normal and multiple tumors.

    The statement you listed is basically making the distinction from Mutect2 before 4.1, when we had to specify -normal $normalName and -tumor $tumorName for every input bam.

  • amjadd, Finland, Member

    @bhanuGandham see for example a case of 3 tumors and 2 normals from

    @davidben said:
    @FPBarthel @cbao As of yesterday's GATK 4.1 release Mutect2 now has this multi-sample mode. You can input an arbitrary number of tumor and normal samples as follows:

    gatk  Mutect2 -R ref.fasta -pon $pon -O calls.vcf \
       -I tumor1.bam -I tumor2.bam -I tumor3.bam \
       -I normal1.bam -I normal2.bam \
       -normal normal1_sample_name -normal normal2_sample_name
    gatk FilterMutectCalls -V calls.vcf -O filtered.vcf

    Note that you now only specify the sample names of the normals. The output is still a multi-sample vcf.

    As this is a new feature we are very interested in user feedback.
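
The convention described above (every BAM goes in with -I, and only the normals are named with -normal) can be captured in a small command builder. This is a sketch under stated assumptions: `build_mutect2_args` is a hypothetical helper, it uses only the flags that appear in this thread (-R, -I, -O, -pon, -normal), and it builds and prints the command without running GATK.

```python
import shlex
from typing import List, Sequence


def build_mutect2_args(
    reference: str,
    bams: Sequence[str],
    normal_sample_names: Sequence[str],
    output_vcf: str,
    pon: str = "",
) -> List[str]:
    """Assemble a multi-sample Mutect2 command line (GATK 4.1+ style).

    All BAMs, tumor and normal alike, are passed with -I. Only the
    normal samples are identified, by sample name, with -normal.
    """
    args = ["gatk", "Mutect2", "-R", reference, "-O", output_vcf]
    if pon:
        args += ["-pon", pon]
    for bam in bams:
        args += ["-I", bam]
    for name in normal_sample_names:
        args += ["-normal", name]
    return args


# Three tumors and two normals, as in the quoted example.
args = build_mutect2_args(
    reference="ref.fasta",
    bams=["tumor1.bam", "tumor2.bam", "tumor3.bam", "normal1.bam", "normal2.bam"],
    normal_sample_names=["normal1_sample_name", "normal2_sample_name"],
    output_vcf="calls.vcf",
    pon="pon.vcf.gz",  # placeholder path, not from the thread
)
print(shlex.join(args))  # shlex.join requires Python 3.8+
```

With one normal and two tumors, as in the original question, the same function would emit three -I arguments and a single -normal.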

  • bhanuGandham, Cambridge MA, Member, Administrator, Broadie, Moderator admin


    You might be right. I apologize for that. I will need to dig into this more and get back to you.

  • 2904359495, Member

    @amjadd, I see that you added the argument --max-reads-per-alignment-start. Does this affect your final result? Thanks a lot.

  • amjadd, Finland, Member

    @davidben Thank you for the answer. I'll trust those variants for now then.

  • amjadd, Finland, Member

    @2904359495 Yes, --max-reads-per-alignment-start has a big effect on my results. Setting it to 0 basically disables downsampling of reads. It is useful when you have targeted sequencing with ultra-high depth.
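
To see why this matters at ultra-high depth, here is a toy model of capping reads per alignment start. It is an illustration only and not GATK's downsampler: the read model, the function `cap_reads_per_start`, the random sampling strategy, and the example cap of 50 are all assumptions made for this sketch. The one behavior taken from the thread is that a value of 0 disables downsampling.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical minimal read model: (read name, alignment start).
Read = Tuple[str, int]


def cap_reads_per_start(
    reads: Sequence[Read], max_per_start: int, seed: int = 0
) -> List[Read]:
    """Keep at most max_per_start reads for each alignment start.

    A cap of 0 means "no downsampling": every read is kept.
    """
    if max_per_start == 0:
        return list(reads)
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_start: Dict[int, List[Read]] = defaultdict(list)
    for read in reads:
        by_start[read[1]].append(read)
    kept: List[Read] = []
    for start in sorted(by_start):
        group = by_start[start]
        if len(group) > max_per_start:
            group = rng.sample(group, max_per_start)
        kept.extend(group)
    return kept


# Amplicon-like pileup: many reads share the same start position.
reads = [(f"r{i}", 100) for i in range(200)] + [(f"s{i}", 101) for i in range(3)]

print(len(cap_reads_per_start(reads, 50)))  # capped at the shared start
print(len(cap_reads_per_start(reads, 0)))   # nothing discarded
```

In targeted or amplicon data, where most reads begin at a handful of positions, any cap discards a large share of the evidence, which is consistent with the big effect reported above.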

  • 2904359495, Member

    @amjadd, can you describe the big effect? More PASS variants, or something else? Thanks a lot.

  • 2904359495, Member
    edited September 6

    So, for better results, should I add --max-reads-per-alignment-start 0 all the time?
    Someone asked this question a long time ago, and someone replied that there would be some side effects (besides the longer run time), so I want to confirm with @davidben.
    Thanks a lot.

  • 2904359495, Member
    edited September 11

    @davidben Can you comment on this? Does --max-reads-per-alignment-start 0 have any negative effects on the final result, apart from the longer run time?
    Thanks a lot.
