Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.

How to explain F1R2 plus F2R1 NOT equal AD in Mutect2 vcf ?

we are using MuTect2 to call somatic SNVs and indels (gatk v 4.1.2.0). we find error in F1R2 plus F2R1 not equal AD. The follow is our command:
gatk Mutect2 -R /home/MuTect2/genome/hg19.fa \
-I /home/MuTect2/DATA/01_BQSR/LCT37.sorted.dedup.recal.bam \
-I /home/MuTect2/DATA/01_BQSR/LPT37.sorted.dedup.recal.bam -normal LPT37 \
--germline-resource /home/MuTect2/group_37_result/0_known_sites/af-only-gnomad.raw.sites.hg19.vcf \
--panel-of-normals /home/MuTect2/DATA/02_Mutect/PON/PON1/37.pon1.vcf.gz \
--f1r2-tar-gz LPT37_LCT37.f1r2.tar.gz --disable-read-filter MateOnSameContigOrNoMappedMateReadFilter \
--bam-output LPT37_LCT37.bam -O Pon1.LPT37.N-LCT37.T.unfilter.somatic.vcf

the Mutect2 vcf is
chr1 13613 . T A . . DP=797;ECNT=1;MBQ=35,35;MFRL=264,316;MMQ=25,24;MPOS=65;NALOD=1.56;NLOD=112.50;POPAF=1.70;TLOD=4.49 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:386,8:0.022:394:192,4:192,4:67,319,7,1 0/0:392,2:6.578e-03:394:187,1:202,1:59,333,2,0
for LCT37:

ref_AD=386
alt_AD=8

ref_F1R2=192;ref_F2R1=192
alt_F1R2=4,alt_F2R1=4
ref_ F1R2 + ref_F2R1=384
not equal ref_AD (351)?

Answers

  • TiaTia Member
  • bhanuGandhambhanuGandham Cambridge MAMember, Administrator, Broadie, Moderator admin

    Hi @Tia

    Can you please post the header for this vcf file which describes the F2R1 and F1R2 annotations.

  • TiaTia Member
    ok ,the header annotations for F2R1 and F1R2 is follwing:
    ##FORMAT=<ID=F1R2,Number=R,Type=Integer,Description="Count of reads in F1R2 pair orientation supporting each allele">
    ##FORMAT=<ID=F2R1,Number=R,Type=Integer,Description="Count of reads in F2R1 pair orientation supporting each allele">
  • TiaTia Member
    ok ,the header annotations for F2R1 and F1R2 is follwing:
    ##FORMAT=<ID=F1R2,Number=R,Type=Integer,Description="Count of reads in F1R2 pair orientation supporting each allele">
    ##FORMAT=<ID=F2R1,Number=R,Type=Integer,Description="Count of reads in F2R1 pair orientation supporting each allele">
  • TiaTia Member
    ok ,the header annotations for F2R1 and F1R2 is follwing
    F1R2,Number=R,Type=Interage,Description="Count of reads in F1R2pair orientation supporting each allele"
    F2R1,Number=R,Type=Integer,Description="Count of reads in F2R1 pair orientation supporting each allele"
  • xiuczxiucz Member

    hi @bhanuGandham,
    Could you please take a look at this? Thank you very much.

  • bhanuGandhambhanuGandham Cambridge MAMember, Administrator, Broadie, Moderator admin

    Hi @Tia

    We are looking into this and will get back to you shortly.

  • bhanuGandhambhanuGandham Cambridge MAMember, Administrator, Broadie, Moderator admin

    Hi @Tia

    I checked with Mutect2(M2) developer and he said: Since F1R2 is a Mutect2-only annotation it's smarter about paired reads than AD, which is a HaplotypeCaller (and Mutect2) annotation. The difference occurs because in Mutect2 support for a variant is accounted for per read pair, not per read. Thus a read that doesn't overlap a variant is considered uninformative by HC and in AD, but if it's mate is informative M2 may treat both reads as variant-supporting and thus they are accounted for by F1R2 and other M2 annotations.

  • TiaTia Member
    Hi
    In my opinion , a read will not included in AD which doesn't overlap a variant , but it will included in F1R2 or F2R1. Because it’s mate must be informative . So the number of AD is smaller than F1R2 plus F2R1. But we always observed the number of AD is larger than F1R2 plus F2R1 in fact. Did I misunderstand something ? :)
Sign In or Register to comment.