Problems interpreting Mutect2 output
I am calling variants with Mutect2 (default parameters) from bulk WGS Tumor/Normal pairs following the Somatic SNV Best Practices, and in the VCF outputs I am finding a lot of variants like this (the last sample column is the tumor):
chr1 1037759 . CTT C . PASS ECNT=1;HCNT=1;MAX_ED=.;MIN_ED=.;NLOD=4.60;RPA=12,10;RU=T;STR;TLOD=26.17 GT:AD:AF:ALT_F1R2:ALT_F2R1:QSS:REF_F1R2:REF_F2R1 0/0:31,0:NaN:0:0:0,0:0:0 0/1:13,21:1.00:8:13:0,611:0:0
The genotype reported for the tumor is heterozygous, yet the AF is 1.00. I also see that the QSS for the reference allele is 0, although I checked in IGV that the base and mapping qualities at this position are normal for both reference-supporting and alternate-supporting reads, and that the reads are primary alignments with their mates mapped. Up to 14% of my calls have AF=1.00, which seems very odd for this type of analysis.
It doesn't happen for all the deletions, though:
chr1 1128849 . CTT C . PASS ECNT=1;HCNT=2;MAX_ED=.;MIN_ED=.;NLOD=3.77;RPA=11,9;RU=T;STR;TLOD=20.27 GT:AD:AF:ALT_F1R2:ALT_F2R1:QSS:REF_F1R2:REF_F2R1 0/0:24,0:0.00:0:0:60,0:0:2 0/1:20,13:0.867:6:7:60,403:2:0
In this case QSS is not 0, and I feel that it could have some relationship: (13*403)/((20*60)+(13*403)) = 0.814, close to the AF.
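To make that arithmetic easy to repeat on other records, here is a minimal Python sketch. It only encodes my guess (each allele's AD multiplied by its QSS, then the alt share of the total); it is not taken from the Mutect2 source, and the function name `qss_weighted_fraction` is made up for illustration.

```python
def qss_weighted_fraction(ad, qss):
    """Alt share of AD*QSS, as in the hand calculation above.

    ad  -- (ref_count, alt_count) from the AD field
    qss -- (ref_qss, alt_qss) from the QSS field
    This is a hypothesis about how AF might relate to QSS,
    not a confirmed description of what Mutect2 does.
    """
    ref_weight = ad[0] * qss[0]
    alt_weight = ad[1] * qss[1]
    total = ref_weight + alt_weight
    if total == 0:
        return None  # no usable evidence for either allele
    return alt_weight / total


# Tumor sample of the chr1:1128849 record: AD=20,13 and QSS=60,403
print(round(qss_weighted_fraction((20, 13), (60, 403)), 3))

# Tumor sample of the chr1:1037759 record: AD=13,21 and QSS=0,611
print(qss_weighted_fraction((13, 21), (0, 611)))
```

With a reference QSS of 0, the reference term vanishes and the fraction is 1 regardless of how many reference reads AD reports, which is the same pattern as the AF=1.00 calls.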
Most of the cases are indels, but not all of them:
chr2 28505882 . A G . PASS ECNT=1;HCNT=8;MAX_ED=.;MIN_ED=.;NLOD=4.00;TLOD=23.46 GT:AD:AF:ALT_F1R2:ALT_F2R1:FOXOG:QSS:REF_F1R2:REF_F2R1 0/0:16,0:0.00:0:0:.:149,0:4:1 0/1:2,15:1.00:1:0:0.00:0,30:0:0
So in summary, I don't understand how AF is calculated. Am I misunderstanding the AF concept, or the way it works for this type of sample? Or am I missing the reason why QSS=0? Should I use AD for my calculations instead?
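For reference, this is roughly how I would compute an allele fraction from AD myself. It is only a sketch in Python: the helper names `parse_sample` and `ad_fraction` are hypothetical, and it assumes a biallelic record where AD holds exactly the ref and alt read counts.

```python
def parse_sample(format_field, sample_field):
    """Pair the colon-separated FORMAT keys with one sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def ad_fraction(sample):
    """Alt read fraction from AD (ref,alt); None when there are no reads.

    Assumes a biallelic site, so AD has exactly two comma-separated counts.
    """
    ref, alt = (int(x) for x in sample["AD"].split(","))
    total = ref + alt
    return alt / total if total else None


# FORMAT and tumor columns copied from the chr1:1037759 record
FORMAT = "GT:AD:AF:ALT_F1R2:ALT_F2R1:QSS:REF_F1R2:REF_F2R1"
tumor = parse_sample(FORMAT, "0/1:13,21:1.00:8:13:0,611:0:0")

# Compare the AD-based fraction with the AF value written in the VCF
print(ad_fraction(tumor), tumor["AF"])
```

For this record the AD-based fraction is 21/34, well below the reported AF of 1.00, which is exactly the discrepancy I am asking about.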
Thank you very much!
GATK version: 3.7-0-gcfedb67
Java version: 1.8.0_31
WGS paired end samples
Bulk Tumor/Normal pairs
Sequenced with HiSeqX using TruSeq Nano DNA (350) library kit