False positive(?) variant calls for a long insertion (GATK 3.6)
First of all, I'm aware that calling long indels is usually problematic. But I just came across this issue and could not find an explanation for it, so I wanted to ask whether any of you know anything about this observation:
chr3 195507858 . G T,GGTGGATACTGAGGAAGTGTCGGTGACAGGAAGAGGGGTGGCGTGACCGGTGGATGCCGAGGAAGCGTCGGTGACAGGAAGAGGGGTGGTGTCACCTGTGGATACTGAGGAAAAGCTGGTGACAGGAAGAGGGGTGGCGTGACCT 157204 VQSRTrancheINDEL99.00to99.90 AC=0,2;AF=0.237,0.053;AN=2;BaseQRankSum=8.08;ClippingRankSum=0;DP=44310;ExcessHet=34.8579;FS=2.569;InbreedingCoeff=-0.153;MLEAC=63,13;MLEAF=0.24,0.05;MQ=33.48;MQRankSum=-0.033;NEGATIVE_TRAIN_SITE;QD=19.7;ReadPosRankSum=-0.672;SOR=0.898;VQSLOD=-2.182;culprit=MQRankSum GT:AD:DP:GQ:PGT:PID:PL 2/2:60,0,0:60:99:.:.:7071,890,2640,496,194,0
As you can see, it's a homozygous 2/2 genotype. However, when I look at the AD field, I cannot find any evidence for the variant; on the contrary, all 60 counted reads carry the reference allele. Given that, it's completely unclear to me how the caller arrives at this genotype.
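To cross-check the GT against the PL values, here is a minimal Python sketch. It assumes the diploid PL ordering from the VCF specification (the genotype j/k with j <= k sits at index k*(k+1)/2 + j); the function names are my own and not part of GATK. It only shows which genotype the PLs favour, not why the likelihoods disagree with AD.

```python
def pl_index_to_genotype(index):
    """Map a position in the PL array to a diploid genotype (j, k), j <= k.

    Assumes the ordering from the VCF specification:
    index = k * (k + 1) / 2 + j.
    """
    k = 0
    # Advance k while the first index belonging to k + 1 is still <= index.
    while (k + 1) * (k + 2) // 2 <= index:
        k += 1
    j = index - k * (k + 1) // 2
    return j, k


def most_likely_genotype(pl_field):
    """Return the genotype string with the smallest PL (0 = most likely)."""
    pls = [int(x) for x in pl_field.split(",")]
    best = min(range(len(pls)), key=pls.__getitem__)
    j, k = pl_index_to_genotype(best)
    return f"{j}/{k}"


# PL values copied from the record above (REF plus two ALT alleles).
print(most_likely_genotype("7071,890,2640,496,194,0"))
```

So the 2/2 call matches the PL field (the last entry is 0); the puzzling part is that the PLs themselves are not reflected in the AD counts.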
In another sample it looks like this:
chr3 195507858 . G GGTGGATACTGAGGAAGTGTCGGTGACAGGAAGAGGGGTGGCGTGACCGGTGGATGCCGAGGAAGCGTCGGTGACAGGAAGAGGGGTGGTGTCACCTGTGGATACTGAGGAAAAGCTGGTGACAGGAAGAGGGGTGGCGTGACCT 144784 VQSRTrancheINDEL99.00to99.90 AC=1;AF=0.042;AN=2;BaseQRankSum=3;ClippingRankSum=0;DP=44395;ExcessHet=39.0155;FS=2.58;InbreedingCoeff=-0.1729;MLEAC=10;MLEAF=0.038;MQ=33.27;MQRankSum=-0.746;NEGATIVE_TRAIN_SITE;QD=17.95;ReadPosRankSum=-0.672;SOR=0.897;VQSLOD=-2.312;culprit=MQRankSum GT:AD:DP:GQ:PGT:PID:PL 0/1:89,0:92:99:.:.:4731,0,2164
Same issue, but now it's suddenly heterozygous...
And in this sample:
chr3 195507858 . G GGTGGATACTGAGGAAGTGTCGGTGACAGGAAGAGGGGTGGCGTGACCGGTGGATGCCGAGGAAGCGTCGGTGACAGGAAGAGGGGTGGTGTCACCTGTGGATACTGAGGAAAAGCTGGTGACAGGAAGAGGGGTGGCGTGACCT 156872 PASS AC=2;AF=0.061;AN=2;BaseQRankSum=0.967;ClippingRankSum=0;DP=42059;ExcessHet=33.8845;FS=3.403;InbreedingCoeff=-0.1488;MLEAC=14;MLEAF=0.053;MQ=34.6;MQRankSum=-1.012;QD=19.08;ReadPosRankSum=-0.692;SOR=0.827;VQSLOD=-0.3066;culprit=MQ GT:AD:DP:GQ:PGT:PID:PL 1/1:39,27:66:99:.:.:6872,496,0
Although there is (in my opinion) evidence for a 0/1 genotype here (AD=39,27), it is called as 1/1.
It would be great if someone could share their opinion on this. Maybe I'm just not up to date, but I think these specific genotypes really make no sense.
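To make the mismatch explicit for all three samples, here is a small Python sketch that compares each GT with its AD counts. It assumes a called (non-missing) diploid GT and the FORMAT layout shown in the records above; the helper names are hypothetical and not part of GATK.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys with the sample's values, e.g. {'GT': '1/1', ...}."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def allele_support(format_field, sample_field):
    """Compare the called genotype with the per-allele read counts in AD.

    Returns two sorted lists of allele indices:
      - alleles present in GT that have an AD count of zero
      - alleles with a non-zero AD count that are absent from GT
    Assumes GT is called (no '.' alleles).
    """
    fields = parse_sample(format_field, sample_field)
    called = [int(a) for a in fields["GT"].replace("|", "/").split("/")]
    ad = [int(x) for x in fields["AD"].split(",")]
    called_without_reads = sorted({a for a in called if ad[a] == 0})
    uncalled_with_reads = sorted(
        a for a, n in enumerate(ad) if n > 0 and a not in called
    )
    return called_without_reads, uncalled_with_reads


FORMAT = "GT:AD:DP:GQ:PGT:PID:PL"
# Sample columns copied from the three records in this post.
samples = [
    "2/2:60,0,0:60:99:.:.:7071,890,2640,496,194,0",
    "0/1:89,0:92:99:.:.:4731,0,2164",
    "1/1:39,27:66:99:.:.:6872,496,0",
]
for sample in samples:
    no_reads, not_called = allele_support(FORMAT, sample)
    print(sample.split(":")[0], "called without AD support:", no_reads,
          "| AD support but not called:", not_called)
```

The first two samples have a called ALT allele with zero AD support, and the third has 39 reference reads in AD that the 1/1 call ignores.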