
A problem with the FILTER flag "alignment": it filtered some TP variants (Sanger-verified)

ahdaahda ChinaMember

I tested my GATK 4.1.0.2 Mutect2 best-practice workflow on paired gastric cancer WES data (normal N990005 / tumor T990005) from a 2012 Nature Genetics paper (PMID: 22484628).
I downloaded the data from here:
https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR504685#
https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR504686#

The paper reports 8 SNVs (actually 9, with 1 filtered) and 2 indels (actually 3, with 1 filtered), all of which were verified by Sanger sequencing.
The authors aligned with bwa against hg18, so I converted them to hg19.
I found that 2 of these records are filtered as "alignment":
chr15 79056976 . G A . alignment CONTQ=93;DP=59;ECNT=1;GERMQ=93;MBQ=32,33;MFRL=153,141;MMQ=60,60;MPOS=4;NALOD=1.60;NLOD=11.03;POPAF=6.00;RCNTS=0,4;ROQ=45;SEQQ=76;STRANDQ=72;TLOD=14.19 GT:AD:AF:DP:F1R2:F2R1:SB 0/0:37,0:0.025:37:16,0:19,0:18,19,0,0 0/1:11,7:0.394:18:5,3:5,3:5,6,3,4
chrX 101971433 . C T . alignment CONTQ=93;DP=545;ECNT=2;GERMQ=93;MBQ=30,33;MFRL=121,148;MMQ=60,60;MPOS=18;NALOD=2.41;NLOD=75.78;POPAF=3.19;RCNTS=0,4;ROQ=90;SEQQ=93;STRANDQ=93;TLOD=112.62 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|0:252,0:3.892e-03:252:130,0:107,0:0|1:101971433_C_T:101971433:126,126,0,0 0|1:236,41:0.150:277:127,24:100,16:0|1:101971433_C_T:101971433:137,99,24,17
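For anyone who wants to check such records programmatically, here is a minimal sketch (not part of GATK) that pulls the FILTER value and the tumor AD/AF out of one VCF data line. The function name `parse_record` and the sample labels `normal` and `tumor` are my own assumptions; the sketch assumes the first sample column is the normal, as the 0/0 genotype above suggests, and it splits on any whitespace because the records as pasted here are space-separated rather than tab-separated.

```python
def parse_record(line, sample_names):
    """Split one VCF data line into fixed fields and per-sample values.

    sample_names are hypothetical labels given in column order; a real
    VCF carries the sample names in its #CHROM header line.
    """
    cols = line.split()  # any whitespace, since the pasted records use spaces
    chrom, pos, _id, ref, alt, _qual, filt, _info, fmt = cols[:9]
    keys = fmt.split(":")
    samples = {
        name: dict(zip(keys, raw.split(":")))
        for name, raw in zip(sample_names, cols[9:])
    }
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        # "." and "PASS" both mean that no filter was applied
        "filters": [] if filt in (".", "PASS") else filt.split(";"),
        "samples": samples,
    }


# The chr15 record from the post, copied as-is.
record = (
    "chr15 79056976 . G A . alignment "
    "CONTQ=93;DP=59;ECNT=1;GERMQ=93;MBQ=32,33;MFRL=153,141;MMQ=60,60;MPOS=4;"
    "NALOD=1.60;NLOD=11.03;POPAF=6.00;RCNTS=0,4;ROQ=45;SEQQ=76;STRANDQ=72;TLOD=14.19 "
    "GT:AD:AF:DP:F1R2:F2R1:SB "
    "0/0:37,0:0.025:37:16,0:19,0:18,19,0,0 "
    "0/1:11,7:0.394:18:5,3:5,3:5,6,3,4"
)

parsed = parse_record(record, ["normal", "tumor"])
tumor = parsed["samples"]["tumor"]
ref_reads, alt_reads = (int(x) for x in tumor["AD"].split(","))
print(parsed["filters"], ref_reads, alt_reads, tumor["AF"])
```

This only reads the record; it does not say why FilterAlignmentArtifacts flagged the site.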

My FilterAlignmentArtifacts command is below:
gatk FilterAlignmentArtifacts -V gatk_mutect2/S086/S086.m2_oncefilt.vcf.gz -I gatk_mutect2/S086/S086.m2.sort.bam --bwa-mem-index-image gatk_db_hg38/Homo_sapiens_assembly38.fasta.img -O gatk_mutect2/S086/S086.m2_twicefilt.vcf.gz
My whole Mutect2 workflow can be found here:
https://gatkforums.broadinstitute.org/gatk/discussion/24237/a-problem-about-strand-bias-filter-in-mutect2#latest

So is this a bug in FilterAlignmentArtifacts, or an expected sacrifice in order to drop FP variants?
I would really appreciate any advice. Thanks a lot.

By the way, I compared the results the paper reported in 2012 with those from my GATK 4.1 workflow.

I checked 119 variants unique to the GATK 4.1 calls; their average AF is 10% and their average depth is 210X, which seems very good.
So I think these variants may be due to the hg18 reference. Is that possible?
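A comparison like this can be sketched as a set operation on (chrom, pos, ref, alt) keys. This is only an illustration: the helper names `variant_key` and `compare_call_sets` are made up, the two shared sites are the records quoted above, and the `chr1` call is invented toy data. It assumes both call sets are already on the same reference build; if one set is still in hg18 coordinates, matching sites will look unique.

```python
from statistics import mean


def variant_key(chrom, pos, ref, alt):
    """Normalise a variant to a hashable key for set comparison."""
    return (chrom, int(pos), ref.upper(), alt.upper())


def compare_call_sets(published, called):
    """Return (shared, unique_to_called, missed) as sorted key lists.

    Each item is a tuple starting with chrom, pos, ref, alt; any extra
    fields (AF, depth) are ignored for matching.
    """
    pub = {variant_key(*v[:4]) for v in published}
    mine = {variant_key(*v[:4]) for v in called}
    return sorted(pub & mine), sorted(mine - pub), sorted(pub - mine)


# Published set: the two Sanger-verified sites quoted in this post.
published = [
    ("chr15", 79056976, "G", "A"),
    ("chrX", 101971433, "C", "T"),
]
# Called set: the same two sites with tumor AF and depth from the records,
# plus one invented call (chr1) used only to show the "unique" branch.
called = [
    ("chr15", 79056976, "G", "A", 0.394, 18),
    ("chrX", 101971433, "C", "T", 0.150, 277),
    ("chr1", 1000, "A", "T", 0.10, 210),  # toy value, not real data
]

shared, unique, missed = compare_call_sets(published, called)
unique_rows = [v for v in called if variant_key(*v[:4]) in set(unique)]
print(len(shared), unique, missed)
print(mean(v[4] for v in unique_rows), mean(v[5] for v in unique_rows))
```

Matching on exact position and alleles is the simplest choice; indels may need left-alignment and normalisation before their keys agree.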
