GATK4 - CreateSomaticPanelOfNormals: what's under the hood?
I am using GATK4 (4.0.12) to create a panel of normals to be used with Mutect2.
As suggested, I have first created a VCF for each normal BAM that I have with Mutect2 in tumor-only mode. Then I use CreateSomaticPanelOfNormals to aggregate all these and get a PON. The only parameter you can play with at this step is
But when I look at one of the single-normal-sample VCFs, I see a lot of positions with very low read depth (many have DP=1, actually). Here is an example:
chr1 866940 . G A . . DP=134;ECNT=2;MBQ=10,33;MFRL=218,201;MMQ=35,49;MPOS=29;POPAF=0.220;TLOD=371.15 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:2,115:0.991:117:1,54:1,61:0.980,0.980,0.983:0.024,0.029,0.946
chr1 872261 . C A . . DP=1;ECNT=1;MBQ=0,34;MFRL=0,209;MMQ=0,60;MPOS=8;POPAF=5.40;TLOD=3.58 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,1:0.667:1:0,1:0,0:0.990,0.990,1.00:0.025,0.028,0.947
chr1 890446 . C CA . . DP=2;ECNT=1;MBQ=0,31;MFRL=0,144;MMQ=0,60;MPOS=24;POPAF=0.243;RPA=11,12;RU=A;STR;TLOD=4.88 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,2:0.750:2:0,0:0,2:0.990,0.990,1.00:0.027,0.027,0.947
chr1 893093 . T A . . DP=1;ECNT=1;MBQ=0,37;MFRL=0,178;MMQ=0,60;MPOS=39;POPAF=5.40;TLOD=3.88 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,1:0.667:1:0,0:0,1:0.990,0.990,1.00:0.025,0.028,0.947
chr1 900096 . G A . . DP=1;ECNT=1;MBQ=0,31;MFRL=0,156;MMQ=0,60;MPOS=34;POPAF=5.40;TLOD=3.28 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,1:0.667:1:0,0:0,1:0.990,0.990,1.00:0.028,0.025,0.947
chr1 916377 . A G . . DP=2;ECNT=1;MBQ=0,28;MFRL=0,232;MMQ=0,60;MPOS=38;POPAF=3.926e-03;TLOD=5.98 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,2:0.750:2:0,1:0,1:0.990,0.990,1.00:0.029,0.025,0.946
chr1 934937 . G A . . DP=1;ECNT=1;MBQ=0,34;MFRL=0,153;MMQ=0,60;MPOS=11;POPAF=0.385;TLOD=3.58 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,1:0.667:1:0,1:0,0:0.990,0.990,1.00:0.028,0.025,0.947
chr1 951408 . G A . . DP=5;ECNT=2;MBQ=0,37;MFRL=0,264;MMQ=0,60;MPOS=32;POPAF=0.102;TLOD=20.31 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,5:0.857:5:0,2:0,3:0.990,0.990,1.00:0.025,0.030,0.945
Basically, I was wondering how these positions with DP=1 (or, say, DP<10) are handled in the merge process:
- Are they removed/filtered out?
- If two distinct normal samples share the same position, both with DP=1, will it appear in the PON? In other words, are all these positions counted toward the total regardless of the DP, TLOD, ... fields?
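
For anyone who wants to check how common these low-depth records are in their own per-sample VCFs, here is a minimal sketch of how they could be tallied. It only reads the INFO-level DP field from plain VCF text lines; the function name `count_low_depth` and the `max_dp` threshold are my own illustrative choices, not anything from GATK, and it says nothing about what CreateSomaticPanelOfNormals itself does with these sites.

```python
import re
from collections import Counter


def count_low_depth(vcf_lines, max_dp=10):
    """Tally VCF records whose INFO DP is below max_dp.

    Hypothetical helper for illustration only. Returns a tuple
    (total_records, low_depth_records, Counter of DP values).
    """
    total = 0
    low = 0
    depths = Counter()
    for line in vcf_lines:
        line = line.strip()
        # Skip blank lines and VCF header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()  # works for tab- or space-separated columns
        if len(fields) < 8:
            continue
        # INFO is the 8th column; pull out the DP=<int> entry.
        match = re.search(r"(?:^|;)DP=(\d+)", fields[7])
        if match is None:
            continue
        dp = int(match.group(1))
        total += 1
        depths[dp] += 1
        if dp < max_dp:
            low += 1
    return total, low, depths


# Three of the records shown above, used as sample input.
example = [
    "chr1 866940 . G A . . DP=134;ECNT=2;MBQ=10,33;MFRL=218,201;MMQ=35,49;MPOS=29;POPAF=0.220;TLOD=371.15 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:2,115:0.991:117:1,54:1,61:0.980,0.980,0.983:0.024,0.029,0.946",
    "chr1 872261 . C A . . DP=1;ECNT=1;MBQ=0,34;MFRL=0,209;MMQ=0,60;MPOS=8;POPAF=5.40;TLOD=3.58 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,1:0.667:1:0,1:0,0:0.990,0.990,1.00:0.025,0.028,0.947",
    "chr1 890446 . C CA . . DP=2;ECNT=1;MBQ=0,31;MFRL=0,144;MMQ=0,60;MPOS=24;POPAF=0.243;RPA=11,12;RU=A;STR;TLOD=4.88 GT:AD:AF:DP:F1R2:F2R1:SAAF:SAPP 0/1:0,2:0.750:2:0,0:0,2:0.990,0.990,1.00:0.027,0.027,0.947",
]

total, low, depths = count_low_depth(example, max_dp=10)
print(f"{low} of {total} records have DP < 10")
```

For a real file, the same function can be fed the lines of an uncompressed VCF (or the lines of a `gzip.open(path, "rt")` handle for a `.vcf.gz`).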