Force skip INDELs in GATK output
Dear GATK team, I am using GATK4 with the following command:
gatk HaplotypeCaller -R $reference -I in.bam -ERC BP_RESOLUTION --max-alternate-alleles 1 -O out.raw.snps.g.vcf -L $SNP_TARGET
In some regions I get curious output that disagrees with other variant callers and with what I see in IGV.
SNP target is:
chr1 152145740 152145741 rs12144907
Result from GATK in a few samples is:
chr1 152145741 . G GGAGAGC,<NON_REF> 3985.64 . BaseQRankSum=-1.051;DP=317;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=-0.789;RAW_MQandDP=1131778,317;ReadPosRankSum=1.158 GT:AD:DP:GQ:PL:SB 0/1:138,172,0:310:99:3993,0,3067,4406,3603,8009:76,62,98,74
But in IGV I see a G/C genotype at this position (total count is 406; C is 208 and G is 197). I can also see a few inserted bases (the GAGAGC insertion that GATK calls), but only in about 3 reads (note: I am not downsampling the BAM in IGV).
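To make the mismatch concrete, here is a minimal Python sketch that reads the per-allele depths (AD) from the GATK sample column quoted above and prints each allele's share of the reads. The function name `allele_fractions` is only illustrative, and the values are copied by hand from the record rather than parsed from the actual VCF file.

```python
# Values copied by hand from the GATK record quoted above (not read from a file).
FORMAT = "GT:AD:DP:GQ:PL:SB"
SAMPLE = "0/1:138,172,0:310:99:3993,0,3067,4406,3603,8009:76,62,98,74"
ALLELES = ["G", "GGAGAGC", "<NON_REF>"]  # REF first, then the ALT alleles


def allele_fractions(fmt, sample, alleles):
    """Return {allele: (depth, fraction)} using the AD field of one sample column."""
    # Pair each FORMAT key with its value in the sample column.
    fields = dict(zip(fmt.split(":"), sample.split(":")))
    depths = [int(d) for d in fields["AD"].split(",")]
    total = sum(depths)
    return {
        allele: (depth, depth / total if total else 0.0)
        for allele, depth in zip(alleles, depths)
    }


if __name__ == "__main__":
    for allele, (depth, frac) in allele_fractions(FORMAT, SAMPLE, ALLELES).items():
        print(f"{allele}\t{depth}\t{frac:.3f}")
```

According to AD, GATK assigns 172 of 310 reads to the insertion allele, whereas in IGV the insertion is visible in only about 3 reads.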
Result from bcftools is:
chr1 152145741 . G C,A,<*> 0 . DP=381;I16=102,28,110,23,5898,308852,6254,334182,7562,446018,7880,470048,2518,56072,2784,64140;QS=0.48217,0.514629,0.00320142,0;VDB=2.97088e-34;SGB=-0.693147;RPB=0.0605761;MQB=0.804333;MQSB=0.764982;BQB=0.578277;MQ0F=0 PL 255,0,255,255,255,255,255,255,255,255
This seems consistent with what I observe in IGV. Note: I need the -ERC BP_RESOLUTION parameter.
Any idea how to configure GATK to avoid this odd call in this region (and a few others)? Is there a way to force it to skip insertions?
Thank you for any help and reply!