GATK GenotypeGVCFs misses a real deletion

ahda (China, Member)

I used GATK HaplotypeCaller in GVCF mode to call SNVs on WES data, and found that a variant present in a family was dropped at the GenotypeGVCFs step.
sample1 and sample2 are from the same family: sample1 is the child and sample2 is the mother. HaplotypeCaller called a variant at chr4:41747989 in both samples, but it is missing after the GenotypeGVCFs step. My commands are below.

1) Step 1, run once per sample:
for sample in sample1 sample2 sampleN; do
  # Emit a per-sample gVCF plus the reassembled BAM for inspection in IGV
  /usr/local/gatk-4.1.4.0/gatk --java-options "-Xmx12g" HaplotypeCaller --emit-ref-confidence GVCF -R GRCh37.fasta -I gatk_bqsr/${sample}.bqsr.bam -L region.bed -O gatk_haploty/haplotype/${sample}.haplo.gvcf.gz -bamout gatk_haploty/haplotype/${sample}.haplo.bam
done
2) Step 2, joint genotyping:
/usr/local/gatk-4.1.4.0/gatk --java-options "-Xmx12g" GenotypeGVCFs -R GRCh37.fasta -V gatk_haploty/haplotype/sample1.haplo.gvcf.gz -V gatk_haploty/haplotype/sample2.haplo.gvcf.gz -V gatk_haploty/haplotype/sampleN.haplo.gvcf.gz -O gatk_haploty/haplotype/population.vcf.gz
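One thing that may be worth checking against step 2: the GATK4 documentation describes GenotypeGVCFs as taking a single input (one gVCF, a CombineGVCFs output, or a GenomicsDB workspace), so per-sample gVCFs are normally merged first. The following is only a sketch of that two-stage layout, not the pipeline actually run here. The function name `joint_genotype` and the intermediate `*.cohort.g.vcf.gz` file name are made up for illustration, and by default the script just prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch only: merge per-sample gVCFs with CombineGVCFs, then genotype the
# merged file. The GATK path follows the commands in this post; the
# intermediate file name is a hypothetical choice.
GATK=${GATK:-/usr/local/gatk-4.1.4.0/gatk}
RUN=${RUN-echo}   # prints the commands by default; export RUN= to execute them

joint_genotype() {
  local ref=$1 out=$2
  shift 2
  local combined=${out%.vcf.gz}.cohort.g.vcf.gz
  local args=() g
  # One -V per input gVCF for the merge step
  for g in "$@"; do args+=(-V "$g"); done
  $RUN "$GATK" CombineGVCFs -R "$ref" "${args[@]}" -O "$combined"
  # GenotypeGVCFs then receives exactly one -V input
  $RUN "$GATK" GenotypeGVCFs -R "$ref" -V "$combined" -O "$out"
}
```

For example, `joint_genotype GRCh37.fasta population.vcf.gz sample1.haplo.gvcf.gz sample2.haplo.gvcf.gz` prints one CombineGVCFs command followed by one GenotypeGVCFs command.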

After step 1, the variant records are:
M40270.haplo.gvcf.gz:
chr4 41747989 . AGCTGCCGCCGCTGCC A,<NON_REF> 4370.60 . BaseQRankSum=-0.800;DP=239;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=0.000;RAW_MQandDP=860400,239;ReadPosRankSum=4.127 GT:AD:DP:GQ:PL:SB 0/1:73,110,0:183:99:4378,0,3026,4597,3366,7963:10,63,22,88
M40272.haplo.gvcf.gz:
chr4 41747989 . AGCTGCCGCCGCTGCC A,<NON_REF> 6641.60 . BaseQRankSum=-6.334;DP=301;ExcessHet=3.0103;MLEAC=1,0;MLEAF=0.500,0.00;MQRankSum=-0.643;RAW_MQandDP=1082809,301;ReadPosRankSum=3.609 GT:AD:DP:GQ:PL:SB 0/1:71,167,0:238:99:6649,0,2716,6863,3228,10091:3,68,44,123
After step 2, neither record is present in the output.
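Before concluding that the site is gone, it may help to look for any record in the joint VCF whose REF allele covers chr4:41747989, since a deletion could in principle be reported at a shifted POS or folded into a longer overlapping record. A minimal sketch, assuming plain tab-separated VCF text on stdin; the function name `site_records` is hypothetical.

```shell
#!/usr/bin/env bash
# site_records: hypothetical helper. Reads VCF text on stdin and prints
# "CHROM POS REF ALT" for every record on the given chromosome whose REF
# allele spans the query position (POS <= query < POS + length(REF)).
site_records() {
  awk -F'\t' -v chrom="$1" -v pos="$2" '
    /^#/ { next }                       # skip header lines
    $1 == chrom && $2 <= pos && pos < $2 + length($4) {
      print $1, $2, $4, $5
    }'
}
```

Usage against the step 2 output would look like `gzip -dc gatk_haploty/haplotype/population.vcf.gz | site_records chr4 41747989`. No output means no record in that file overlaps the position.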

I checked sample1.haplo.bam and sample2.haplo.bam in IGV, as below:

I think this is a very obvious variant. Why did GATK filter it out?
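The AD fields in the two step 1 records give a rough measure of the read support for the 15-bp deletion (16-base REF, 1-base ALT). Here is a small sketch that extracts them; `alt_fraction` is a hypothetical helper, and it assumes a single-sample, tab-separated record with FORMAT in column 9 and the sample in column 10.

```shell
#!/usr/bin/env bash
# alt_fraction: hypothetical helper. Reads single-sample VCF records on stdin
# and prints "REF_reads ALT_reads ALT_fraction" using the AD field, where the
# fraction is ALT / (REF + ALT) for the first ALT allele.
alt_fraction() {
  awk -F'\t' '
    /^#/ { next }
    {
      n = split($9, keys, ":")          # FORMAT keys, e.g. GT:AD:DP:...
      split($10, vals, ":")             # matching sample values
      for (i = 1; i <= n; i++) {
        if (keys[i] == "AD") {
          split(vals[i], ad, ",")       # ad[1] = REF depth, ad[2] = first ALT
          if (ad[1] + ad[2] > 0)
            printf "%d %d %.3f\n", ad[1], ad[2], ad[2] / (ad[1] + ad[2])
        }
      }
    }'
}
```

Fed the M40270 record above it prints `73 110 0.601`, and for the M40272 record `71 167 0.702`, so in both samples the deletion is carried by well over half of the informative reads.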
Thank you in advance.
