HaplotypeCaller in GVCF mode for trio

emixaM (Canada, Member)
edited February 2017 in Ask the GATK team

Hello!

I am setting up a GATK pipeline for trio analysis. All the steps ran without errors: HaplotypeCaller, then GenotypeGVCFs, and finally the PhaseByTransmission walker.

However, while manually checking a few variants, I was puzzled to find that all three family members have reads carrying the variant base, yet the variant is called only in the father.

On the attached screenshot, we can see the variants (41/23 for the mother, 22/15 for the daughter, 22/18 for the father).

However, the g.vcf files show the following at this position:

alignment/NA12878_daughter/rawHaplotypeCaller/NA12878_daughter.others.hc.g.vcf:1 201178788 . A <NON_REF> . . END=201178788 GT:DP:GQ:MIN_DP:PL 0/0:28:0:28:0,0,449
alignment/NA12891_father/rawHaplotypeCaller/NA12891_father.others.hc.g.vcf:1 201178788 . A G,<NON_REF> 173.77 . BaseQRankSum=-1.949;ClippingRankSum=1.169;DP=34;MLEAC=1,0;MLEAF=0.500,0.00;MQ=56.41;MQ0=0;MQRankSum=-1.772;ReadPosRankSum=-1.701 GT:AD:DP:GQ:PL:SB 0/1:21,13,0:34:99:202,0,560,262,596,858:10,11,2,11
alignment/NA12892_mother/rawHaplotypeCaller/NA12892_mother.others.hc.g.vcf:1 201178788 . A <NON_REF> . . END=201178788 GT:DP:GQ:MIN_DP:PL 0/0:49:0:49:0,0,1040

And in the joint genotyping file:

1 201178788 . A G 176.93 . AC=1;AF=0.167;AN=6;BaseQRankSum=-1.949e+00;ClippingRankSum=1.17;DP=111;FS=11.252;GQ_MEAN=67.33;GQ_STDDEV=116.62;MLEAC=2;MLEAF=0.333;MQ=56.41;MQ0=0;MQRankSum=-1.772e+00;NCC=0;QD=5.20;ReadPosRankSum=-1.701e+00 GT:AD:DP:GQ:PL 0/0:28,0:28:0:0,0,449 0/1:21,13:34:99:202,0,560 0/0:49,0:49:0:0,0,1040

And in the PhaseByTransmission file:

1 201178788 . A G 176.93 . AC=1;AF=0.167;AN=6;BaseQRankSum=-1.949e+00;ClippingRankSum=1.17;DP=111;FS=11.252;GQ_MEAN=67.33;GQ_STDDEV=116.62;MLEAC=2;MLEAF=0.333;MQ=56.41;MQ0=0;MQRankSum=-1.772e+00;NCC=0;QD=5.20;ReadPosRankSum=-1.701e+00 GT:AD:DP:GQ:PL:TP 0|0:28,0:28:0:0,0,449:1 0|1:21,13:34:99:202,0,560:1 0|1:49,0:49:0:0,0,1040:1

I know that I should not expect the per-allele depths to match perfectly between tools, and for the father they are close enough for me (IGV: 22/18, HC: 21/13).

However, HaplotypeCaller reports 49/0 for the mother and 28/0 for the daughter, whereas IGV shows 41/23 and 22/15.
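
For reference, here is a minimal Python sketch of how I read the per-sample numbers out of the joint record above. The sample order (daughter, father, mother) is an assumption: I inferred it by matching the DP values to the per-sample g.vcf lines. The helper names are hypothetical, and the GQ rule used (difference between the two lowest PL values, capped at 99) is the usual convention as I understand it, not something I have checked in the GATK source.

```python
# Joint-genotyped record copied from the post. Columns are split on any
# whitespace here because the forum paste lost the original tabs.
RECORD = (
    "1 201178788 . A G 176.93 . "
    "AC=1;AF=0.167;AN=6;BaseQRankSum=-1.949e+00;ClippingRankSum=1.17;DP=111;"
    "FS=11.252;GQ_MEAN=67.33;GQ_STDDEV=116.62;MLEAC=2;MLEAF=0.333;MQ=56.41;"
    "MQ0=0;MQRankSum=-1.772e+00;NCC=0;QD=5.20;ReadPosRankSum=-1.701e+00 "
    "GT:AD:DP:GQ:PL "
    "0/0:28,0:28:0:0,0,449 0/1:21,13:34:99:202,0,560 0/0:49,0:49:0:0,0,1040"
)

# Assumed sample order (not stated in the record itself).
SAMPLES = ["daughter", "father", "mother"]


def parse_samples(line, names):
    """Map each sample name to a dict of FORMAT key -> raw string value."""
    fields = line.split()
    keys = fields[8].split(":")  # FORMAT column, e.g. GT:AD:DP:GQ:PL
    return {
        name: dict(zip(keys, column.split(":")))
        for name, column in zip(names, fields[9:])
    }


def gq_from_pl(pl_string, cap=99):
    """Gap between the two lowest PL values, capped (assumed GQ convention)."""
    pls = sorted(int(x) for x in pl_string.split(","))
    return min(pls[1] - pls[0], cap)


calls = parse_samples(RECORD, SAMPLES)
for name, call in calls.items():
    ref_count, alt_count = (int(x) for x in call["AD"].split(","))
    print(name, call["GT"], f"ref={ref_count}", f"alt={alt_count}",
          f"GQ={call['GQ']}", f"GQ_from_PL={gq_from_pl(call['PL'])}")
```

Read this way, the mother and the daughter both have PL values of 0 for 0/0 and for 0/1, so their hom-ref calls carry GQ=0 and are not confident calls.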

Could it be that reads supporting the alt allele have bad qualities (to the point of having 0 alt in the end)?

Many thanks in advance!
