Why does GATK LeftAlignAndTrimVariants set a missing genotype to 0/0?

Totti (Japan), Member
edited October 2016 in Ask the GATK team

Hi. Thank you for all your help.

I have a VCF file (a.vcf) that contains a single variant record. Some samples have the missing genotype "./." because their DP is 0. The variant is tri-allelic, as shown below.

"a.vcf"

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample131 sample138 sample908
chr12 104350956 . G T,A 147880 PASS . GT:AD:DP:GQ ./.:0,0:0 0/1:25,22,0:47:99 0/0:36,0,0:36:99 ./.:0,0:0

I want to split the tri-allelic record into bi-allelic records, so I ran the following GATK command.

java -jar GenomeAnalysisTK.jar \
-T LeftAlignAndTrimVariants \
-R ${ref_path} \
--variant a.vcf \
-o b.vcf \
--splitMultiallelics \
--reference_window_stop 900

As a result, I got b.vcf.

"b.vcf"

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample131 sample138 sample908
chr12 104350956 . G T 147880 PASS AC=1;AF=0.250;AN=4 GT:AD:DP:GQ 0/0:.:0 0/1:25,22:47:99 0/0:36,0:36:99 0/0:.:0
chr12 104350956 . G A 147880 PASS AC=0;AF=0.00;AN=4 GT:AD:DP:GQ 0/0:.:0 0/0:25,0:47:99 0/0:36,0:36:99 0/0:.:0

In b.vcf, the two split records are bi-allelic, but the missing genotypes have been set to "0/0". I want the missing genotypes to stay missing ("./.") after GATK processing.

How should I process the file?

GATK's version is 3.6.
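
One workaround I can think of is to post-process b.vcf and set GT back to "./." for every sample whose DP is 0. Below is a minimal sketch in Python, not a GATK option. The function name `restore_missing_genotypes` and the "DP=0 means no call" rule are my own assumptions; it also assumes tab-delimited columns and diploid, unphased samples, and it leaves the INFO column untouched.

```python
import sys


def restore_missing_genotypes(line: str) -> str:
    """Set GT to ./. for each sample whose DP is 0 (assumed to mean 'no call')."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return line  # no FORMAT/sample columns to fix
    keys = fields[8].split(":")
    if "GT" not in keys or "DP" not in keys:
        return line
    gt_i, dp_i = keys.index("GT"), keys.index("DP")
    for col in range(9, len(fields)):
        values = fields[col].split(":")
        # Trailing FORMAT fields may be dropped, so check the length first.
        if max(gt_i, dp_i) < len(values) and values[dp_i] == "0":
            values[gt_i] = "./."  # assumption: diploid, unphased
            fields[col] = ":".join(values)
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python restore_missing.py b.vcf c.vcf
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        for record in src:
            dst.write(restore_missing_genotypes(record))
```

With the b.vcf records above, this would turn `0/0:.:0` into `./.:.:0` for sample1 and sample908 and leave sample131 and sample138 unchanged. The caveat is that a DP of 0 does not prove the genotype was "./." in the input, so this only works when that holds for the data, as it does in a.vcf.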
