Why does GATK LeftAlignAndTrimVariants set a missing genotype to 0/0?

TottiTotti, Japan, Member
edited October 2016 in Ask the GATK team

Hi. Thank you for all your help.

I have a VCF file (a.vcf) that contains a single variant record. Some samples have missing genotypes ("./.") because their depth is zero (DP=0). The variant is tri-allelic, as shown below.

"a.vcf"

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample131 sample138 sample908
chr12 104350956 . G T,A 147880 PASS . GT:AD:DP:GQ ./.:0,0:0 0/1:25,22,0:47:99 0/0:36,0,0:36:99 ./.:0,0:0

I wanted to split the tri-allelic record into bi-allelic records, so I ran the following GATK command.

java -jar GenomeAnalysisTK.jar \
-T LeftAlignAndTrimVariants \
-R ${ref_path} \
--variant a.vcf \
-o b.vcf \
--splitMultiallelics \
--reference_window_stop 900

As a result, I got b.vcf.

"b.vcf"

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT sample1 sample131 sample138 sample908
chr12 104350956 . G T 147880 PASS AC=1;AF=0.250;AN=4 GT:AD:DP:GQ 0/0:.:0 0/1:25,22:47:99 0/0:36,0:36:99 0/0:.:0
chr12 104350956 . G A 147880 PASS AC=0;AF=0.00;AN=4 GT:AD:DP:GQ 0/0:.:0 0/0:25,0:47:99 0/0:36,0:36:99 0/0:.:0
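To make the change concrete, here is a minimal Python sketch that compares the first record of b.vcf with the original record from a.vcf and lists the sample columns whose genotype went from missing to called. It is not part of GATK; the function names (`genotypes`, `is_missing`, `lost_missing_calls`) are hypothetical, and it splits on whitespace only because the records are pasted that way here (a real VCF is tab-delimited).

```python
def genotypes(vcf_line):
    """Return the GT value of every sample column in one VCF data line."""
    # Whitespace split is enough for the pasted records; real VCF uses tabs.
    fields = vcf_line.split()
    gt_index = fields[8].split(":").index("GT")
    return [sample.split(":")[gt_index] for sample in fields[9:]]


def is_missing(gt):
    """True if every allele in the genotype is '.', e.g. './.' or '.|.'."""
    alleles = gt.replace("|", "/").split("/")
    return all(allele == "." for allele in alleles)


def lost_missing_calls(before_line, after_line):
    """0-based sample column indices that were missing before but called after."""
    before = genotypes(before_line)
    after = genotypes(after_line)
    return [
        i
        for i, (b, a) in enumerate(zip(before, after))
        if is_missing(b) and not is_missing(a)
    ]


# Records copied from a.vcf and b.vcf above.
A_LINE = ("chr12 104350956 . G T,A 147880 PASS . GT:AD:DP:GQ "
          "./.:0,0:0 0/1:25,22,0:47:99 0/0:36,0,0:36:99 ./.:0,0:0")
B_LINE = ("chr12 104350956 . G T 147880 PASS AC=1;AF=0.250;AN=4 GT:AD:DP:GQ "
          "0/0:.:0 0/1:25,22:47:99 0/0:36,0:36:99 0/0:.:0")

print(lost_missing_calls(A_LINE, B_LINE))  # prints [0, 3]
```

Indices 0 and 3 correspond to sample1 and sample908, the two samples with DP=0.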

In b.vcf, the tri-allelic record was split into two bi-allelic records as expected, but the missing genotypes were set to "0/0". I would like the missing genotypes to remain "./." after GATK processing.

How should I process the file?

I am using GATK version 3.6.
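For reference, the result I am after could be approximated by post-processing b.vcf, as in the rough Python sketch below. This is only an illustration under stated assumptions, not a GATK feature: the function name `mask_zero_depth` is hypothetical, it assumes diploid samples, it treats DP of 0 (or a missing DP) as "no call", and it does not recompute AC/AF/AN in the INFO column, so those values would remain stale.

```python
def mask_zero_depth(vcf_line, sep="\t"):
    """Reset GT to './.' for every sample whose DP is 0 or missing.

    Header lines and records without both GT and DP in FORMAT are
    returned unchanged. INFO annotations (AC/AF/AN) are NOT updated.
    """
    line = vcf_line.rstrip("\n")
    if line.startswith("#"):
        return line
    fields = line.split(sep)
    keys = fields[8].split(":")
    if "GT" not in keys or "DP" not in keys:
        return line
    gt_i, dp_i = keys.index("GT"), keys.index("DP")
    for col in range(9, len(fields)):
        values = fields[col].split(":")
        # Trailing FORMAT fields may be omitted, so guard the index.
        dp = values[dp_i] if dp_i < len(values) else "."
        if dp in ("0", "."):
            values[gt_i] = "./."  # assumption: diploid no-call
            fields[col] = ":".join(values)
    return sep.join(fields)


# First record of b.vcf, rebuilt with tab separators.
record = "\t".join([
    "chr12", "104350956", ".", "G", "T", "147880", "PASS",
    "AC=1;AF=0.250;AN=4", "GT:AD:DP:GQ",
    "0/0:.:0", "0/1:25,22:47:99", "0/0:36,0:36:99", "0/0:.:0",
])
fixed = mask_zero_depth(record).split("\t")
print(fixed[9:])
```

With this sketch, sample1 and sample908 become "./.:.:0" while sample131 and sample138 are left untouched. Whether DP=0 is always a safe proxy for "originally missing" is an assumption on my part.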
