VCF file from a pooled sample
Hello,
I am having difficulty understanding the VCF output of HaplotypeCaller when I run it on a pool of two individuals. I set the ploidy to 4 in this case.
When ploidy is 2, one gets the heterozygous genotype as 0/1, which is understandable. With ploidy set to 4, how can I interpret a genotype such as (GT:AD:DP:GQ:PL 0/0/0/1:8,3:11:5:95,0,5,24,232), given that this is a pooled sample and the individual genotypes cannot be known? I was just wondering what this means.
Thank you,
Homa
Best Answer

Geraldine_VdAuwera Cambridge, MA admin
You have to break it down into the possible genotype combinations. Something like 0/0/0/1 is easy: assuming diploid organisms, it means you have one hom-ref individual and one heterozygote. Where it gets tricky is with something like 0/0/1/1, because you don't know whether you have two heterozygotes or one hom-ref and one hom-var individual.
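To make the "break it down" step concrete, here is a minimal Python sketch that lists every way a pooled GT could be split into per-individual diploid genotypes. The function name `diploid_splits` and its `n_individuals` parameter are made up for illustration and are not part of GATK; the sketch assumes every individual in the pool is diploid.

```python
from itertools import combinations_with_replacement


def diploid_splits(pooled_gt, n_individuals=2):
    """List the unordered sets of diploid genotypes that add up to a pooled GT.

    Hypothetical helper for illustration only; assumes diploid individuals.
    """
    alleles = sorted(pooled_gt.replace("|", "/").split("/"))
    if len(alleles) != 2 * n_individuals:
        raise ValueError("pooled ploidy must be 2 x number of individuals")

    # Every diploid genotype that can be built from the alleles seen in the pool.
    distinct = sorted(set(alleles))
    diploids = list(combinations_with_replacement(distinct, 2))

    splits = set()
    # Try every unordered assignment of diploid genotypes to the individuals
    # and keep those whose alleles add up to the pooled call.
    for combo in combinations_with_replacement(diploids, n_individuals):
        if sorted(a for genotype in combo for a in genotype) == alleles:
            splits.add(tuple("/".join(genotype) for genotype in combo))
    return sorted(splits)


print(diploid_splits("0/0/0/1"))  # one possibility: hom-ref + het
print(diploid_splits("0/0/1/1"))  # two possibilities: hom-ref + hom-var, or het + het
```

The output only says which combinations are compatible with the pooled call. It cannot tell you which individual carries which genotype, and for 0/0/1/1 it cannot tell the two combinations apart.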
Answers
@Homa the ploidy is 2 for each of your samples. You should set the ploidy to 2 (the default) irrespective of whether you have 1, 2 or 1000 samples.
@tommycarstensen, this is a pool of 2 individuals, so the ploidy should be set to 4.
Cool. My bad. I misunderstood.
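For anyone landing here later, the rest of the sample field quoted in the question can be unpacked in the same spirit. The sketch below is only an illustration: `parse_sample` and `pl_by_genotype` are hypothetical helper names, and the PL labelling assumes a biallelic site, where the VCF genotype ordering reduces to "number of ALT alleles" (so ploidy 4 gives five PL values).

```python
def parse_sample(fmt, sample):
    """Pair FORMAT keys with the sample's values (hypothetical helper)."""
    return dict(zip(fmt.split(":"), sample.split(":")))


def pl_by_genotype(pl, ploidy):
    """Label each PL with its genotype. Biallelic sites only (assumption)."""
    values = [int(v) for v in pl.split(",")]
    if len(values) != ploidy + 1:
        raise ValueError("expected ploidy + 1 PL values at a biallelic site")
    # Index k corresponds to the genotype with k copies of the ALT allele.
    return {
        "/".join(["0"] * (ploidy - k) + ["1"] * k): value
        for k, value in enumerate(values)
    }


fields = parse_sample("GT:AD:DP:GQ:PL", "0/0/0/1:8,3:11:5:95,0,5,24,232")
pls = pl_by_genotype(fields["PL"], ploidy=4)

# AD holds the read counts supporting REF and ALT, in that order.
ref_reads, alt_reads = (int(x) for x in fields["AD"].split(","))
alt_fraction = alt_reads / (ref_reads + alt_reads)

for genotype, value in pls.items():
    print(genotype, value)
print("ALT read fraction:", round(alt_fraction, 3))
```

Read this way, the called genotype 0/0/0/1 is the one with PL 0 (the most likely), and 3 of the 11 reads support ALT, which is close to the 1 in 4 you would expect from a single ALT copy in the pool. The GQ of 5 matches the gap to the next-best genotype, 0/0/1/1 (PL 5), so the call is not a confident one.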