The current GATK version is 3.7-0

Should the ploidy setting for pooled sequence data depend on depth?

marakat Member Posts: 5


I have three pools of HiSeq data with N = 20, 20, and 40 individuals per pool, each sequenced to ~10x depth. I would like to use the ploidy setting to estimate probable genotypes for each pool, but I'm torn: it doesn't seem correct to estimate 20 or 40 genotypes from pools that have only been sequenced to 10x depth (at most 10 chromosomes could have been sampled at any site).
So in such a case, would it be more advisable to set the ploidy to match the depth of 10 and estimate 5 genotypes per pool?

Thank you very much!



  • delangel Dev Posts: 71

    The fundamental problem is that you don't have enough coverage to reliably estimate allelic fractions in your pools. With 10x and 20 individuals you only have 0.5x per individual, so you'll have no power to detect low-frequency variation. In theory you should set your ploidy to 20 or 40 so that you at least get mathematically accurate measurements of GQ and QUAL, but you'll only get sensible results for common variants present in a large fraction of the individuals in the pool.
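
To put rough numbers on the coverage argument above, here is a minimal Python sketch. It is not GATK's genotyping model, only back-of-the-envelope arithmetic under stated assumptions: individuals are diploid, each read is drawn uniformly and independently from the chromosomes in the pool, and there are no sequencing errors. The function names are hypothetical and chosen just for this illustration.

```python
def per_individual_depth(pool_depth, n_individuals):
    """Average coverage each individual contributes to the pool."""
    return pool_depth / n_individuals


def expected_distinct_chromosomes(pool_depth, n_chromosomes):
    """Expected number of different chromosomes hit by pool_depth reads.

    Assumes reads are sampled uniformly with replacement from the pool,
    so the same chromosome can be read more than once.
    """
    return n_chromosomes * (1 - (1 - 1 / n_chromosomes) ** pool_depth)


def prob_allele_seen(allele_count, n_chromosomes, pool_depth):
    """Probability that at least one read carries the allele.

    allele_count is the number of chromosomes in the pool carrying it.
    """
    freq = allele_count / n_chromosomes
    return 1 - (1 - freq) ** pool_depth


if __name__ == "__main__":
    depth = 10
    for n_individuals in (20, 40):
        n_chrom = 2 * n_individuals  # assumption: diploid individuals
        print(f"pool of {n_individuals}:",
              f"{per_individual_depth(depth, n_individuals):.2f}x per individual,",
              f"~{expected_distinct_chromosomes(depth, n_chrom):.1f} distinct chromosomes sampled,",
              f"P(see singleton) = {prob_allele_seen(1, n_chrom, depth):.2f},",
              f"P(see 50% allele) = {prob_allele_seen(n_chrom // 2, n_chrom, depth):.3f}")
```

Under these assumptions a variant carried by a single chromosome is usually not observed at all at 10x, while a variant at 50% frequency almost always is, which matches the point that only common variants give sensible results.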
