Single-sample or joint calling in an experiment with varying sample sizes
I have genome sequences from 3 closely related non-model species (same genus), and I want to do SNP calling and then look at the site frequency spectrum, genetic diversity, etc. However, the sample sizes are quite variable (30 vs 10 vs 8). Moreover, the species with 30 samples is also the species for which the reference genome is available. Consequently I'm a bit worried about doing joint calling, because singletons are typically undercalled, and I imagine variants will be better detected in the species with 30 samples. Would you agree that single-sample calling in HaplotypeCaller is probably the best approach here? Someone suggested joint calling across all the species, but that seems like a bad idea to me, and I have no idea what biases it would introduce.