Is UnifiedGenotyper actually better than HaplotypeCaller for this pooled sample project?

Hi, I am interested in calling variants from pooled samples. Specifically, I want to estimate SNP allele frequencies from samples made by pooling many individuals (1000+) together. I know that HaplotypeCaller is now recommended over UnifiedGenotyper in all cases. Is this project an exception, though? Here are the details:

  • Each pooled sample contains 1000s of individuals
  • Every site has only two possible alleles
  • I only need to call SNPs
  • I can generate a set of known SNPs to call (does GENOTYPE_GIVEN_ALLELES work in HaplotypeCaller?)
  • I have high read coverage
  • I want to detect rare alleles as well as possible

If you still advise using HaplotypeCaller in this case, do you have any special suggestions? I'd like to maximize the -ploidy number to detect the rare alleles, but otherwise streamline the job. Thanks for any advice you can provide!
