
Detecting SNVs in human populations

Hello Geraldine,

First, thank you very much for your amazing work on this forum. My project deals with discovering rare population-specific variants in human exomes, and I would like to know how the VQSR step would affect the discovery of these variants. Is it better to perform VQSR on all the populations together (420 individuals, but with a risk of filtering out true rare population-specific variants) or to run it per population (between 30 and 100 individuals each, but I have read that VQSR loses power with a reduced number of samples)?

Thank you for your help,


Best Answer


  • tommycarstensen, United Kingdom, Member

    @Lopez_Marie I had a very similar case with 420 samples distributed across 4 populations. I tried various combinations and found that calling and filtering the populations together yielded slightly better results in terms of sensitivity and specificity. I agree with the recommendation from @Geraldine_VdAuwera.

  • Many thanks to both of you for your quick support! Have a great day.
