Alternative resources for Mutect2/GetPileupSummaries when dealing with genome data


I'm currently using GATK, following the best practices for somatic variant calling. I already have this set up for exomes, but I'm now attempting to run the same pipeline on genome data.

When running exomes, I use the gnomAD "af-only-gnomad.raw.sites.b37.vcf" file as the germline resource for Mutect2, and the "small_exac_common_3_b37.vcf.gz" file as both the -V and -L arguments for GetPileupSummaries (both files are from the bundle). Are these the correct resources for genomes as well, or is there another resource you recommend?




  • akovalsk Member, Broadie, Moderator admin

    Hi @michaelmc, thanks for your question.

    Those should be appropriate, but in general we recommend carefully considering the intervals (or the subsetting of the variant file), as we expect off-target regions to be poorly or even erroneously covered.

  • Thanks for the answer - I was asking mainly because it was recommended to use the same file (in most cases) for both -V and -L options in GetPileupSummaries in this thread:


    Is the ExAC resource suitable for genomes in this case? Or would gnomAD or another resource that incorporates whole genome data be appropriate?
  • akovalsk Member, Broadie, Moderator admin

    Hi @michaelmc, yes, that resource is suitable.
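
For anyone landing on this thread later, the setup discussed above could be sketched roughly as follows. This is only an illustration, not an official recipe: the two resource file names are the ones mentioned in the question, while the reference, BAM, and output names are placeholders, and the helper functions are hypothetical. The script only builds and prints the command lines; it does not run GATK.

```python
import shlex

# Resource files named in the thread (both from the GATK bundle).
GERMLINE_RESOURCE = "af-only-gnomad.raw.sites.b37.vcf"
COMMON_SITES = "small_exac_common_3_b37.vcf.gz"


def mutect2_cmd(reference, tumor_bam, out_vcf, germline_resource=GERMLINE_RESOURCE):
    """Build a tumor-only Mutect2 command line (hypothetical helper).

    Note: some older GATK 4.0.x releases may also expect the tumor sample
    name to be passed explicitly; that is not included in this sketch.
    """
    return [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", tumor_bam,
        "--germline-resource", germline_resource,
        "-O", out_vcf,
    ]


def get_pileup_summaries_cmd(tumor_bam, out_table, sites=COMMON_SITES, intervals=None):
    """Build a GetPileupSummaries command line (hypothetical helper).

    By default the same common-variant file is passed to both -V and -L,
    as described in the question. Pass `intervals` to restrict the sites
    further, e.g. to regions you trust to be well covered.
    """
    return [
        "gatk", "GetPileupSummaries",
        "-I", tumor_bam,
        "-V", sites,
        "-L", intervals if intervals is not None else sites,
        "-O", out_table,
    ]


def to_shell(cmd):
    """Render an argument list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # Placeholder input/output names; replace with your own files.
    print(to_shell(mutect2_cmd("reference.fasta", "tumor.bam", "unfiltered.vcf.gz")))
    print(to_shell(get_pileup_summaries_cmd("tumor.bam", "pileups.table")))
```

Passing a separate `intervals` value is one way to act on the advice about considering the intervals carefully; whether that is needed for a given genome dataset is a judgment call.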
