Recommendation on performance using scatter/gather

Laurent Member, Broadie ✭✭
edited April 2013 in Ask the GATK team

Dear all,

I am currently running an analysis using the HaplotypeCaller on 300 large BAM files on our cluster, and I decided to chunk the genome into 3 Mb bins so that each one is processed in a decent time. However, I'm experiencing very long runtimes as more and more jobs get scheduled to run in parallel on the same files. Looking at the GATK options, I saw two that I thought could help, num_bam_file_handles and --read_buffer_size, and was wondering what the recommendations are for using them.

More precisely, does num_bam_file_handles increase processing time by a lot? And what is the default value for --read_buffer_size?

Thanks a lot,


Best Answer


  • Laurent Member, Broadie ✭✭

    Hi Mark,

    Thanks for the explanation! I will dig further into the problem, but at the moment what I am reporting is only an "observation" that my runtimes get extremely high when running in parallel. I have observed similar problems in the past when running other walkers with scatter/gather on our cluster. I'll try extracting the regions to a local scratch disk with PrintReads beforehand and let you know if this helps! I'll also use the nightly build to benefit from the latest improvements.
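
    For readers who want to try the same approach, here is a minimal sketch of the two steps mentioned in this thread: splitting a contig into 3 Mb bins, and building one PrintReads command per bin so that each job's reads can be copied to local scratch first. It assumes GATK 3 style command-line syntax (-T, -R, -I, -L, -o). The jar location, file paths, contig name and length, and the function names are all hypothetical placeholders, and the commands are only built as lists, not executed.

    ```python
    from typing import Iterator, List, Tuple

    BIN_SIZE = 3_000_000  # 3 Mb bins, as described in the question


    def make_bins(contig_length: int, bin_size: int = BIN_SIZE) -> Iterator[Tuple[int, int]]:
        """Yield 1-based, inclusive (start, end) bins that cover a contig."""
        for start in range(1, contig_length + 1, bin_size):
            yield start, min(start + bin_size - 1, contig_length)


    def print_reads_command(contig: str, start: int, end: int, bam: str,
                            reference: str, out_bam: str) -> List[str]:
        """Build (but do not run) a PrintReads command for one bin of one BAM.

        Assumes GATK 3 style arguments; the jar path is a placeholder.
        """
        return [
            "java", "-jar", "GenomeAnalysisTK.jar",  # hypothetical jar location
            "-T", "PrintReads",
            "-R", reference,
            "-I", bam,
            "-L", f"{contig}:{start}-{end}",
            "-o", out_bam,
        ]


    if __name__ == "__main__":
        # Hypothetical inputs, for illustration only.
        contig, length = "20", 7_000_000
        for start, end in make_bins(length):
            cmd = print_reads_command(
                contig, start, end,
                bam="sample1.bam",
                reference="reference.fasta",
                out_bam=f"/scratch/sample1.{contig}_{start}_{end}.bam",
            )
            print(" ".join(cmd))
    ```

    Each printed command could then be submitted as its own cluster job, with the downstream HaplotypeCaller job reading the small per-bin BAM from scratch instead of the shared original. Whether this actually reduces contention will depend on the cluster's storage setup.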
