Information on the I/O libraries used in the GATK4 source code, for runtime I/O optimization

Dear GATK Team,

I want to perform I/O-level runtime optimization for GATK4 in a distributed environment. For this, I need to know which I/O libraries the GATK4 modules use. I could not find any material or relevant posts on this for any of the GATK4 modules in the forum.

Kindly inform or redirect me to relevant resources for the same.

Thank you

Regards

Abhishek Panda

Best Answer

  • cnorman, United States (Member, Broadie, Dev) ✭✭
    Accepted Answer

    @abhishekpanda GATK uses quite a few different methods and libraries to access data, depending on the tool, the type of data, the data store, and the kind of storage and file system on which the data resides (local disk, cloud storage, and so on). I don't think there is a simple, single answer. All of the code is open source, though, and the dependent libraries are listed in the Gradle build file.
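
As a starting point for the search suggested above, here is a minimal sketch of how one might pull dependency coordinates out of a Gradle build file and narrow them to likely I/O-related libraries. It is only an illustration: the regular expression assumes dependencies are declared as plain `group:artifact:version` strings, the function names are hypothetical, and the keyword list is a guess rather than an authoritative list of what GATK uses for I/O.

```python
import re

# Matches simple Gradle dependency declarations such as:
#   implementation 'group:artifact:version'
#   testImplementation "group:artifact:$someVersion"
#   implementation('group:artifact:version') { ... }
# Assumption: coordinates are written as one quoted string; map-style
# declarations (group: '...', name: '...') are not handled.
DEP_PATTERN = re.compile(
    r"""^\s*(\w+)\s*\(?\s*['"]([^'":]+):([^'":]+)(?::([^'"]+))?['"]""",
    re.MULTILINE,
)


def list_dependencies(gradle_text):
    """Return (configuration, group, artifact, version) tuples found in the text."""
    return [m.groups() for m in DEP_PATTERN.finditer(gradle_text)]


def filter_io_related(deps, keywords=("htsjdk", "hadoop", "nio", "spark", "storage")):
    """Keep dependencies whose group or artifact contains any keyword.

    The default keywords are an illustrative guess, not a definitive list.
    """
    selected = []
    for dep in deps:
        _, group, artifact, _ = dep
        name = f"{group}:{artifact}".lower()
        if any(keyword in name for keyword in keywords):
            selected.append(dep)
    return selected


if __name__ == "__main__":
    # Assumes the script is run from a checkout that contains build.gradle.
    with open("build.gradle", encoding="utf-8") as handle:
        for dep in filter_io_related(list_dependencies(handle.read())):
            print(dep)
```

Gradle's built-in `dependencies` task reports the fully resolved dependency tree, including transitive libraries, so a text scan like this is only a quick first pass.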

Answers

  • Thank you, @cnorman. I will look into the Gradle file.

    For the Lustre parallel file system, do you have any I/O optimization suggestions for making GATK more performant?