Information on the I/O libraries used in the GATK4 source code, for runtime I/O optimization

Dear GATK Team,

I want to perform I/O-level runtime optimization for GATK4 in a distributed environment. To do this, I need to know which I/O libraries the GATK4 modules use. I could not find any material or relevant posts about this for any of the GATK4 modules in the forum.

Kindly inform or redirect me to relevant resources for the same.

Thank you

Regards

Abhishek Panda

Answers

  • cnorman, United States (Member, Broadie, Dev) ✭✭
    Accepted Answer

    @abhishekpanda GATK uses quite a few different methods and libraries to access data, depending on the tool, the type of data, the data store, and the type of file system in which the data resides (local disk storage, cloud storage, file system type). I don't think there is a simple, single answer. All of the code is open source, though, and the dependent libraries are listed in the Gradle build file.
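
As a rough illustration of the pointer to the Gradle file above, here is a minimal Python sketch that scans the text of a Gradle build file for dependency declarations of the form `configuration 'group:artifact:version'`. The function name `list_dependencies`, the regular expression, and the sample coordinates under `org.example` are all hypothetical and are not taken from the GATK build file. It assumes dependencies are declared in the common single-string notation, so it will miss map-style declarations and anything pulled in transitively.

```python
import re

# Matches common Gradle dependency declarations, for example:
#   implementation 'group:artifact:version'
#   implementation("group:artifact:version")
# The version part is optional because it may be set elsewhere in the build.
DEP_PATTERN = re.compile(
    r"""^\s*(\w+)\s*\(?\s*['"]([^'":\s]+):([^'":\s]+)(?::([^'"\s]+))?['"]""",
    re.MULTILINE,
)


def list_dependencies(gradle_text):
    """Return (configuration, group, artifact, version) tuples found in the text."""
    return [m.groups() for m in DEP_PATTERN.finditer(gradle_text)]


# Hypothetical sample input; these coordinates are placeholders,
# not real GATK dependencies.
SAMPLE = '''
dependencies {
    implementation 'org.example:example-io:1.2.3'
    implementation("org.example:example-cloud-nio:0.9")
    testImplementation 'org.example:example-test'
}
'''

if __name__ == "__main__":
    for config, group, artifact, version in list_dependencies(SAMPLE):
        label = version or "(version set elsewhere)"
        print(f"{config:20s} {group}:{artifact} {label}")
```

To try this on a real checkout, read the build file's contents and pass them to `list_dependencies` in place of `SAMPLE`. For the fully resolved set, including transitive dependencies, Gradle's built-in `dependencies` task (run as `./gradlew dependencies` in a project that uses the Gradle wrapper) is the more reliable source than a text scan like this one.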

  • abhishekpanda, India (Member)
    Thank you @cnorman. I will look into the Gradle file.

    For the Lustre parallel file system, do you have any I/O optimization suggestions for making GATK performant?