
Hello I'm a developer in Korea

sangrholee Posts: 7 Member

I'm a developer in Korea.
Recently, I have been developing a bioinformatics pipeline.
I'm using BWA, Samtools, Picard, and GATK, and I want to run these tools on Hadoop, because I expect MapReduce (MR) to be more efficient in terms of speed and memory.
As I understand it, GATK is built on MR. If so, have you tested GATK on MR?
In theory, that should be more efficient than running GATK alone.

Also, if GATK needs an indexed and sorted SAM file, can I just do the sorting and indexing with the hadoop-BAM library?

Because I am a novice in bioinformatics, this issue is too complicated for me.


e-mail :
phone : +821027266808


  • Geraldine_VdAuwera Posts: 10,557 Administrator, Dev admin

    Hi there,

    I'm sorry, but I don't understand what your question is. When you say MR, do you mean Map-Reduce? If so, yes, GATK is built on a Map-Reduce strategy. I cannot give you any information on how to use it with Hadoop, however -- that is outside the scope of the support we provide.

    Let me know if you have any specific questions about how to use the GATK.

    Geraldine Van der Auwera, PhD

  • sangrholee Posts: 7 Member

    Thank you for your reply.
    What I mean is that, as I put it, GATK's architecture is built on MR internally. If so, can GATK also be run on MR externally (that is, GATK itself run by MR)?
    In that case, there would be two levels of MR processing.
    So we can compare two cases:
    First: just use GATK.
    Second: use GATK on MR.
    My question is: is the second case possible? If so, could you try it?

  • Carneiro Posts: 274 Administrator, Dev admin

    Yes, the GATK uses a map-reduce-like framework externally through Queue.

  • sangrholee Posts: 7 Member

    Thank you for replying.
    Hmm, my question is: can GATK run on MR? (Yes, MR means MapReduce, i.e. distributed computing.)
    The difference from the existing GATK is that GATK uses an MR-like framework internally but does not itself run on MR.
    GATK is already efficient, but I guess GATK on MR would be more efficient, wouldn't it?

  • Carneiro Posts: 274 Administrator, Dev admin

    GATK has several levels of parallelism. Read up on this document :
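
For readers following the map-reduce discussion in this thread, here is a minimal Python sketch of the general scatter, map, and reduce pattern. It is only an illustration under stated assumptions: it is not taken from the document referenced above, it does not use GATK, Queue, or Hadoop, and the toy read data and every function name in it (`scatter`, `map_shard`, `reduce_counts`, `count_reads_per_contig`) are hypothetical.

```python
from collections import Counter
from functools import reduce

# Toy reads as (contig, position) pairs. Hypothetical data, not real alignments.
READS = [("chr1", 100), ("chr1", 250), ("chr2", 40), ("chr1", 900), ("chr2", 75)]


def scatter(reads, n_shards):
    """Scatter step: split the reads into n_shards chunks (round-robin)."""
    return [reads[i::n_shards] for i in range(n_shards)]


def map_shard(shard):
    """Map step: count reads per contig within a single shard."""
    return Counter(contig for contig, _pos in shard)


def reduce_counts(left, right):
    """Reduce step: merge two partial per-contig counts into one."""
    return left + right


def count_reads_per_contig(reads, n_shards=2):
    """Run scatter, map, and reduce in sequence and return a plain dict."""
    partials = [map_shard(shard) for shard in scatter(reads, n_shards)]
    return dict(reduce(reduce_counts, partials, Counter()))


if __name__ == "__main__":
    print(count_reads_per_contig(READS))
```

Each shard is processed independently in the map step, which is what would allow the shards to be handed to separate threads or machines; here they run one after another for simplicity. The result does not depend on the number of shards chosen.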
