The current GATK version is 3.7-0

Hello I'm a developer in Korea

I'm a developer in Korea.
Recently, I have been developing a bioinformatics pipeline.
I'm using BWA, Samtools, Picard, and GATK, and I want to run this pipeline on Hadoop. The reason is that MapReduce (MR) should be efficient in terms of speed and memory.
As far as I know, GATK is built on MR. If so, have you tested GATK on MR?
In theory, that should be more efficient than GATK alone.

Also, if GATK needs a sorted and indexed SAM file, can I just do the sorting and indexing with the Hadoop-BAM library?

Because I am a novice in bioinformatics, this issue is too complicated for me.




  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi there,

    I'm sorry, but I don't understand what your question is. When you say MR, do you mean Map-Reduce? If so, yes, GATK is built on a Map-Reduce strategy. However, I cannot give you any information on how to use it with Hadoop; that is outside the scope of the support we provide.

    Let me know if you have any specific questions about how to use the GATK.
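
For readers unfamiliar with the term, the Map-Reduce strategy mentioned in this reply can be illustrated with a minimal Python sketch. This is not GATK code: the function names (`map_locus`, `reduce_counts`) and the toy pileup data are hypothetical, chosen only to show a per-locus map step followed by a reduce step that accumulates the results.

```python
from functools import reduce

# Hypothetical toy data: (contig, position, number of reads covering the position).
pileups = [("chr1", 100, 12), ("chr1", 101, 15), ("chr2", 50, 0), ("chr2", 51, 7)]

def map_locus(pileup):
    """Map step: compute an independent value for one locus (here: is it covered?)."""
    contig, position, depth = pileup
    return 1 if depth > 0 else 0

def reduce_counts(total, value):
    """Reduce step: fold one map result into the running total."""
    return total + value

# Apply the map step to every locus, then reduce the results to a single number.
covered_loci = reduce(reduce_counts, map(map_locus, pileups), 0)
print(covered_loci)
```

Because each map call looks at one locus in isolation, the map step could in principle be distributed across workers, with only the reduce step needing to combine results.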

  • Hello,
    Thank you for your reply.
    What I meant is that the architecture of GATK is built on MR internally. If so, can GATK also be run on MR externally (that is, GATK launched by MR)?
    In that case, there would be two levels of MR processing.
    So we can compare two cases:
    First: just use GATK.
    Second: use GATK on MR.
    My question is whether the second case is possible. If so, could you try it?

  • Carneiro (Charlestown, MA; Member)

    Yes, the GATK uses a map-reduce-like framework externally through Queue.
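
This kind of external, map-reduce-like parallelism is commonly described as scatter-gather: split the work into genomic intervals, process each interval as an independent job, then merge the results. The sketch below is a rough Python illustration under that assumption. It does not use Queue or GATK, and `process_interval`, `scatter_gather`, and the interval list are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical intervals: (contig, start, end), 1-based inclusive coordinates.
intervals = [("chr1", 1, 1000), ("chr1", 1001, 2000), ("chr2", 1, 500)]

def process_interval(interval):
    """Stand-in for one independent job, e.g. one tool run restricted to an interval."""
    contig, start, end = interval
    return {"interval": interval, "bases": end - start + 1}

def scatter_gather(intervals, workers=2):
    # Scatter: run one job per interval; pool.map() returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_interval, intervals))
    # Gather: merge the per-interval results into one summary value.
    return sum(result["bases"] for result in partial_results)

total_bases = scatter_gather(intervals)
print(total_bases)
```

On a cluster, each scattered job would typically be a separate process or node rather than a thread; threads are used here only to keep the example self-contained.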

  • Thank you for replying.
    Um, my question is whether GATK can run on MR (yes, MR means MapReduce, i.e. distributed computing).
    The difference from the existing GATK is that GATK currently uses an MR-like framework but does not itself run on MR.
    GATK is already efficient, but I guess GATK on MR would be more efficient, wouldn't it?

  • Carneiro (Charlestown, MA; Member)

    GATK has several levels of parallelism. Read up on this document:
