The current GATK version is 3.8-0

GATK version 4.beta.3 (i.e. the third beta release) is out. See the GATK4 beta page for download and details.

# Version highlights for GATK version 3.8

edited July 29

One more 3.x version, for the road! That's right, even as we're ramping up our efforts on GATK4 (we're three beta releases in at this point, and getting down to brass tacks writing the migration guide ahead of the 4.0 general release) we still found it worthwhile to cut one last release of GATK3.

Our main motivation here is to introduce the Intel Genomics Kernel Library, which comes bearing the gift of speed improvements for those of you who won't be able to migrate to GATK4 right away.

As a secondary benefit, this version includes a handful of bug fixes, some usability improvements including better error messages, documentation fixes and logging tweaks, and a few improvements to annotation calculations (especially in allele-specific mode), which you'll find described briefly in the release notes. No big changes though, except perhaps the new default behavior of VariantsToTable with regard to missing annotation values, discussed below. Finally, we've committed a copy of all the peripheral documentation (= the docs that live in the forum and complement the tool documentation) to the now-old GATK codebase.

And thus, the last-ever GATK3 version emerges covered in carbonite.

### Introducing the Intel Genomics Kernel Library

The Genomics Kernel Library or GKL is an open-source library developed by our collaborators at Intel that provides accelerated versions of algorithms, i.e. "kernels", used in genomics tools. These kernels are optimized to run on Intel Architecture under 64-bit Linux and Mac OSX. They're plugged into the GATK in such a way that they will be automatically used if your computing hardware supports them, but if it doesn't they will remain inactive and the "default" generic Java versions will be used instead.

At the moment there are three main kernels included:

• Intel inflater/deflater: a file compression/decompression kernel that provides different levels of compression (with correspondingly variable speedups). This replaces the JDK inflater/deflater and is now activated by default. It can be disabled by using the -jdk_deflater and -jdk_inflater flags.

• Intel chip optimization for PairHMM: a version of the PairHMM algorithm used by HaplotypeCaller to calculate genotype likelihoods that runs faster on Intel hardware. It can be disabled by setting -pairHMM LOGLESS_CACHING, for example if you need completely deterministic behavior across different machine types (at the expense, of course, of speed).

• FPGA support for PairHMM: another version of the PairHMM algorithm, this one designed to run on FPGAs, which are a type of processor that is gaining popularity for computing applications that require extremely high speed. The FPGA support in this version is fairly experimental so we can't guarantee results, but if you have access to this specialized hardware we definitely encourage you to try it out and let us know how it goes.
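As a rough illustration of the opt-out flags listed above, here is a minimal Python sketch that assembles a GATK3 HaplotypeCaller command line with the Intel kernels switched off. The jar name, the file names and the helper `build_hc_command` are placeholders invented for this example; only the three opt-out flags come from the list above.

```python
import shlex


def build_hc_command(reference, bam, out_vcf, use_intel_kernels=True):
    """Assemble a GATK3 HaplotypeCaller command line as a list of arguments.

    All paths are placeholders; adjust them to your own setup.
    """
    cmd = [
        "java", "-jar", "GenomeAnalysisTK.jar",  # assumed jar location
        "-T", "HaplotypeCaller",
        "-R", reference,
        "-I", bam,
        "-o", out_vcf,
    ]
    if not use_intel_kernels:
        # Opt out of the Intel inflater/deflater and the Intel PairHMM,
        # using the flags named in the kernel list above.
        cmd += ["-jdk_deflater", "-jdk_inflater", "-pairHMM", "LOGLESS_CACHING"]
    return cmd


# Print a shell-ready command with every Intel kernel disabled.
cmd = build_hc_command("ref.fasta", "sample.bam", "sample.vcf",
                       use_intel_kernels=False)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Leaving `use_intel_kernels` at its default produces the plain command, in which case the kernels are picked up automatically if the hardware supports them.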

VariantsToTable is a tool we're quite fond of because it allows us to extract just the information we want from VCFs when we want to probe a callset interactively, typically for filtering purposes. Previously we had to tell it explicitly not to freak out if it came across any sites or genotypes where an annotation we requested was missing; but realistically, there are always some sites for which we can't calculate some annotations (like ranksum annotations at sites where we don't have any heterozygous samples), so that was annoying. Now we've flipped the behavior so that by default the tool keeps going and just outputs "NA" anywhere it encounters such sites or genotypes, unless you specify that it should freak out by using the `--errorIfMissingData` flag.
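To make the new default concrete, here is a small Python sketch that mimics the behavior described above on toy data. It is not the tool's actual implementation: the function `rows_to_table`, its `error_if_missing` parameter and the records are all made up for illustration.

```python
def rows_to_table(records, fields, error_if_missing=False):
    """Turn a list of {field: value} records into tab-separated lines.

    Missing values become "NA" unless error_if_missing is set, which
    loosely mirrors the opt-in strict mode described above.
    """
    lines = ["\t".join(fields)]
    for record in records:
        cells = []
        for field in fields:
            value = record.get(field)
            if value is None:
                if error_if_missing:
                    raise ValueError(f"missing value for {field}")
                value = "NA"
            cells.append(str(value))
        lines.append("\t".join(cells))
    return lines


# Toy records: the second site has no ranksum value, as can happen when
# no sample is heterozygous there.
records = [
    {"CHROM": "20", "POS": 10001, "ReadPosRankSum": 0.52},
    {"CHROM": "20", "POS": 10050},
]
for line in rows_to_table(records, ["CHROM", "POS", "ReadPosRankSum"]):
    print(line)
```

Calling the same function with `error_if_missing=True` stops at the second record instead of writing "NA", which is the stricter behavior you now have to ask for explicitly.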

### Documentation archive and deprecation plans

In preparation for the general release of GATK4 (in the form of a 4.0 version), we made a copy of all the peripheral (forum-based) documentation in its current state and archived it in the codebase itself here. This is intended to be a permanent archive for documentation that we are phasing out in favor of GATK4-focused documentation.

Our ultimate goal is to provide some degree of continuity and support for users who cannot migrate to GATK4 right away and must continue to use older versions, without leaving too much clutter around that might confuse everyone else.

In the immediate future we will delete three sets of documents from the forum (and therefore from the website):

• "Developer Zone": replaced in GATK4 by a developer-oriented Wiki in the GitHub repository;
• "Queue": superseded for all versions by Cromwell+WDL;
• The current contents of "Archive", which have typically been replaced by individual articles linked at the top of the deprecated article.

Within the other documentation sections, articles may get updated in place or moved to the Archive for future removal. Versioned tool documentation going back to 3.5-0 will remain available on the website for the foreseeable future. For older versions, the documentation can be built from source. Finally, the Best Practices section of the website will be updated to reflect the new world order once GATK 4.0 is released and becomes the officially supported version of GATK. Going forward we'll have versioned Best Practices accompanied by a publicly available WDL script for each major use case. We'll post more details of what this will look like in the coming weeks.

• KielMember

Hi,

Is there a list of which Intel CPUs will support GKL?
I always need a reason for my boss to buy new hardware.

Hah, no kidding

I'm not aware of any such list but if you're interested I can put you in touch with the folks at Intel who can best tell you that.

• TurkeyMember

Any AVX-compatible Intel CPU (Sandy Bridge, Sandy Bridge-EP, Core i3/i5/i7, Xeon E3/E5... and above) should give decent acceleration, I think. I have seen a nice boost after 3.8 even though I don't use multithreading in most of my workflow. I don't use multithreading other than for BWA and BQSR because I need the bamout in HC, and I noticed (with my humble testing, of course; YMMV) that concurrent sample workflows are faster than multithreading a single sample with all you have. [4 WES samples are completed with annotation and all QC extras in 5 hours on average.]

• KielMember

4 WES samples in 5 hours sounds fast. Can you give me a rough description of the system you are using (CPU/memory)?

@Geraldine_VdAuwera
That would be nice. It would also be interesting to know which Intel CPUs support FPGA right now.

Another question is whether Mutect2 also profits from the faster PairHMM calculation.

• TurkeyMember
edited August 11