Release notes for GATK version 2.3

ebanksebanks Broad InstitutePosts: 689Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Dev, GP Member admin
edited December 2012 in Announcements

GATK 2.3 was released on December 17, 2012. Highlights are listed below. Read the detailed version history overview here:

Base Quality Score Recalibration

  • Soft clipped bases are no longer counted in the delocalized BQSR.
  • The user can now set the maximum allowable cycle with the --maximum_cycle_value argument.

Unified Genotyper

  • Minor (5%) run time improvements to the Unified Genotyper.
  • Fixed bug for the indel model that occurred when long reads (e.g. Sanger) in a pileup led to a read starting after the haplotype.
  • Fixed bug in the exact AF calculation where log10pNonRefByAllele should really be log10pRefByAllele.

Haplotype Caller

  • Fixed the performance of GENOTYPE_GIVEN_ALLELES mode, which often produced incorrect output when passed complex events.
  • Fixed the interaction with the allele biased downsampling (for contamination removal) so that the removed reads are not used for downstream annotations.
  • Implemented minor (5-10%) run time improvements to the Haplotype Caller.
  • Fixed the logic for determining active regions, which was a bit broken when intervals were used in the system.

Variant Annotator

  • The FisherStrand annotation ignores reduced reads (because they are always on the forward strand).
  • Can now be run multi-threaded with -nt argument.

Reduce Reads

  • Fixed bug where sometime the start position of a reduced read was less than 1.
  • ReduceReads now co-reduces bams if they're passed in toghether with multiple -I.

Combine Variants

  • Fixed the case where the PRIORITIZE option is used but no priority list is given.

Phase By Transmission

  • Fixed bug where the AD wasn't being printed correctly in the MV output file.


  • A brand new version of the per site down-sampling functionality has been implemented that works much, much better than the previous version.
  • More efficient initial file seeking at the beginning of the GATK traversal.
  • Fixed the compression of VCF.gz where the output was too big because of unnecessary call to flush().
  • The allele biased downsampling (for contamination removal) has been rewritten to be smarter; also, it no longer aborts if there's a reduced read in the pileup.
  • Added a major performance improvement to the GATK engine that stemmed from a problem with the NanoSchedule timing code.
  • Added checking in the GATK for mis-encoded quality scores.
  • Fixed downsampling in the ReadBackedPileup class.
  • Fixed the parsing of genome locations that contain colons in the contig names (which is allowed by the spec).
  • Made ID an allowable INFO field key in our VCF parsing.
  • Multi-threaded VCF to BCF writing no longer produces an invalid intermediate file that fails on merging.
  • Picard jar remains at version 1.67.1197.
  • Tribble jar updated to version 119.
Post edited by Geraldine_VdAuwera on

Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT


  • severinseverin Posts: 5Member

    In the new release for the HaplotypeCaller function does downsampling do anything?
    Can we still use --enable_experimental_downsampling?

    What is your recommendation.

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 8,428Administrator, GATK Dev admin

    You no longer need to use --enable_experimental_downsampling for anything; the experimental downsampling is now the regular downsampling (see Version history/Version highlights for details) and is used by default by all tools that downsample reads.

    Geraldine Van der Auwera, PhD

Sign In or Register to comment.