Important note: with this release the GATK has officially moved to using Java 7.
Small runtime performance improvements contributed by Michael McCowan.
Added fix for the "Removed too many insertions, header is now negative" bug.
Fixed bug that arises in multi-sample mode and causes the tool to crash.
Added --cancer_mode argument to force the user to explicitly enable multi-sample mode.
Runtime performance improvements when calling indels; calling indels in a single sample is almost 2x faster in our tests.
Fixed bug for bad AD values in some cases.
Fixed bug for GENOTYPE_GIVEN_ALLELES mode where it silently fails to genotype indels in some cases.
We have been working hard to reduce the number of false negatives (i.e. missed sites) for the Haplotype Caller and have added a number of improvements to this tool. Its sensitivity is now better than that of the Unified Genotyper in all of our whole genome tests for both SNPs and indels. Feel free to peruse the detailed version history for more information.
The Haplotype Caller now annotates IDs from dbSNP properly.
The Haplotype Caller now emits per-sample DP.
Fixed bug for bad AD values in some cases.
Fixed bug with error: "Only one of refStart or refStop must be < 0, not both" that arose from soft-clipped reads at the beginning of contigs.
Implemented a much improved version of GENOTYPE_GIVEN_ALLELES mode in the Haplotype Caller.
Fixed bug where secondary alignments were not being handled correctly.
Added an overall genotype concordance metric to the output.
Fixed a bug in the printout of molten data in how it treated the genotypes.
Diagnose Targets now has an option to output missing intervals.
Fixed bug where sometimes intervals were emitted out of order.
Fixed bug for reads with indel CIGAR operators (I or D) at the start/end of the read.
Introduced a new tool, AnalyzeCovariates, to generate the BQSR quality assessment plots as a separate step, instead of doing it through the BaseRecalibrator.
We no longer add PASS to the FILTER field of unfiltered records.
The RMSMappingQuality annotation now works properly with reduced reads.
The various rank sum tests no longer use reduced reads in their calculations (because those reads do not represent distinct observations).
Fixed bug in the BaseQualityRankSumTest annotation where it was not actually using the base qualities.
Added a new annotation DepthPerSampleHC that is used by default in the HaplotypeCaller.
James Warren contributed a patch so that reference files whose names contain ".fa" somewhere other than as the suffix parse correctly.
We now emit the GATK version number in the header of VCFs that we produce.
Fixed bug in the up-front downsampling used by the GATK: reduced reads are no longer allowed to be eliminated during downsampling.
dbSNP rsID matching is now smarter: variants are considered matching if they have the same reference allele and at least 1 common alternative allele.
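For readers who want to see the matching rule above spelled out, here is a minimal sketch. It is not GATK code: the function name `alleles_match` and the representation of alleles as plain uppercase strings are assumptions made for illustration, and the sketch assumes the two records have already been matched by position.

```python
def alleles_match(ref_a, alts_a, ref_b, alts_b):
    """Hypothetical helper, not part of the GATK API.

    Returns True when two variant records have the same reference
    allele and share at least one alternate allele.
    """
    # The reference alleles must be identical.
    if ref_a != ref_b:
        return False
    # At least one alternate allele must be common to both records.
    return bool(set(alts_a) & set(alts_b))


# Same reference allele, one shared alternate allele ("T"): a match.
print(alleles_match("A", ["C", "T"], "A", ["T"]))
# Same reference allele, no shared alternate allele: not a match.
print(alleles_match("A", ["C"], "A", ["G"]))
```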
We now warn users about using the GATK with RNA-seq data.
We now check that -compress arguments are within allowable range 0-9.
-rf ReassignMappingQuality can now be used to reassign mapping qualities to 60 before the engine filters them out with MappingQualityUnassigned.
Fixed bug where requesting gzip VCF output with multi-threading was causing the GATK to fail.
We now require a minimum -dcov value of 200 for Locus and ActiveRegion walkers when downsampling to coverage.
Zero-length and repeated cigar elements are collapsed down by default in the engine.
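As a rough illustration of what collapsing zero-length and repeated CIGAR elements could look like, here is a small sketch. It reflects one plausible reading of the item above, not the engine's actual implementation: the function name `collapse_cigar`, the `(length, operator)` tuple representation, and the choice to drop zero-length elements before merging neighbours are all assumptions.

```python
def collapse_cigar(elements):
    """Hypothetical helper, not part of the GATK API.

    `elements` is a list of (length, operator) tuples, e.g. (10, "M").
    Zero-length elements are dropped, and adjacent elements that end up
    with the same operator are merged into one.
    """
    collapsed = []
    for length, op in elements:
        if length == 0:
            # Zero-length elements carry no information; skip them.
            continue
        if collapsed and collapsed[-1][1] == op:
            # Same operator as the previous element: add the lengths.
            prev_length, _ = collapsed[-1]
            collapsed[-1] = (prev_length + length, op)
        else:
            collapsed.append((length, op))
    return collapsed


# 10M 0I 5M 2D collapses to 15M 2D under these assumptions.
print(collapse_cigar([(10, "M"), (0, "I"), (5, "M"), (2, "D")]))
```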
-ds option removed from PrintReads because it was redundant with the engine-level -dfrac argument.
Fixed bug where the --defaultBaseQualities argument didn't always work.
The engine now produces much more accurate read counts for Read traversals.
Count Reads now uses a Long instead of an Integer for counts to prevent overflows.
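To show why the wider type matters, the sketch below simulates signed 32-bit and 64-bit addition in Python using `ctypes`. It is only an illustration of fixed-width overflow, not code from Count Reads; the helper names `add_int32` and `add_int64` are made up for this example.

```python
import ctypes

# Largest value a signed 32-bit integer can hold (Java's Integer.MAX_VALUE).
INT32_MAX = 2**31 - 1


def add_int32(a, b):
    """Add two numbers the way a signed 32-bit counter would (wraps on overflow)."""
    return ctypes.c_int32(a + b).value


def add_int64(a, b):
    """Add two numbers the way a signed 64-bit counter would."""
    return ctypes.c_int64(a + b).value


# Counting one more read past the 32-bit limit wraps to a negative number...
print(add_int32(INT32_MAX, 1))
# ...while a 64-bit counter keeps counting correctly.
print(add_int64(INT32_MAX, 1))
```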
Locus Walkers now only try to clip adaptors when the two reads of a pair are on opposite strands.
Fixed VCF issue where PLs were capped at 32767.
Picard/Tribble/Variant jars updated to version 1.91.1453.
Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT