Patch for BadCigar filtering on Novoalign reads containing zero length CIGAR elements

chapmanb (Member, Posts: 19)

I'm running into a HaplotypeCaller issue with the latest release (2.5-2) using Novoalign input reads. Here's a small reproducible input file:


java -Xms750m -Xmx3g -jar GenomeAnalysisTK.jar -R GRCh37.fa \
  -I problem_cigar.bam -L 4:120371315-120371586 -T HaplotypeCaller \
  -o out.vcf --read_filter BadCigar -debug

Errors out with:

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (0) >
(-1) STOP -- this should never happen, please check read:
HWI-ST1124:106:C15APACXX:1:1107:15450:87092 2/2 58b aligned read. (CIGAR: 38H4D58M)

Looking at the read, the CIGAR string appears to be tricking the BadCigar filter, since it has a 0M element between an insertion and deletion:


This patch fixes the BadCigar filter by only considering CIGAR elements with non-zero length:

With this applied, the read is properly filtered and HaplotypeCaller can continue without a problem. Hope this helps; please let me know if any other details about the problem would be useful.
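
As a rough illustration of the idea (not the actual patch), here is a minimal, self-contained sketch in plain Java. It assumes the relevant check is "an insertion directly next to a deletion" and does not use any GATK or Picard classes. The class and method names (ZeroLengthCigarSketch, hasAdjacentIndels) and the example CIGAR 10M2I0M4D58M are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: not the GATK BadCigar filter, only the idea behind the fix.
class ZeroLengthCigarSketch {

    // One CIGAR element: a length and an operator character.
    static final class Element {
        final int length;
        final char op;

        Element(int length, char op) {
            this.length = length;
            this.op = op;
        }
    }

    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    // Split a CIGAR string such as "10M2I0M4D58M" into its elements.
    static List<Element> parse(String cigar) {
        List<Element> elements = new ArrayList<>();
        Matcher m = ELEMENT.matcher(cigar);
        int end = 0;
        while (m.find()) {
            if (m.start() != end) {
                throw new IllegalArgumentException("Malformed CIGAR: " + cigar);
            }
            elements.add(new Element(Integer.parseInt(m.group(1)), m.group(2).charAt(0)));
            end = m.end();
        }
        if (end != cigar.length()) {
            throw new IllegalArgumentException("Malformed CIGAR: " + cigar);
        }
        return elements;
    }

    // True if an insertion sits directly next to a deletion.
    // With skipZeroLength set, zero-length elements are ignored first,
    // so a 0M between an I and a D no longer hides the adjacency.
    static boolean hasAdjacentIndels(String cigar, boolean skipZeroLength) {
        char previous = 0;
        for (Element e : parse(cigar)) {
            if (skipZeroLength && e.length == 0) {
                continue;
            }
            boolean indelPair = (previous == 'I' && e.op == 'D')
                    || (previous == 'D' && e.op == 'I');
            if (indelPair) {
                return true;
            }
            previous = e.op;
        }
        return false;
    }

    public static void main(String[] args) {
        String cigar = "10M2I0M4D58M"; // made-up example, not the read from this report
        System.out.println(hasAdjacentIndels(cigar, false)); // false: the 0M masks the I/D pair
        System.out.println(hasAdjacentIndels(cigar, true));  // true: the I/D pair is detected
    }
}
```

Skipping zero-length elements before comparing neighbours means the check sees the same sequence of operations that the downstream code effectively works with.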

Brad Chapman, Bioinformatics Core at Harvard School of Public Health

