Patch for BadCigar filtering on Novoalign reads containing zero length CIGAR elements

chapmanb, Boston, MA (Member)

I'm running into a HaplotypeCaller issue with the latest release (2.5-2) using
Novoalign input reads. Here's a small reproducible input file:

https://s3.amazonaws.com/chapmanb/gatk_hc_problem_cigar.bam

Running:

java -Xms750m -Xmx3g -jar GenomeAnalysisTK.jar -R GRCh37.fa \
  -I problem_cigar.bam -L 4:120371315-120371586 -T HaplotypeCaller -o out.vcf \
  --read_filter BadCigar -debug

Errors out with:

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (0) >
(-1) STOP -- this should never happen, please check read:
HWI-ST1124:106:C15APACXX:1:1107:15450:87092 2/2 58b aligned read. (CIGAR: 38H4D58M)

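Looking at the read, the CIGAR string appears to be tricking the BadCigar
filter. The insertion and deletion are effectively adjacent, but a zero-length
0M element sits between them, so the filter does not see them as neighbors
and lets the read through:

38M4I0M4D58M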

This patch fixes the BadCigar filter by only considering CIGAR elements with
non-zero length:

https://gist.github.com/chapmanb/5568411
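
To illustrate the idea, here is a minimal Python sketch. It is not the GATK Java code from the gist, and it assumes the filter's relevant rule is "reject reads with an insertion directly next to a deletion". The function names and the skip_zero_length switch are made up for this example.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM spec.
_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")
_WHOLE = re.compile(r"(?:\d+[MIDNSHP=X])+")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    if not _WHOLE.fullmatch(cigar):
        raise ValueError("not a valid CIGAR string: %r" % cigar)
    return [(int(n), op) for n, op in _ELEMENT.findall(cigar)]


def has_adjacent_indels(cigar, skip_zero_length=True):
    """Return True if an I or D element directly follows another I or D.

    With skip_zero_length=True (the behaviour the fix describes), elements
    of length 0 are ignored, so 4I0M4D counts as adjacent indels.
    With skip_zero_length=False, the 0M element separates them.
    """
    previous = None
    for length, op in parse_cigar(cigar):
        if skip_zero_length and length == 0:
            continue  # a zero-length element consumes nothing; ignore it
        if op in "ID" and previous is not None and previous in "ID":
            return True
        previous = op
    return False


if __name__ == "__main__":
    read_cigar = "38M4I0M4D58M"  # the CIGAR from the problem read
    print(has_adjacent_indels(read_cigar, skip_zero_length=False))
    print(has_adjacent_indels(read_cigar, skip_zero_length=True))
```

Run on the problem read, the check without zero-length skipping reports no adjacent indels, while the check that skips zero-length elements flags the read.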

With this applied, the read is properly filtered and HaplotypeCaller can
continue without a problem. Hope this helps. Please let me know if any other
details about the problem would be useful.

Comments

  • Carneiro, Charlestown, MA (Member)

    Hi Chapman, thank you for identifying it and sending the patch. I will create a test internally and review your patch soon.

    Thanks.

  • chapmanb, Boston, MA (Member)

    Mauricio;
    Thanks much for looking at this. Is there any other information I can provide to help get this into either a 2.5.x release or 2.6? I'm doing comparison tests with 2.5 and would love to be able to share and reproduce without requiring others to grab my patched copy of 2.5. Thanks again.

  • Mark_DePristo, Broad Institute (Member)

    Sorry, my comments must have been lost somewhere on the forum. I've reviewed the patch and am happy with it, but I cannot actually apply it: patch -p1 fails with an error when I grab your diff. Can you issue a pull request, or send us a standard patch? We can apply it to 2.6 so the nightly will have it and 2.6 will go live with it.

  • chapmanb, Boston, MA (Member)

    Mark;
    Thanks for looking at this and sorry about the patch issues. I'm not sure what happened; it must be some strangeness with the whitespace changes. It's a simple diff, but the re-indentation of the internal block after the if statement makes it look more complicated. Here's a pull request on GitHub:

    https://github.com/broadgsa/gatk/pull/4

    Thanks again
