UnifiedGenotyper glm mode INDEL error

pyl Posts: 9 Member
edited October 2012 in Ask the GATK team

Hi,

I'm running the UnifiedGenotyper (GATK version 2.0-35) with -glm INDEL on a dataset. When I use -glm SNP everything works fine and I get my SNV calls, but if I use INDEL I get the following error message:

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (149) > (100) STOP -- this should never happen -- call Mauricio! at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipByReferenceCoordinates(ReadClipper.java:512) [...]

It looks like something is up with the read coordinates. I ran this particular script with -L 1 to limit it to chromosome 1, but when I check the first and last reads of this chromosome everything seems to be in order (at least they map within the chromosome coordinates).

Cheers, Paul

Answers

  • pyl Posts: 9 Member

    I just managed to narrow it down to one specific position where it happens, and it only happens if I have two input files, file1.bam and file2.bam; for each file separately it runs fine. I can't seem to find anything unusual about it though. The pileups look like this:

    file1.bam 1 15436436 N 74 <>>>>>>>><<<>t$Tt-6nnnnnn<tTTTtTtt<TtTTTTtttTTt+12ttgtgtgtttgtTTt+12ttgtgtgtttgtTt+12ttgtgtgtttgttt+12ttgtgtgtttgttt+12ttgtgtgtttgtttTttttttttttTttTtTttTTtt^IT^IT CDFF#5HJG@IIH@AHJCBHBIHJBJEJCDC6JJJIFJ?9JFJJJJIJJHJE8EBDHHHE+HDHDHDBDDDD@@

    file2.bam

    1 15436436 N 45 >><><>T>><>tTtTTTtTTTT+12TTGTGTGTTTGTt+12ttgtgtgtttgtT+12TTGTGTGTTTGTT+12TTGTGTGTTTGTT+12TTGTGTGTTTGTTt+12ttgtgtgtttgtt+12ttgtgtgtttgtt+12ttgtgtgtttgtt+12ttgtgtgtttgttTtTtttttttTT^It FHIIJHFJJDIGEJ@BADHCEHGHIGCJJJJJFJ<FDDDDDDDDD

    Could it be the "-6nnnnnn" in file1.bam? I don't see anything unusual happening in the BAM files otherwise, but I would guess it has something to do with one file having a variant and the process of determining the genotype of the other file at that locus.

    Cheers, Paul
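
For readers decoding the pileup strings quoted in the post above: in the samtools pileup read-bases column, "-6nnnnnn" denotes a 6-base deletion and "+12ttgtgtgtttgt" a 12-base insertion starting after this position. The following is only a minimal sketch for tallying such events in one read-bases column; the function name summarize_pileup_bases is made up for illustration and is not part of samtools or the GATK.

```python
import re
from collections import Counter


def summarize_pileup_bases(bases: str) -> Counter:
    """Tally event types in the read-bases column of one samtools pileup line.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # '^' marks a read start; the next character encodes mapping quality.
            i += 2
        elif ch == "$":
            # '$' marks a read end and carries no base of its own.
            i += 1
        elif ch in "+-":
            # '+N<seq>' is an insertion and '-N<seq>' a deletion of N bases.
            match = re.match(r"\d+", bases[i + 1:])
            if match is None:
                raise ValueError(f"indel without a length at offset {i}")
            digits = match.group()
            length = int(digits)
            seq_start = i + 1 + len(digits)
            seq = bases[seq_start:seq_start + length]
            kind = "insertion" if ch == "+" else "deletion"
            counts[f"{kind}:{length}:{seq.upper()}"] += 1
            i = seq_start + length
        elif ch in "<>":
            # '<' and '>' mark reference skips (e.g. spliced reads).
            counts["ref_skip"] += 1
            i += 1
        elif ch == "*":
            # '*' is a placeholder for a base deleted in this read.
            counts["deleted_base"] += 1
            i += 1
        else:
            counts["base"] += 1
            i += 1
    return counts


if __name__ == "__main__":
    # A shortened fragment in the style of the file1.bam pileup.
    print(summarize_pileup_bases("t$Tt-6nnnnnn<tT+12ttgtgtgtttgt^IT"))
```

Running this on the bases column from each file separately makes it easier to see which indel alleles each BAM supports at the failing position.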

  • Geraldine_VdAuwera Posts: 6,408 Administrator, GATK Developer admin

    Hi Paul, did you process your file with ReduceReads by any chance?

    Geraldine Van der Auwera, PhD

  • pyl Posts: 9 Member

    Hi Geraldine, I did not do that. I thought it was an optional step, and so far I have no problems with storage, so I decided to skip it.

    I ran Picard's MarkDuplicates, extracted primary alignments via 'samtools view -F 256 [...]' (I then read that the UG filters for that anyway, so it was probably unnecessary) and ran the IndelRealigner. The resulting file is the input for my UG run.

    Cheers, Paul
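
As a side note on the filtering step described above: the -F option of samtools view drops any read whose FLAG has at least one of the given bits set, and 256 (0x100) is the SAM "secondary alignment" bit. A minimal sketch of that test, with a made-up function name, might look like this:

```python
SECONDARY_ALIGNMENT = 0x100  # decimal 256, the SAM "secondary alignment" FLAG bit


def kept_by_exclude_256(flag: int) -> bool:
    """Return True if a read with this SAM FLAG passes an exclude filter of 256.

    Hypothetical helper; it mirrors the bit test only, not samtools itself.
    """
    return (flag & SECONDARY_ALIGNMENT) == 0


if __name__ == "__main__":
    for flag in (0, 16, 256, 272, 1024):
        print(flag, kept_by_exclude_256(flag))
```

Other bits, such as the duplicate bit (1024) set by MarkDuplicates, are unaffected by this filter.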

  • pyl Posts: 9 Member

    Thanks, that did the trick! Cheers, Paul

  • ichorny Posts: 1

    I just got the same error message with the 1.4-30-gf2ef8d1 version of GATK. It's a very cryptic message. What did you do to solve the problem? What causes this message?

    Thanks,

    Ilya

  • ebanks Posts: 683 GATK Developer mod

    To solve the problem you need to update to the latest version of the GATK.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • pyl Posts: 9 Member

    @ebanks said: To solve the problem you need to update to the latest version of the GATK.

    That is true for this specific instance, but the error persists with other files I have, even in GATK 2.1-12. I haven't gotten around to checking in 2.1-13.

    In my case it's also usually specific to a single variant call, i.e. with the -L option (and some sneaky use of nested intervals) you can find the offending coordinate, and then either try to figure out what is going on or simply split your chromosome at that position and call everything before and after it.
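
The nested-interval search described in the post above is essentially a bisection. Here is a rough sketch of the idea, assuming there is exactly one offending position and that any -L interval containing it reproduces the crash (this may not hold if the failure depends on reads from outside the interval). The fails callback is hypothetical: in practice it would run the UnifiedGenotyper on the given start-stop interval and report whether it crashed.

```python
from typing import Callable


def find_failing_position(start: int, stop: int,
                          fails: Callable[[int, int], bool]) -> int:
    """Bisect the inclusive range [start, stop] down to the single failing position.

    `fails(a, b)` is a hypothetical callback that returns True when a run
    restricted to positions a..b reproduces the error.
    """
    if not fails(start, stop):
        raise ValueError("the full interval does not reproduce the failure")
    while start < stop:
        mid = (start + stop) // 2
        if fails(start, mid):
            # The offending position lies in the left half.
            stop = mid
        else:
            # Otherwise it must lie in the right half.
            start = mid + 1
    return start


if __name__ == "__main__":
    # Stand-in for a real run: pretend the crash happens at 1:15436436.
    def fake_fails(a: int, b: int) -> bool:
        return a <= 15436436 <= b

    print(find_failing_position(1, 20_000_000, fake_fails))
```

Each call to fails corresponds to one full genotyper run, so the search needs roughly log2 of the interval length runs rather than one per position.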

  • ebanksebanks Posts: 683GATK Developer mod

    If you can narrow it down to a specific interval that fails, please upload that contained region (using PrintReads to generate a smaller bam for just that interval) and we will definitely take a look.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT
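
For anyone following the advice above, this is a sketch of how the PrintReads call could be assembled for a GATK 2.x-style command line. All file names, the jar path, and the interval are placeholders rather than values from this thread; adjust them to your own data before running anything.

```python
from typing import List, Sequence


def print_reads_command(gatk_jar: str, reference: str, bams: Sequence[str],
                        interval: str, output: str) -> List[str]:
    """Build an argument list for GATK PrintReads restricted to one interval.

    Hypothetical helper; the paths passed in are assumptions for illustration.
    """
    cmd = ["java", "-jar", gatk_jar, "-T", "PrintReads", "-R", reference]
    for bam in bams:
        # Each input BAM gets its own -I argument.
        cmd += ["-I", bam]
    cmd += ["-L", interval, "-o", output]
    return cmd


if __name__ == "__main__":
    cmd = print_reads_command(
        "GenomeAnalysisTK.jar",   # placeholder jar path
        "reference.fasta",        # placeholder reference
        ["file1.bam", "file2.bam"],
        "1:15436000-15437000",    # placeholder interval around the failing site
        "failing_region.bam",     # placeholder output name
    )
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Since the error reported earlier in the thread only appeared with both inputs together, it may be worth including every BAM that was part of the failing run.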
