Hi,
I'm running the UnifiedGenotyper (GATK version 2.0-35) with -glm INDEL on a dataset. When I use -glm SNP everything works fine and I get my SNV calls, but with -glm INDEL I get the following error message:
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (149) > (100) STOP -- this should never happen -- call Mauricio! at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipByReferenceCoordinates(ReadClipper.java:512) [...]
It looks like something is up with the read coordinates. I ran this particular script with -L 1 to limit to chromosome 1 but when I check the first and last reads of this chromosome everything seems to be in order (at least they map within the chromosome coordinates).
Cheers, Paul
Geraldine_VdAuwera
I see -- just checking, and yes you're right that it's an optional step.
This looks like something that was recently fixed. Could you try upgrading to the latest version of GATK, running the same command again, and telling me whether the problem persists?
Geraldine Van der Auwera, PhD
Answers
I just managed to narrow it down to one specific position. The error only occurs when I pass both input files, file1.bam and file2.bam, together; each file on its own runs fine. I can't find anything unusual about the position, though. The pileups look like this:
file1.bam 1 15436436 N 74 <>>>>>>>><<<>t$Tt-6nnnnnn<tTTTtTtt<TtTTTTtttTTt+12ttgtgtgtttgtTTt+12ttgtgtgtttgtTt+12ttgtgtgtttgttt+12ttgtgtgtttgttt+12ttgtgtgtttgtttTttttttttttTttTtTttTTtt^IT^IT CDFF#5HJG@IIH@AHJCBHBIHJBJEJCDC6JJJIFJ?9JFJJJJIJJHJE8EBDHHHE+HDHDHDBDDDD@@
file2.bam 1 15436436 N 45 >><><>T>><>tTtTTTtTTTT+12TTGTGTGTTTGTt+12ttgtgtgtttgtT+12TTGTGTGTTTGTT+12TTGTGTGTTTGTT+12TTGTGTGTTTGTTt+12ttgtgtgtttgtt+12ttgtgtgtttgtt+12ttgtgtgtttgtt+12ttgtgtgtttgttTtTtttttttTT^It FHIIJHFJJDIGEJ@BADHCEHGHIGCJJJJJFJ<FDDDDDDDDD
Could it be the "-6nnnnnn" in file1.bam? I don't see anything else unusual in the bam files, but I would guess it has something to do with one file having a variant and the process of determining the genotype of the other file at that locus.
Cheers, Paul
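For readers comparing the two pileups above, here is a minimal Python sketch that tallies the indel events in a samtools-style pileup base string. It assumes the standard pileup encoding ("+N" or "-N" followed by N bases, "^" followed by a mapping-quality character at a read start); the function name `count_indels` is made up for illustration and is not part of GATK or samtools. The lowercase "n" bases in "-6nnnnnn" most likely just mean that no reference sequence was given when the pileup was generated.

```python
from collections import Counter


def count_indels(bases: str) -> Counter:
    """Count insertion/deletion events in one pileup base string."""
    events = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # '^' marks a read start; the next character is its mapping
            # quality, so skip both (it could otherwise look like '+' or '-').
            i += 2
        elif ch in "+-":
            # '+N' / '-N' is followed by exactly N inserted/deleted bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j])  # raises ValueError on malformed input
            seq = bases[j:j + length]
            # Fold strands together by upper-casing the indel sequence.
            events[ch + str(length) + seq.upper()] += 1
            i = j + length
        else:
            i += 1
    return events


if __name__ == "__main__":
    # Small made-up string, not taken from the thread.
    print(count_indels("tT+2agT-3nnn^It$"))
```

Running it on the fifth column of each pileup line gives a quick per-file summary of which indel alleles are present at the position, which makes the lone 6-base deletion in file1.bam easy to spot next to the 12-base insertions.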
Hi Paul, did you process your file with ReduceReads by any chance?
Geraldine Van der Auwera, PhD
Hi Geraldine, I did not do that. I thought it was an optional step and so far I have no problems with storage so I decided to skip it.
I ran Picard's MarkDuplicates, extracted primary alignments via 'samtools -F 256 [...]' (I then read that the UG filters for that anyway so it was probably unnecessary) and ran the IndelRealigner. The resulting file is the input for my UG run.
Cheers, Paul
Thanks, that did the trick! Cheers, Paul
I just got the same error message with the 1.4-30-gf2ef8d1 version of GATK. It's a very cryptic message. What did you do to solve the problem? What causes this message?
Thanks,
Ilya
To solve the problem you need to update to the latest version of the GATK.
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
That is true for this specific instance but the error persists with other files I have even in GATK 2.1-12. I haven't gotten around to checking in 2.1-13.
In my cases it's also usually specific to a single variant call, i.e. with the -L option (and some sneaky use of nested intervals) you can find the offending coordinate. From there you can either try to figure out what is going on, or simply split your chromosome at that position and call everything before and after it.
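The nested-interval trick can be sketched as a binary search. This is only an illustration under two assumptions: that a single position triggers the error, and that a run over an interval fails exactly when the interval contains that position. The callable `fails` is hypothetical; in practice it would launch the UnifiedGenotyper with `-L chrom:start-end` and report whether the run crashed.

```python
from typing import Callable


def find_failing_position(start: int, end: int,
                          fails: Callable[[int, int], bool]) -> int:
    """Bisect [start, end] (inclusive) down to the one failing position.

    Assumes fails(start, end) is already known to be True, so whenever the
    left half runs cleanly the culprit must be in the right half.
    """
    while start < end:
        mid = (start + end) // 2
        if fails(start, mid):
            end = mid          # culprit is in the left half
        else:
            start = mid + 1    # left half is clean, so look right
    return start


if __name__ == "__main__":
    # Stand-in for a real GATK run: pretend one fixed position is the culprit.
    culprit = 15436436
    print(find_failing_position(1, 249250621, lambda a, b: a <= culprit <= b))
```

Each step costs one run on half of the remaining interval, so even a whole chromosome needs fewer than 30 runs. If the crash depends on reads that span an interval boundary, the second assumption may not hold and the result should be checked with a small window around the reported position.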
If you can narrow it down to a specific interval that fails, please upload that contained region (using PrintReads to generate a smaller bam for just that interval) and we will definitely take a look.
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
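As a rough sketch of that step, the helper below assembles a PrintReads command line in the GATK 2.x style (`-T`, `-R`, `-I`, `-L`, `-o`). The helper name, the jar location, the file names and the interval in the example are all placeholders, not values from this thread.

```python
import subprocess


def print_reads_command(bams, reference, interval, output,
                        gatk_jar="GenomeAnalysisTK.jar"):
    """Build the argument list for a PrintReads run over one interval."""
    cmd = ["java", "-jar", gatk_jar, "-T", "PrintReads", "-R", reference]
    for bam in bams:
        # One -I per input; pass both files if the error needs both to occur.
        cmd += ["-I", bam]
    cmd += ["-L", interval, "-o", output]
    return cmd


if __name__ == "__main__":
    # Placeholder paths and interval, for illustration only.
    cmd = print_reads_command(["file1.bam", "file2.bam"], "reference.fasta",
                              "1:15436000-15437000", "failing_region.bam")
    print(" ".join(cmd))
    # Uncomment to actually run GATK (requires Java and the GATK jar):
    # subprocess.run(cmd, check=True)
```

It is worth re-running the failing UnifiedGenotyper command on the small bam before uploading it, to confirm that the error still reproduces.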