BaseRecalibrator in 2.4-9 RUNTIME ERROR

brdido Posts: 23 Member
edited March 2013 in Ask the GATK team

Hello,

Sorry if I missed the same problem in other threads in the forum, but we are having trouble running BaseRecalibrator on one sample and I couldn't find the solution.

I tried many steps, and here is what I've found so far:

1 - Other samples work fine

2 - Running Picard ValidateSamFile on realigned.bam (after IndelRealigner) gives many errors:
2a - Mate negative strand flag does not match read negative strand flag of mate
2b - Mate alignment does not match alignment start of mate
3c - Value was put into PairInfoMap more than once. (fatal)

3 - Running BaseRecalibrator with option -L 1:428-249250621 works fine!

After the fact that -L works fine I discarded the problem in vcf files and reference file. I don't know how to go further in this investigation since GATK 1 realigned.bam also gives me the errors in (2) and those errors are peanuts compared to the total number of reads.

The big difference here is that we are using bwa7.

Any ideas? Thanks!

(I'm filtering out the "secondary hits" given by bwa7 and will update this thread; if it works, it may be helpful in the future.)

GATK output:

INFO 14:11:47,441 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:11:47,443 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.4-9-g532efad, Compiled 2013/03/19 07:35:36
INFO 14:11:47,443 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:11:47,443 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 14:11:47,447 HelpFormatter - Program Args: -nct 8 -T BaseRecalibrator -I /mnt/work/rlb/pac661825//OUT_661825.realigned.bam -R ../data/databases//1KGP/GRCh37_female_exome_mt1kg.fasta --knownSites ../data/databases//dbSNP/dbSNP_137/00-All.vcf -o /mnt/work/rlb/pac661825//OUT_661825.grp
INFO 14:11:47,447 HelpFormatter - Date/Time: 2013/03/26 14:11:47
INFO 14:11:47,447 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:11:47,447 HelpFormatter - --------------------------------------------------------------------------------
INFO 14:11:47,458 ArgumentTypeDescriptor - Dynamically determined type of ../data/databases/dbSNP/dbSNP_137/00-All.vcf to be VCF
INFO 14:11:47,500 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:11:47,558 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 14:11:47,565 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:11:47,577 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 14:11:47,587 RMDTrackBuilder - Loading Tribble index from disk for file ../data/databases/dbSNP/dbSNP_137/00-All.vcf
INFO 14:11:47,704 MicroScheduler - Running the GATK in parallel mode with 8 total threads, 8 CPU thread(s) for each of 1 data thread(s), of 8 processors available on this machine
INFO 14:11:47,745 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO 14:11:47,750 GenomeAnalysisEngine - Done creating shard strategy
INFO 14:11:47,750 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 14:11:47,750 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
INFO 14:11:47,773 BaseRecalibrator - The covariates being used here:
INFO 14:11:47,773 BaseRecalibrator - ReadGroupCovariate
INFO 14:11:47,773 BaseRecalibrator - QualityScoreCovariate
INFO 14:11:47,773 BaseRecalibrator - ContextCovariate
INFO 14:11:47,774 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
INFO 14:11:47,774 BaseRecalibrator - CycleCovariate
INFO 14:11:47,776 ReadShardBalancer$1 - Loading BAM index data for next contig
INFO 14:11:47,777 ReadShardBalancer$1 - Done loading BAM index data for next contig
INFO 14:12:18,626 ProgressMeter - 1:15956928 1.10e+06 30.0 s 28.0 s 0.5% 95.1 m 94.6 m
INFO 14:12:48,655 ProgressMeter - 1:34102053 2.70e+06 60.0 s 22.0 s 1.1% 89.0 m 88.0 m
INFO 14:13:18,685 ProgressMeter - 1:59096606 4.50e+06 90.0 s 20.0 s 1.9% 77.1 m 75.6 m
INFO 14:13:48,714 ProgressMeter - 1:103467532 5.90e+06 120.0 s 20.0 s 3.4% 58.7 m 56.7 m
INFO 14:14:18,745 ProgressMeter - 1:153234111 7.50e+06 2.5 m 20.0 s 5.0% 49.5 m 47.0 m
INFO 14:14:48,774 ProgressMeter - 1:172414433 9.30e+06 3.0 m 19.0 s 5.7% 53.1 m 50.1 m
INFO 14:15:19,054 ProgressMeter - 1:208266349 1.10e+07 3.5 m 19.0 s 6.9% 51.3 m 47.8 m
INFO 14:15:49,095 ProgressMeter - 1:247611815 1.27e+07 4.0 m 19.0 s 8.2% 49.3 m 45.2 m
INFO 14:15:56,507 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (0) > (-1) STOP -- this should never happen -- call Mauricio!
    at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipByReferenceCoordinates(ReadClipper.java:537)
    at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipByReferenceCoordinatesLeftTail(ReadClipper.java:176)
    at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipAdaptorSequence(ReadClipper.java:389)
    at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipAdaptorSequence(ReadClipper.java:392)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:244)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:131)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsMap.apply(TraverseReadsNano.java:230)
    at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsMap.apply(TraverseReadsNano.java:218)
    at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler$ReadMapReduceJob.run(NanoScheduler.java:471)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:679)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.4-9-g532efad):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: START (0) > (-1) STOP -- this should never happen -- call Mauricio!
ERROR ------------------------------------------------------------------------------------------

Answers

  • brdido Posts: 23 Member

    I forgot to say that it would be nice to give us Mauricio's phone number! :D

  • Carneiro Posts: 275 Administrator, GATK Developer
    edited March 2013

    It's 'call Mauricio' in the epic sense of the word. Like screaming to the mountains, calling for him.

  • brdido Posts: 23 Member
    edited March 2013

    and he replies!

    So, I was wrong about the locus of the problem. It's in 2:7273810-7273820, which still gives me the error. I was able to extract those reads from the BAM, and Picard ValidateSamFile gives me:

    ERROR: Read name HWI-ST993:370:D1PHEACXX:3:1206:13326:14308, Mate not found for paired read
    ERROR: Read name HWI-ST993:370:D1PHEACXX:3:1211:15788:99570, Mate not found for paired read

    Am I on the right path?

  • brdido Posts: 23 Member

    I made a mistake; please disregard my last comment. I built the "small bam" incorrectly, and the reads' mates do exist. So I'm back to the problem at the 2:7273810-7273820 locus.

  • brdido Posts: 23 Member

    The original reference in gatk bundle gives me the same error.

  • brdido Posts: 23 Member
    edited March 2013

    I tested with all reads, filtering out the two above, and it runs as far as chr10 before hitting the next similar error. So it must be something in my pipeline before GATK? This was done with bwa 0.7 mem, sorted with novosort, then IndelRealigner.

    Here are the reads I filtered out (the ones I believe are problematic):

    HWI-ST993:370:D1PHEACXX:3:1206:13326:14308 147 2 7273691 42 24D101M = 7273694 -122 GGTACTCTGCTGGGCTGGTTCCCTGGTACTCTGCTGGGCTGGTTCCCTGGTACTCAGCTGGCCTGGCCCCCTGCTGCTCTGCTAGCCTCATCCCCTGCCAC AA>:@DBCD@BDBABA?BBDB@>@CAC@DDBDBBBDDDDBB@DBDCACC?7?BFHEEACIGIIHIIJJJJIJIJGEGGCJJIIJJJJJGHHHDFDDFF@@@ RG:Z:661825 NM:i:25 MQ:i:42 AS:i:66 XS:i:54
    HWI-ST993:370:D1PHEACXX:3:1206:13326:14308 99 2 7273694 42 101M = 7273691 122 ACTCTGCTGGGCTGGTTCCCTGGTACTCTGCTGGGCTGGTTCCCTGGTACTCTGCTGGGCTGGTTCCCTGGTACTCAGCTGGCCTGGCCCCCTGCTGCTCT @CCFFFFFGGHHHJDBGGIJJJJHGIJJJEHIIHIJIIJJJJJJJJJFHIGHIJIJJ@CHIIJGHIEHHHEBCFE?DAEEEDDDDBDDDDDDDDDDCDCDA RG:Z:661825 NM:i:1 MQ:i:42 AS:i:96 XS:i:77
    HWI-ST993:370:D1PHEACXX:3:1211:15788:99570 163 2 7273815 60 101M = 7273862 148 CTCTGCTGGCCTGGGCCCCCTGCTGGCTGCCCATGCCTCTCTTTCCTGCATGTGTCATGTCCTCTCTATTCCTTGAATGTGCTGAGCTGTGAGCACTCCAG @@CFFFFFFD?FFIIJIJJJJJFIJIDHIEIIEIICGGGGGHGHFGFGHGCGGEDHGFGHGIEGCEHCFHE@;CDDECEEECDECAACCCDCDDDDC@EDEDEEDEDDFFEHHFEEHHE=JIIGIHHEDDGGAD<IIGIJIHFBGDGDIHIHHGHGFIGIGHHHHDFFDDB@@ RG:Z:661825 NM:i:0 MQ:i:60 AS:i:101 XS:i:0
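
For anyone who wants to check the pairing and strand bits on reads like these, here is a minimal Python sketch that decodes the SAM FLAG column (147, 99 and 163 in the records above). The bit meanings are the standard ones from the SAM format; the names `SAM_FLAGS` and `decode_flag` are made up for this illustration and are not part of Picard or GATK.

```python
# Standard SAM FLAG bits; the label strings are my own shorthand.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


if __name__ == "__main__":
    # FLAG values taken from the three reads posted above.
    for flag in (147, 99, 163):
        print(flag, decode_flag(flag))
```

For the first pair, 147 decodes to a reverse-strand, second-in-pair read, and 99 reports its mate as reverse-strand, so those two records agree with each other on strand. The `secondary` bit (0x100) is the one to look at when filtering out secondary hits.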

  • Geraldine_VdAuwera Posts: 5,973 Administrator, GATK Developer

    Even if there is something wrong with your reads, the GATK should handle the error gracefully and tell you what the problem is rather than blowing up, so we will look at this more closely. We may need some more info from you to figure this out -- please stand by and we'll get back to you asap.

    Geraldine Van der Auwera, PhD

  • brdido Posts: 23 Member

    Thanks! I'll be glad to help, and I'll update the issue if I find any more clues.

  • Carneiro Posts: 275 Administrator, GATK Developer
    edited March 2013

    There are several things here that make me nervous.

    First, your BAM file is not passing validation. You say you are using bwa7; are you referring to BWA 0.7.3a? If so, are you using bwa aln, sw or mem? If it's mem, are you using the -M flag? All are good leads as to why your BAM is malformed.

    Second, you say that validation of the BAM file only produces errors after indel realignment? Is it passing validation before indel realignment?

    Third, you should never run BaseRecalibrator with -L unless you really know what you're doing. -L 1:428-249250621 is not nearly enough data to calibrate the error model for base recalibration. When you say 'it works fine' what do you mean? That it doesn't give you errors, or that it produces the right output? Have you looked at the recalibration plots?

    I did not understand this sentence:

    "After the fact that -L works fine I discarded the problem in vcf files and reference file. I don't know how to go further in this investigation since GATK 1 realigned.bam also gives me the errors in (2) and those errors are peanuts compared to the total number of reads."

    can you clarify?

  • brdido Posts: 23 Member
    edited March 2013

    "First, your BAM file is not passing validation. You say you are using bwa7; are you referring to BWA 0.7.3a? If so, are you using bwa aln, sw or mem? If it's mem, are you using the -M flag? All are good leads as to why your BAM is malformed."

    Yes, I'm referring to BWA 0.7.3a-r367, using 'bwa mem' with the -M flag. It's important to note that I'm using novosort and Picard MarkDuplicates before getting into GATK. I'm reviewing each step right now.

    "Second, you say that validation of the BAM file only produces errors after indel realignment? Is it passing validation before indel realignment?"

    Errors 2a and 2b also occur before indel realignment, BUT error 2c (3c in the original message, sorry about that) only happens after IndelRealigner.

    "Third, you should never run BaseRecalibrator with -L unless you really know what you're doing. -L 1:428-249250621 is not nearly enough data to calibrate the error model for base recalibration. When you say 'it works fine' what do you mean? That it doesn't give you errors, or that it produces the right output? Have you looked at the recalibration plots?"

    I only used the -L option to narrow down where the error occurs. When I say 'it works fine' I mean I don't get the error.
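
Using -L to box in a failure like this can be done systematically by halving the interval until only a small window still fails. Below is a rough Python sketch of that idea. The callback `fails` is hypothetical: you would write it yourself, for example by running BaseRecalibrator with -L on the given interval and checking whether the runtime error appears. The sketch assumes there is one offending locus inside the starting interval, and it is meant for debugging only, not for producing a recalibration table.

```python
def format_interval(interval):
    """Render (contig, start, end) in the contig:start-end form used with -L."""
    contig, start, end = interval
    return f"{contig}:{start}-{end}"


def narrow_failure(interval, fails, min_width=10):
    """Halve the interval repeatedly, keeping whichever half still fails.

    `fails` is a user-supplied (hypothetical) callback taking a
    (contig, start, end) tuple and returning True if the run crashes on it.
    """
    contig, start, end = interval
    while end - start + 1 > min_width:
        mid = (start + end) // 2
        if fails((contig, start, mid)):
            end = mid
        elif fails((contig, mid + 1, end)):
            start = mid + 1
        else:
            # Neither half fails on its own (e.g. a read spanning the split),
            # so stop and report the current window.
            break
    return (contig, start, end)


if __name__ == "__main__":
    # Toy stand-in for a real run: pretend the crash happens whenever
    # the interval covers position 7273815 on contig 2.
    bad_position = 7273815

    def fake_fails(interval):
        _, start, end = interval
        return start <= bad_position <= end

    window = narrow_failure(("2", 1, 10_000_000), fake_fails)
    print(format_interval(window))
```

Each step costs one or two runs, so a whole chromosome narrows to a window of a few bases in a few dozen runs. In practice each run is slow, so starting from the last location reported by the progress meter saves time.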

    "I did not understand this sentence:

    'After the fact that -L works fine I discarded the problem in vcf files and reference file. I don't know how to go further in this investigation since GATK 1 realigned.bam also gives me the errors in (2) and those errors are peanuts compared to the total number of reads.'"

    I mixed up 2 things:

    1 - The parameter -L gave me a hint that the reference file and VCF are ok.
    2 - I get the same errors (2a and 2b) from SamFileValidator in the bwa 0.6.2-r126 + GATK-1.6-13 pipeline, and GATK doesn't complain about them. Considering the number of reads in the sample, I wouldn't worry about it, because there are only a few of these errors (2a and 2b).

    Is it clearer now?

  • Carneiro Posts: 275 Administrator, GATK Developer
    edited March 2013

    2a and 2b are intrinsic to your data, but shouldn't matter to the GATK.

    Error 2c will only happen if you use BWA MEM, because it splits single read alignments into multiple ones. I have updated the indel realigner to understand those reads, but we would have to check it on your data to see what's going on.

    Geraldine can coordinate with you to send us a snippet of the offending BAM file so we can debug the Indel Realigner to see what's happening here.

  • Geraldine_VdAuwera Posts: 5,973 Administrator, GATK Developer

    Hi @brdido,

    As Mauricio said you'll need to upload a snippet of your bam file so we can reproduce the error locally. Please see the detailed instructions here: http://www.broadinstitute.org/gatk/guide/article?id=1894

    Let me know if you need any help to do this.

    Geraldine Van der Auwera, PhD

  • brdido Posts: 23 Member

    Ok! Thanks a lot. I'll prepare the files needed and get back to you.

  • isaac Posts: 3 Member

    Hi,

    I seem to be getting the same error. We have reads mapped with stampy, duplicates removed with Picard, and indel realignment done with GATK. Then I attempted to run BaseRecalibrator.

    Args: -T BaseRecalibrator -R ../../pantro3/panTro3.bamorder.fasta -I ../indel_realigner/dennis.realigned.bam -knownSites ../init_calling_per_chrom/pedigree.gatk.raw.vcf -o dennis.recal.grp

    Error:

    INFO 20:44:03,973 ProgressMeter - chr10:130748575 1.03e+08 2.9 h 100.0 s 10.9% 26.3 h 23.5 h
    INFO 20:45:03,991 ProgressMeter - chr10:132896073 1.03e+08 2.9 h 100.0 s 10.9% 26.3 h 23.5 h
    INFO 20:46:04,009 ProgressMeter - chr10_GL391380_random:214 1.04e+08 2.9 h 100.0 s 11.0% 26.4 h 23.5 h
    INFO 20:47:04,029 ProgressMeter - chr10_AACZ03166204_random:694 1.04e+08 2.9 h 100.0 s 11.0% 26.5 h 23.6 h
    INFO 20:48:04,058 ProgressMeter - chr10_AACZ03166550_random:1274 1.05e+08 2.9 h 100.0 s 11.0% 26.6 h 23.6 h

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (100) > (99) STOP -- this should never happen -- call Mauricio!
        at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipByReferenceCoordinates(ReadClipper.java:537)
        at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipByReferenceCoordinatesRightTail(ReadClipper.java:193)
        at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipAdaptorSequence(ReadClipper.java:389)
        at org.broadinstitute.sting.utils.clipping.ReadClipper.hardClipAdaptorSequence(ReadClipper.java:392)
        at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:244)
        at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:131)
        at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsMap.apply(TraverseReadsNano.java:230)
        at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano$TraverseReadsMap.apply(TraverseReadsNano.java:218)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:274)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
        at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:102)
        at org.broadinstitute.sting.gatk.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:56)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:109)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.4-9-g532efad):
    ERROR
    ERROR Please visit the wiki to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: START (100) > (99) STOP -- this should never happen -- call Mauricio!
    ERROR ------------------------------------------------------------------------------------------

    INFO 20:48:45,045 HelpFormatter - --------------------------------------------------------------------------------

    Let me know if you need any data to debug.

    Isaac

  • Geraldine_VdAuwera Posts: 5,973 Administrator, GATK Developer

    Hi Isaac,

    Yes, it would help if you could upload a snippet of your file to our FTP server. Please see the article I linked to above for full instructions.

    Geraldine Van der Auwera, PhD

  • brdido Posts: 23 Member

    Geraldine and Mauricio, I've uploaded the files to your FTP server: RLB/bugReportBrdido4725.tar.gz

    If anything else is needed please let me know. Thanks!

  • brdido Posts: 23 Member

    Thanks @Carneiro, the problem is before IndelRealigner. Cheers.

  • Mutagenic Posts: 6 Member
    edited April 2013

    I wanted to report that I had the same error. I used bwasw. The issue was corrected after including the flag -rf BadCigar as suggested above.

    bwa version: 0.7.3a

    gatk version: 2.4-9-g532efad

  • flescai Posts: 51 Member ✭✭

    Same identical error here.

    org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (100) > (99) STOP -- this should never happen -- call Mauricio!

    I'll try to upload my file tomorrow. Thanks for all your help guys!

  • Geraldine_VdAuwera Posts: 5,973 Administrator, GATK Developer

    I don't think we need a file for this -- version 2.5 should now catch this issue cleanly. Can you please upgrade your gatk version and run again?

    Geraldine Van der Auwera, PhD

  • flescai Posts: 51 Member ✭✭

    Hi @Geraldine_VdAuwera

    Unfortunately it's happening again. I downloaded the source from git.

    The version is 2.5-2-gf57256b and the error looks very similar (except for the call to Mauricio :-P)

    ##### ERROR stack trace 
    org.broadinstitute.sting.utils.exceptions.ReviewedStingException: START (100) > (99) STOP -- this should never happen, please check read: FCD1R7BACXX:2:2101:19788:52999#ATGAACCT 1/2 100b aligned read. (CIGAR: 94M4I2M3D)
  • Geraldine_VdAuwera Posts: 5,973 Administrator, GATK Developer

    Ah, it's an issue with the cigar -- the read ends in deletions. This is undesirable -- we've seen this in data output by BWA-mem. More recent versions of BWA should no longer do that. If you don't want to have to realign your data, use the bad cigar filter (add -rf BadCigar to your command).
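
To see which reads are affected before deciding whether to filter or realign, a small script can flag CIGAR strings that begin or end with a deletion, such as the 94M4I2M3D in the error above or the 24D101M read posted earlier in the thread. This is only a sketch in Python: the function names are made up for illustration, it is not the GATK BadCigar implementation, and ignoring soft/hard clips at the ends is my own assumption.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM format.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '94M4I2M3D' into (length, op) pairs."""
    if cigar == "*":  # SAM uses '*' when no CIGAR is available
        return []
    ops = [(int(length), op) for length, op in _CIGAR_OP.findall(cigar)]
    # Reject strings with anything left over between the matched elements.
    if "".join(f"{length}{op}" for length, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def has_terminal_deletion(cigar):
    """True if the first or last operator, ignoring clips, is a deletion."""
    ops = [op for _, op in parse_cigar(cigar) if op not in "SH"]
    return bool(ops) and (ops[0] == "D" or ops[-1] == "D")


if __name__ == "__main__":
    for cigar in ("94M4I2M3D", "24D101M", "101M"):
        print(cigar, has_terminal_deletion(cigar))
```

The CIGAR is the sixth column of a SAM record, so this can be applied to the text output of a SAM viewer to count how many reads would be lost to the filter.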

    Geraldine Van der Auwera, PhD

  • flescai Posts: 51 Member ✭✭

    Thanks @Geraldine_VdAuwera, it's strange because I'm using the very latest BWA version. Maybe I should report the issue there as well.

    I imagine I can add "-rf BadCigar" as an option in my Scala script.

    Would you be so kind as to clarify what the equivalent is for Queue? It might be useful for others as well.

    thank you so much!

  • Carneiro Posts: 275 Administrator, GATK Developer

    flescai, read filters ("-rf") are engine arguments, and are therefore available to every walker, in Queue or on the command line. You can add it to your Queue script directly.
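
On the plain command-line side (not Queue), here is a rough Python sketch of assembling a BaseRecalibrator call with the read filter appended. The GATK arguments (-T, -I, -R, --knownSites, -o, -rf) are the ones that appear in this thread; the jar path `GenomeAnalysisTK.jar`, the function name and the file names are placeholders I am assuming for illustration.

```python
import shlex


def base_recalibrator_args(bam, reference, known_sites, output,
                           read_filters=("BadCigar",),
                           jar="GenomeAnalysisTK.jar"):  # assumed jar path
    """Build the argument list for a GATK 2.x BaseRecalibrator run."""
    args = [
        "java", "-jar", jar,
        "-T", "BaseRecalibrator",
        "-I", bam,
        "-R", reference,
        "--knownSites", known_sites,
        "-o", output,
    ]
    # Read filters are engine-level, so they are appended the same way
    # whichever walker is named after -T.
    for name in read_filters:
        args += ["-rf", name]
    return args


if __name__ == "__main__":
    # Placeholder file names, not the poster's actual paths.
    cmd = base_recalibrator_args("sample.realigned.bam", "reference.fasta",
                                 "dbsnp.vcf", "sample.grp")
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Keeping the filters in a separate parameter makes it easy to run once with and once without BadCigar and compare how many reads were discarded.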

  • cobalt137cc Posts: 1 Member

    I'm getting the same error from alignments with bwa-0.7.4-r385. It seems to be solved with the -rf BadCigar argument.

  • Carneiro Posts: 275 Administrator, GATK Developer

    Note that the BadCigar filter doesn't "solve" anything. It just discards the reads with malformed (unsupported) CIGARs from your data. Does your CIGAR also have deletions at the ends? If so, you should report this issue to BWA.

  • drchriscole Posts: 16 Member

    I had this problem when running bwa aln 0.7.3a, but can confirm that updating bwa to 0.7.5a has fixed this for me.

  • Geraldine_VdAuwera Posts: 5,973 Administrator, GATK Developer

    Great, thanks for confirming, @drchriscole.

    Geraldine Van der Auwera, PhD
