BaseRecalibration: Recalibration table is empty

bcantarel, Baylor Health (Posts: 3, Member)

When running BaseRecalibrator in GATK, I am getting "empty" results (command and error below). I didn't see a solution to this when searching.

java -Xmx4g -jar /seqprg/GenomeAnalysisTK-2.4-3-g2a7af43/GenomeAnalysisTK.jar -l INFO -R /Users/bcantarel/projects/refdb/human_g1k_v37.fasta --knownSites /Users/bcantarel/projects/refdb/00-All.vcf -I Sample_cDNA405.bam -T BaseRecalibrator -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o Sample_cDNA405.grp

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalStateException: recalibration tables list is empty
at org.broadinstitute.sting.gatk.walkers.bqsr.RecalibrationEngine.mergeThreadLocalRecalibrationTables(RecalibrationEngine.java:209)
at org.broadinstitute.sting.gatk.walkers.bqsr.RecalibrationEngine.finalizeData(RecalibrationEngine.java:175)
at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.onTraversalDone(BaseRecalibrator.java:508)
at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.onTraversalDone(BaseRecalibrator.java:131)
at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:123)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.4-3-g2a7af43):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: recalibration tables list is empty
ERROR ------------------------------------------------------------------------------------------

Answers

  • bcantarel, Baylor Health (Posts: 3, Member)

    Hmm, OK, thanks for the hint. Maybe the error message should say "Low alignment rate" or "Too few reads aligning" or something like that... Thanks again!

  • Geraldine_VdAuwera (Posts: 5,832; Administrator, GATK Developer)

    OK, I'll see if we can make the error message clearer.

    Geraldine Van der Auwera, PhD

  • bcantarel, Baylor Health (Posts: 3, Member)

    On a side note: the problem came from a bug in BWA (i.e., it produced SAM files without mapped reads). So if anyone was unlucky enough to download BWA 0.7.0, well, it was only up for one week, so the issue is likely fixed in the new version.
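
A quick way to spot an aligner output with no mapped reads is to count records by the "unmapped" bit of the SAM FLAG field. The following is only a minimal sketch: the function name `count_mapped` is made up for illustration, and it assumes the input is tab-separated SAM text (for example, lines produced by `samtools view`), not a binary BAM file.

```python
from typing import Iterable, Tuple

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def count_mapped(sam_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (mapped, unmapped) record counts for SAM-formatted text lines.

    Hypothetical helper for illustration; it expects tab-separated SAM text.
    """
    mapped = unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        flag = int(fields[1])  # second column is the bitwise FLAG
        if flag & UNMAPPED_FLAG:
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped
```

If the first number comes back as zero, the problem is upstream of GATK and the alignment step is the place to look.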

  • Geraldine_VdAuwera (Posts: 5,832; Administrator, GATK Developer)

    Ah, the aligner taking a vacation would indeed be a problem! Thanks for letting us know what the real problem was. It's very useful for us to hear about the range of issues people have that can lead to common symptoms like this.

    Geraldine Van der Auwera, PhD

  • flescai (Posts: 51, Member)

    I am running into the same problem, and yes I'm using exome data here.

    I was using a previous version of BWA (0.6.2) and updated to the most recent one (0.7.3a) but I still have the same problem.

    It happens in intervals such as
    GL000200.1 1 187035 + interval_80

    The command that actually produces this error is

    'java' '-Xmx4096m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/home/me/tests/gatk/.queue/tmp' '-cp' '/home/me/tools/gatk-protected/dist/Queue.jar' 'org.broadinstitute.sting.gatk.CommandLineGATK' '-T' 'BaseRecalibrator' '-I' '/home/me/tests/gatk/test04.FOSZAW_F2_1.fq.gz.clean.dedup.bam' '-L' '/home/me/tests/gatk/.queue/scatterGather/.qlog/test04.FOSZAW_F2_1.fq.gz.pre_recal.table.covariates-sg/temp_80_of_84/scatter.intervals' '-R' '/home/me/resources/GATKbundle/2.3/b37/human_g1k_v37.fasta' '-DIQ' '-knownSites' '/home/me/resources/GATKbundle/2.3/b37/dbsnp_137.b37.vcf' '-o' '/home/me/tests/gatk/.queue/scatterGather/.qlog/test04.FOSZAW_F2_1.fq.gz.pre_recal.table.covariates-sg/temp_80_of_84/test04.FOSZAW_F2_1.fq.gz.pre_recal.table' '-cov' 'ReadGroupCovariate' '-cov' 'QualityScoreCovariate' '-cov' 'CycleCovariate' '-cov' 'ContextCovariate' '-dP' 'Illumina'

    I therefore checked what is in the BAM file for that interval, and this is the output:

    samtools view /home/me/tests/gatk/test04.FOSZAW_F2_1.fq.gz.clean.dedup.bam GL000200.1:1-187035
    FCD03KHACXX:7:1101:5083:90023#GTTGCAAC 163 GL000200.1 20411 0 90M = 20588 267 CATAGGAAATAGTTACCAAGAAATGCAGCAGCTAAACTTGGAAGGAAAGAACTATTGCACAGCCAAAACATTGTACATATCTGATTTAGA GGGEDFBDFDFDEFDI;GGGBGEG=D;@;DGBGGEFHBFFBDBF8D:AD?:<>=AGBGEGC8@DDE=EEE?BD<DEGD=DB@EBAD<DDC X0:i:5 X1:i:0 MD:Z:13G76 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:1 SM:i:0 XM:i:1 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:5083:90023#GTTGCAAC 83 GL000200.1 20588 0 90M = 20411 -267 TTCCAAAAAGAAGCAGTCATTGAAAAATGCTGACTTATGCATTGCCTCAGGAAAAAAGGTGGCTCTGTTTAATCGACTACTATCCCAGAC ?E@FCFEFEEB@EDEFD?BFEEFGGFFCDFFBAECDFD>EFEGGGBCAB7DGBGGFFCFCGGFFEGEGGGFFFFBDEE6EFFF@FGGGFG X0:i:4 X1:i:1 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:2018:151662#GTTGCAAC 163 GL000200.1 20636 0 90M = 20755 209 AGGAAAAAAGGTGGCTCTGTTTAATCGACTACTATCCCAGACAGTTAGTACCAGATACTTGCACGTAGAAAGAGGTAATTTTCATGCTAG HHHHHHHHHGHFHHHHHHGHHHHHHHHHHHEHHHHHHHHHBEGHDHHHFHHHHHFHCHHGHFHHHFGGFGFGG@FCE5=EGFGEIHHFHE X0:i:4 X1:i:1 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:2018:151662#GTTGCAAC 83 GL000200.1 20755 0 90M = 20636 -209 TTCTTGGATGATGATGGATCAGAAGGAGAAGAATTCACAGTCTGAGATGGCTACATTCATTATGGACAAACAGTCAAACTTGTGTGCTCA H4HHHFHHHHHHHHHHHHHFHHHFHHGHFEHFHFFHHHHHHHHHHHHGHHHFHHHEBHHFHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH X0:i:5 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:9920:195302#GTTGCAAC 65 GL000200.1 29250 0 90M 6 49937260 0 AGACCAGAATTGCACCCATCAAATGCCTCACTCACCATATGTCAGCCCAGAAGACTCTTGCAGTGGTGAGCCAGTCTCTTTATCCACCAA HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHFHHHHHHCHHHHHFFHHHHHHFHHHHHHFHFEFHHEHHHEFFE;GIFFFHBEHHBFHHD X0:i:3 X1:i:3 XA:Z:9,+44481078,90M,0;9,+43322151,90M,0;9,-41877008,90M,1;9,+46921634,90M,1;9,+65982263,90M,1; MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 XT:A:R
    FCD03KHACXX:7:1101:18445:85126#GTTGCAAC 99 GL000200.1 64231 0 90M = 64292 151 CACTACGCCCGGCTAATTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCGCTTTAGCCGGGATGGTCTCGATCTCCTGACCTCGTGATC EEEEEECDCECEEE@EDDDD96,60@@A><CA5DD2.)7/+;-(+,8<86=/=?=?)+:+*'1-9>197B<@3A/@@@8;=@######## X0:i:1 X1:i:1377 MD:Z:51T38 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:1 SM:i:0 XM:i:1 XO:i:0 MQ:i:0 XT:A:U
    FCD03KHACXX:7:1101:18445:85126#GTTGCAAC 147 GL000200.1 64292 0 90M = 64231 -151 GATGGTCGCGATCGCCTGACCTCGTGATCCGCCCGCCTCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGTCACCGCGCCCGGCCGAG #################################@ACDGDDB4A?DE?6:GA@@B;BFG;GCGCGGEB@E9B;;B7E=FF?GGEDEGBEGG X0:i:72 MD:Z:7T5T58C17 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:3 SM:i:0 XM:i:3 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:15528:62078#GTTGCAAC 99 GL000200.1 72388 0 90M = 72561 263 TAAGTTGTAATGTTTAATTCTTTGAATGTTTCAGTGGGAGCTAGAAATTGGTTTGATATACTTTTTAGTTCAGTTGGAATACTTAACACT HHHHHHHFHHHGHHHHGGHHGHHFGFHGHHHHFGHHHHHHHFFG?HFHFHHEEHHFFDFHEHFGHFDEEEEFFBFGHFHHHHHFGHHHH= X0:i:3 X1:i:0 XA:Z:9,+43365464,90M,0;9,+44524392,90M,0; MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:15528:62078#GTTGCAAC 147 GL000200.1 72561 0 90M = 72388 -263 AAAGAATTGAAAAAAAAAGTGACACAAATTGATATATCACGCAAACTATGTGGTTTTGTATTTTCAACTAATTGCTGAAGAGCACTTATA HGBHGGFGIGEHHGHHHGHHGHHFHHHHHHGHFHHFHHHHBHHHHHHHHHHHGHHGHHHHHHHFHGFHGHHHHEHHHHHHHHHHHHHHHH X0:i:3 X1:i:0 XA:Z:9,-43365637,90M,0;9,-44524565,90M,0; MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:3111:80943#GTTGCAAC 163 GL000200.1 107292 0 90M = 107352 150 TTGCTATTGACACAATCATTAACCAGAAATGTTTCAATGATGGATCTGATGAAAAGAAGAAGCTGTACTGTGTCTATGTTGTTATTGGTC HEHHHHHHHHHGHHHGHHHHHHHHHHHHHHHCHHGHHHHHHHHHHHHHHHHHHHHHGHHGHHHHHHGHHHDFDGGFFFFBGEGFGBDEEE X0:i:8 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:3111:80943#GTTGCAAC 83 GL000200.1 107352 0 90M = 107292 -150 AGCTGTACTGTGTCTATGTTGTTATTGGTCAAAAGAGATCCACTGTTGCCCAGTTGGTGAAGAGACTTACGATGCAGATGCCATGAATTA FFGAGHDHHHHHHFHFHHHHHFHHECFHHEHHHHHHHHHHEBEGFHH@DEHHHHHHHHHGHHHHBHHHHHHFFHHHHHHHHHHHHHHHHH X0:i:8 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:6010:24148#GTTGCAAC 99 GL000200.1 107805 0 90M = 107947 232 ATCTTCTTGGAAACAGAATTGTTCTACAAAGGTATCCACCCTGCCATTAATGTCGGTCTGTCTGTGTCTCGTGTCAGATCTGCTGCCCAA HHHGHGHHFHBHHHHHEHHHHHHHCDFFGDE?CFDFDFDFEEFHHHH@HHDG;AFA6D>C9A?<<44<9>D?B>DFDD6F?4EDEECFEC X0:i:6 X1:i:1 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1101:6010:24148#GTTGCAAC 147 GL000200.1 107947 0 90M = 107805 -232 ATCATGAGGTCACCACTTTTGCCCAGTTCAGTTCTGACCTCGATGCTGCCACTCAACAACTTTTGAGTTGTGGTGTGTGTCTAACTGAGT DFFG8GEGGF@@@@<6/8DDHDHEGIGGGG@EEEEF<FBFHHBGHHGHGEFBGGDGFGEGHHHEHHHGCHGHHFHHHHHHHFGFGFHHHF X0:i:7 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1102:10841:11769#GTTGCAAC 99 GL000200.1 134050 0 90M = 134258 298 TTGATGACCTCCCCTTTTCCCAGGTCAAAGGAGAATTTGTCCTTGCGATCCACACTGGAGTCAAACTTTGTGCCCTCTAACAGCCAGCCA HHGHHHHHGHHHHGHHHHGHHGHHHHGHHHGHHHEHHHHHHHHHHHHFBHHFFHFEGEFGHFHHEHBDEFCEGEEHH@GHGHHFFCHHHE X0:i:4 X1:i:4 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1102:10841:11769#GTTGCAAC 147 GL000200.1 134258 0 90M = 134050 -298 GCAGCGGCACCGGCTGCGCCGCACTCTCGGTCGCCTTCATCTCCTTGGCTGTCATCTCTGCGTGGCGCGAAATTTTTCCGGGAGATGGCG ;38<4F?FCFD7DD=DCA5D<C<<@E?GHFHHGHGGFGGGG;GGFGDDGAGFGIEECEECGFGGCG>HHHHHGEHFHHHGHGHHHHHHHH X0:i:4 X1:i:3 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1102:6709:6487#GTTGCAAC 99 GL000200.1 173813 0 90M = 173961 238 TGGCACCCTGCAAATAAACACCTCTTTTCTCCTGCTGCAAACCTTGGTGTGGGTGTTTGGCCTGACTGCGCTGGGCAGGCAGACCCAGCT FGGGE?GGBGGGGGGGGGGGGCGFGGGGGGGGGGFFCGGBGFGGGGGEGF>FFCFGFFGFEEGGEFGG?BFDDD3EE@6EEFFFEGDFGA X0:i:6 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
    FCD03KHACXX:7:1102:6709:6487#GTTGCAAC 147 GL000200.1 173961 0 90M = 173813 -238 TATAAATTCCAGGCTGGGCAGAGTGGCTCACACCTGTAATCCTAGCACTTTGGGAGGCCGAAGCTGGTGGATCACCTGAGGTCAGGAAGT @DA8DEEEG=DCGGGDDC88FGGFBEGGGHG:GGGEEDDBFFGGGBAGGG7HHGHHFHGFGEG@FGGGDEHHFEEHHGHHHDDFHHHHHH X0:i:6 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R

    Is there any way to avoid this error?
    Maybe by not processing the extra-chromosomal regions?

    Thanks for any help you might offer!

    Francesco
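
Every record in the listing above has 0 in the fifth column (MAPQ). One possible explanation, offered here as an assumption and not something confirmed in this thread, is that reads with mapping quality zero are discarded before recalibration, which would leave nothing to tabulate for this interval. A minimal sketch for measuring how many records in a region have MAPQ 0 follows; `mapq_zero_counts` is a hypothetical helper and it expects tab-separated SAM text.

```python
from typing import Iterable, Tuple


def mapq_zero_counts(sam_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (records with MAPQ == 0, total records) for SAM text lines.

    Hypothetical helper for illustration; header and blank lines are ignored.
    """
    zero = total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        total += 1
        if int(fields[4]) == 0:  # fifth column is the mapping quality
            zero += 1
    return zero, total
```

When both numbers are equal for an interval, that interval contributes no uniquely placed reads, which is typical of repetitive unplaced contigs.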

  • Geraldine_VdAuwera (Posts: 5,832; Administrator, GATK Developer)

    Hi Francesco,

    Are you passing the intervals list of capture targets? That is recommended if you're working with exome data. And if you find that this problem occurs specifically with non-chromosome contigs, you can simply leave those contigs out of your intervals list.

    Geraldine Van der Auwera, PhD
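
The suggestion to leave non-chromosome contigs out of the intervals list could be scripted along these lines. This is a sketch under stated assumptions: the interval lines are whitespace-separated with the contig name first (as in the `GL000200.1 1 187035 + interval_80` example earlier in the thread), the reference uses b37-style names (1 to 22, X, Y, MT), and `keep_primary_intervals` is a hypothetical helper name.

```python
from typing import Iterable, List, Set

# Assumption: b37-style names for the primary contigs.
PRIMARY_CONTIGS: Set[str] = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def keep_primary_intervals(
    lines: Iterable[str], contigs: Set[str] = PRIMARY_CONTIGS
) -> List[str]:
    """Keep header lines and intervals whose contig is in `contigs`."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # drop blank lines
        if stripped.startswith("@"):
            kept.append(line)  # header lines pass through unchanged
            continue
        contig = stripped.split()[0]  # first column is the contig name
        if contig in contigs:
            kept.append(line)
    return kept
```

For an hg19-style reference the `contigs` argument would need `chr`-prefixed names instead; the default set above would drop everything.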

  • flescai (Posts: 51, Member)

    Thanks Geraldine, that actually solved the problem :-)
    I was only passing the target intervals during the calling process, and not during the data processing pipeline.

    It completely makes sense, although I might lose just a bit of information in the recalibrated BAM file.
    thanks for your help!

  • MinQiao (Posts: 3, Member)

    Geraldine, I am running into a similar issue, except that the recalibrator seems to run with normal output, yet my table always remains at zero bytes. My command is

    java -Xmx5000m -jar GenomeAnalysisTK.jar -T BaseRecalibrator -I LS1.clean.dedup.bam -R ucsc.hg19.fasta -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -knownSites dbsnp_137.hg19.vcf -knownSites hapmap_3.3.hg19.vcf -knownSites 1000G_omni2.5.hg19.vcf -knownSites Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites 1000G_phase1.indels.hg19.vcf -o LS1.pre_recal.table

    and the screen output seems right (below). However, nothing was written to the file "LS1.pre_recal.table", and no error message was reported. What could possibly be wrong with my alignment? I used bwa version 0.6.2, and I had already completed indel realignment and duplicate marking before I got here. Please help!

    `Date/Time: 2013/05/08 17:57:32
    INFO 17:57:32,951 HelpFormatter - --------------------------------------------------------------------------------
    INFO 17:57:32,951 HelpFormatter - --------------------------------------------------------------------------------
    INFO 17:57:32,960 ArgumentTypeDescriptor - Dynamically determined type of dbsnp_137.hg19.vcf to be VCF
    INFO 17:57:32,961 ArgumentTypeDescriptor - Dynamically determined type of hapmap_3.3.hg19.vcf to be VCF
    INFO 17:57:32,962 ArgumentTypeDescriptor - Dynamically determined type of 1000G_omni2.5.hg19.vcf to be VCF
    INFO 17:57:32,963 ArgumentTypeDescriptor - Dynamically determined type of Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO 17:57:32,964 ArgumentTypeDescriptor - Dynamically determined type of 1000G_phase1.indels.hg19.vcf to be VCF
    INFO 17:57:32,997 GenomeAnalysisEngine - Strictness is SILENT
    INFO 17:57:33,061 GenomeAnalysisEngine - Downsampling Settings: No downsampling
    INFO 17:57:33,065 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 17:57:33,076 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
    INFO 17:57:33,084 RMDTrackBuilder - Loading Tribble index from disk for file dbsnp_137.hg19.vcf
    INFO 17:57:33,171 RMDTrackBuilder - Loading Tribble index from disk for file hapmap_3.3.hg19.vcf
    INFO 17:57:33,190 RMDTrackBuilder - Loading Tribble index from disk for file 1000G_omni2.5.hg19.vcf
    INFO 17:57:33,204 RMDTrackBuilder - Loading Tribble index from disk for file Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO 17:57:33,222 RMDTrackBuilder - Loading Tribble index from disk for file 1000G_phase1.indels.hg19.vcf
    INFO 17:57:33,271 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
    INFO 17:57:33,274 GenomeAnalysisEngine - Done creating shard strategy
    INFO 17:57:33,274 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 17:57:33,274 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
    INFO 17:57:33,328 BaseRecalibrator - The covariates being used here:

    INFO 17:57:33,328 BaseRecalibrator - ReadGroupCovariate
    INFO 17:57:33,328 BaseRecalibrator - QualityScoreCovariate
    INFO 17:57:33,328 BaseRecalibrator - ContextCovariate
    INFO 17:57:33,328 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
    INFO 17:57:33,329 BaseRecalibrator - CycleCovariate
    INFO 17:57:33,330 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO 17:57:33,331 ReadShardBalancer$1 - Done loading BAM index data for next contig
    INFO 17:58:03,384 ProgressMeter - chr1:1457505 2.86e+05 30.0 s 105.0 s 0.0% 17.7 h 17.7 h
    INFO 17:59:03,468 ProgressMeter - chr1:5271635 1.19e+06 90.0 s 76.0 s 0.2% 14.8 h 14.8 h
    INFO 18:00:03,469 ProgressMeter - chr1:8794195 2.19e+06 2.5 m 68.0 s 0.3% 14.8 h 14.8 h
    INFO 18:01:04,022 ProgressMeter - chr1:12483444 3.09e+06 3.5 m 68.0 s 0.4% 14.6 h 14.6 h
    INFO 18:02:04,031 ProgressMeter - chr1:16261146 3.89e+06 4.5 m 69.0 s 0.5% 14.5 h 14.4 h
    ....... .........

    `

  • Geraldine_VdAuwera (Posts: 5,832; Administrator, GATK Developer)

    Hi @MinQiao,

    What does the output say at the end of the run? Maybe a lot of reads are getting filtered out.

    Geraldine Van der Auwera, PhD

  • MinQiao (Posts: 3, Member)

    @Geraldine_VdAuwera

    Thank you so much for helping me out. I cannot reach the cluster for now, but I am sure the last line of output was similar to the rest of them: just a progress report, with no warning or error message. By what criteria will some reads (in my case, seemingly all reads) be filtered out? Is it an indication of very bad data quality or of completely wrong mapping? Thank you!

  • Geraldine_VdAuwera (Posts: 5,832; Administrator, GATK Developer)

    There are many different things that can cause reads to be filtered out, e.g. if they are badly mapped or badly formatted. At the end of a run the GATK always prints out a summary of the filtering results; it says how many reads were seen in total, how many filtered out, and the breakdown of numbers between the different reasons why they were filtered. That can be very helpful to understand what quality issues may affect your data.

    Geraldine Van der Auwera, PhD

  • MinQiao (Posts: 3, Member)

    Thank you @Geraldine_VdAuwera for the information about the expected output! I just double-checked mine: the last line is not the summary but a progress meter line at chromosome 22. I will find another computer to give it another go. Sorry for having asked a probably dumb question.

  • Geraldine_VdAuwera (Posts: 5,832; Administrator, GATK Developer)

    Ah, it sounds like your run failed without exiting cleanly. If you try again, hopefully it will complete normally.

    Don't worry, there are no dumb questions.

    Geraldine Van der Auwera, PhD
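
Whether a run exited cleanly can be checked mechanically from the saved console output. The sketch below looks for two marker strings copied from a completed BaseRecalibrator log posted later in this thread (`ProgressMeter - Total runtime` and `BaseRecalibrator - Processed: N reads`); treating them as completion markers is an assumption, other GATK versions may word these lines differently, and `summarize_log` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Marker text copied from a completed log in this thread (an assumption
# about what a clean finish looks like, not an official contract).
_PROCESSED = re.compile(r"BaseRecalibrator - Processed: (\d+) reads")
_FINISHED_MARKER = "ProgressMeter - Total runtime"


def summarize_log(log_text: str) -> Tuple[bool, Optional[int]]:
    """Return (finished, processed_reads) for a BaseRecalibrator console log.

    `processed_reads` is None when no "Processed: N reads" line is present.
    """
    finished = _FINISHED_MARKER in log_text
    match = _PROCESSED.search(log_text)
    processed = int(match.group(1)) if match else None
    return finished, processed
```

A result of `(False, None)` matches the situation described above, where the log simply stops at a progress line; `(True, 0)` points to a run that finished but saw no usable reads.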

  • sir2013 (Posts: 17, Member)

    When running GATK tools I am getting the same error. I have read through the forum but could not find a solution to my problem.

    This is my script:

    java -Xmx4g -jar /home/sir2013/GATK/GenomeAnalysisTK.jar -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -o recal_data.grp

    This is my error as well:
    INFO 11:41:10,566 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.IllegalStateException: recalibration tables list is empty
    at org.broadinstitute.sting.gatk.walkers.bqsr.RecalibrationEngine.mergeThreadLocalRecalibrationTables(RecalibrationEngine.java:209)
    at org.broadinstitute.sting.gatk.walkers.bqsr.RecalibrationEngine.finalizeData(RecalibrationEngine.java:175)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.onTraversalDone(BaseRecalibrator.java:508)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.onTraversalDone(BaseRecalibrator.java:131)
    at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:123)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.4-9-g532efad):
    ERROR
    ERROR Please visit the wiki to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: recalibration tables list is empty
    ERROR ------------------------------------------------------------------------------------------

    I have used RealignerTargetCreator and IndelRealigner without any issues and have gotten the correct output I need. But for some reason I am getting this error at the BaseRecalibrator step. Could someone please help me troubleshoot this?

    Thanks,
    Sinan

  • Geraldine_VdAuwera (Posts: 5,832; Administrator, GATK Developer)

    Hi @Sinan, can you please try again with the latest version of GATK?

    Geraldine Van der Auwera, PhD

  • sir2013 (Posts: 17, Member)
    edited May 2013

    Hello again. For some reason, when running BaseRecalibrator I am getting zero processed reads, which is quite interesting. Do you have any idea as to why this is occurring? I do get an output file, but it contains no recalibration information.

    Here is my command:
    java -Xmx8g -jar /home/sir2013/GATK/GenomeAnalysisTK.jar -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf --validation_strictness STRICT -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o recal_data.grp


    running screen:
    INFO 15:10:40,075 HelpFormatter - --------------------------------------------------------------------------------
    INFO 15:10:40,077 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
    INFO 15:10:40,077 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 15:10:40,077 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 15:10:40,082 HelpFormatter - Program Args: -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o recal_data.grp
    INFO 15:10:40,082 HelpFormatter - Date/Time: 2013/05/14 15:10:40
    INFO 15:10:40,082 HelpFormatter - --------------------------------------------------------------------------------
    INFO 15:10:40,082 HelpFormatter - --------------------------------------------------------------------------------
    INFO 15:10:40,104 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf to be VCF
    INFO 15:10:40,115 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO 15:10:40,127 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf to be VCF
    INFO 15:10:42,750 GenomeAnalysisEngine - Strictness is SILENT
    INFO 15:10:43,025 GenomeAnalysisEngine - Downsampling Settings: No downsampling
    INFO 15:10:43,033 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 15:10:43,051 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
    INFO 15:10:43,093 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf
    INFO 15:10:43,407 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO 15:10:44,262 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf
    INFO 15:10:44,588 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
    INFO 15:10:44,599 GenomeAnalysisEngine - Done creating shard strategy
    INFO 15:10:44,600 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 15:10:44,600 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
    INFO 15:10:44,730 BaseRecalibrator - The covariates being used here:
    INFO 15:10:44,731 BaseRecalibrator - ReadGroupCovariate
    INFO 15:10:44,731 BaseRecalibrator - QualityScoreCovariate
    INFO 15:10:44,732 BaseRecalibrator - ContextCovariate
    INFO 15:10:44,732 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
    INFO 15:10:44,733 BaseRecalibrator - CycleCovariate
    INFO 15:10:44,738 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO 15:10:44,741 ReadShardBalancer$1 - Done loading BAM index data for next contig
    INFO 15:11:14,658 ProgressMeter - Starting 0.00e+00 30.0 s 49.7 w 100.0% 30.0 s 0.0 s
    INFO 15:11:44,662 ProgressMeter - Starting 0.00e+00 60.0 s 99.3 w 100.0% 60.0 s 0.0 s
    INFO 15:12:15,185 ProgressMeter - Starting 0.00e+00 90.0 s 149.8 w 100.0% 90.0 s 0.0 s
    INFO 15:12:45,186 ProgressMeter - Starting 0.00e+00 120.0 s 199.4 w 100.0% 120.0 s 0.0 s
    INFO 15:13:15,189 ProgressMeter - Starting 0.00e+00 2.5 m 249.0 w 100.0% 2.5 m 0.0 s
    INFO 15:13:45,191 ProgressMeter - Starting 0.00e+00 3.0 m 298.6 w 100.0% 3.0 m 0.0 s
    INFO 15:14:15,193 ProgressMeter - Starting 0.00e+00 3.5 m 348.2 w 100.0% 3.5 m 0.0 s
    INFO 15:14:44,687 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO 15:14:44,692 BaseRecalibrator - Calculating quantized quality scores...
    INFO 15:14:45,195 ProgressMeter - Starting 0.00e+00 4.0 m 397.8 w 100.0% 4.0 m 0.0 s
    INFO 15:14:45,576 BaseRecalibrator - Writing recalibration report...
    INFO 15:14:46,197 BaseRecalibrator - ...done!
    **INFO 15:14:46,200 BaseRecalibrator - Processed: 0 reads**
    INFO 15:14:46,209 ProgressMeter - done 0.00e+00 4.0 m 399.5 w 100.0% 4.0 m 0.0 s
    INFO 15:14:46,216 ProgressMeter - Total runtime 241.62 secs, 4.03 min, 0.07 hours
    INFO 15:14:47,683 GATKRunReport - Uploaded run statistics report to AWS S3

    and the output file contains:

    :GATKReport.v1.1:5

    :GATKTable:2:18:%s:%s:;

    :GATKTable:Arguments:Recalibration argument collection values used in this run

    Argument Value
    binary_tag_name null
    covariate ReadGroupCovariate,QualityScoreCovariate,ContextCovariate,CycleCovariate
    default_platform null
    deletions_default_quality 45
    force_platform null
    indels_context_size 3
    insertions_default_quality 45
    low_quality_tail 2
    maximum_cycle_value 500
    mismatches_context_size 2
    mismatches_default_quality -1
    no_standard_covs false
    plot_pdf_file null
    quantizing_levels 16
    recalibration_report null
    run_without_dbsnp false
    solid_nocall_strategy THROW_EXCEPTION
    solid_recal_mode SET_Q_ZERO

    :GATKTable:3:94:%s:%s:%s:;

    :GATKTable:Quantized:Quality quantization map

    QualityScore Count QuantizedScore
    0 0 93
    1 0 93
    2 0 93
    3 0 93
    4 0 93
    5 0 93
    6 0 93
    7 0 93
    8 0 93
    9 0 93
    10 0 93
    11 0 93
    12 0 93
    13 0 93
    14 0 93
    15 0 93
    16 0 93
    17 0 93
    18 0 93
    19 0 93
    20 0 93
    21 0 93
    22 0 93
    23 0 93
    24 0 93
    25 0 93
    26 0 93
    27 0 93
    28 0 93
    29 0 93
    30 0 93
    31 0 93
    32 0 93
    33 0 93
    34 0 93
    35 0 93
    36 0 93
    37 0 93
    38 0 93
    39 0 93
    40 0 93
    41 0 93
    42 0 93
    43 0 93
    44 0 93
    45 0 93
    46 0 93
    47 0 93
    48 0 93
    49 0 93
    50 0 93
    51 0 93
    52 0 93
    53 0 93
    54 0 93
    55 0 93
    56 0 93
    57 0 93
    58 0 93
    59 0 93
    60 0 93
    61 0 93
    62 0 93
    63 0 93
    64 0 93
    65 0 93
    66 0 93
    67 0 93
    68 0 93
    69 0 93
    70 0 93
    71 0 93
    72 0 93
    73 0 93
    74 0 93
    75 0 93
    76 0 93
    77 0 93
    78 0 93
    79 0 79
    80 0 80
    81 0 81
    82 0 82
    83 0 83
    84 0 84
    85 0 85
    86 0 86
    87 0 87
    88 0 88
    89 0 89
    90 0 90
    91 0 91
    92 0 92
    93 0 93

    :GATKTable:6:0:%s:%s:%.4f:%.4f:%d:%.2f:;

    :GATKTable:RecalTable0:

    ReadGroup EventType EmpiricalQuality EstimatedQReported Observations Errors

    :GATKTable:6:0:%s:%s:%s:%.4f:%d:%.2f:;

    :GATKTable:RecalTable1:

    ReadGroup QualityScore EventType EmpiricalQuality Observations Errors

    :GATKTable:8:0:%s:%s:%s:%s:%s:%.4f:%d:%.2f:;

    :GATKTable:RecalTable2:

    ReadGroup QualityScore CovariateValue CovariateName EventType EmpiricalQuality Observations Errors

    I do apologize for the long post in the forum. I just don't understand why no errors are reported and yet no recalibration data is produced.

    Thanks,
    Sinan
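
The empty tables in a report like the one above can be detected without reading it by eye. In the pasted report, the second number in each `:GATKTable:<columns>:<rows>:...` line appears to be the row count (18 for Arguments, 94 for Quantized, 0 for the RecalTables); that reading is inferred from this report, not taken from a format specification. The sketch below maps each table name to that count; `table_row_counts` is a hypothetical helper, and the leading `#` is treated as optional because the pasted text lacks it.

```python
import re
from typing import Dict, Optional

# Header pair as it appears in the pasted report, e.g.
#   :GATKTable:6:0:%s:%s:%.4f:%.4f:%d:%.2f:;
#   :GATKTable:RecalTable0:
_DIMS = re.compile(r"^#?:GATKTable:(\d+):(\d+):")
_NAME = re.compile(r"^#?:GATKTable:([A-Za-z]\w*):")


def table_row_counts(report_text: str) -> Dict[str, int]:
    """Map each table name in a GATK report to its declared row count."""
    counts: Dict[str, int] = {}
    pending_rows: Optional[int] = None
    for raw in report_text.splitlines():
        line = raw.strip()
        dims = _DIMS.match(line)
        if dims:
            pending_rows = int(dims.group(2))  # second number: row count
            continue
        name = _NAME.match(line)
        if name and pending_rows is not None:
            counts[name.group(1)] = pending_rows
            pending_rows = None
    return counts
```

Any `RecalTable` entry with a count of zero means no observations were recorded, which is consistent with the `Processed: 0 reads` line in the log.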

    Post edited by Mark_DePristo on
  • Carneiro (Posts: 274; Administrator, GATK Developer)

    Can you check whether your BAM file has any reads? It sounds silly, but it could be something as simple as that.

    Also, you don't need to specify the -cov parameters. Those are the default covariates, and if you specify them like that, I am afraid it may be confusing the tool. Can you remove those parameters and check if it works? (I'll issue a bug report if that's the case.)

  • sir2013 (Posts: 17, Member)

    I know the BAM file is not empty because for the IndelRealigner step I had to have the quality scores fixed using -fixMisencodedQuals, and I cross-checked against the original BAM file to confirm that the scores were actually adjusted accordingly. Unfortunately, I got the same output after removing the -cov parameters.

    Command Line Code:

    java -Xmx8g -jar /home/sir2013/GATK/GenomeAnalysisTK.jar -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -o recal_data.grp

    Running Script Output:

    INFO 11:20:21,953 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:20:21,964 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
    INFO 11:20:21,965 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 11:20:21,965 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 11:20:21,978 HelpFormatter - Program Args: -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -o recal_data.grp
    INFO 11:20:21,979 HelpFormatter - Date/Time: 2013/05/16 11:20:21
    INFO 11:20:21,980 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:20:21,980 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:20:22,072 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf to be VCF
    INFO 11:20:22,090 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO 11:20:22,115 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf to be VCF
    INFO 11:20:23,555 GenomeAnalysisEngine - Strictness is SILENT
    INFO 11:20:23,845 GenomeAnalysisEngine - Downsampling Settings: No downsampling
    INFO 11:20:23,852 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 11:20:23,899 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04
    INFO 11:20:23,947 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf
    INFO 11:20:24,363 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO 11:20:24,653 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf
    INFO 11:20:26,702 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
    INFO 11:20:26,721 GenomeAnalysisEngine - Done creating shard strategy
    INFO 11:20:26,722 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 11:20:26,723 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
    INFO 11:20:27,050 BaseRecalibrator - The covariates being used here:
    INFO 11:20:27,051 BaseRecalibrator - ReadGroupCovariate
    INFO 11:20:27,052 BaseRecalibrator - QualityScoreCovariate
    INFO 11:20:27,052 BaseRecalibrator - ContextCovariate
    INFO 11:20:27,053 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
    INFO 11:20:27,054 BaseRecalibrator - CycleCovariate
    INFO 11:20:27,064 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO 11:20:27,068 ReadShardBalancer$1 - Done loading BAM index data for next contig
    INFO 11:20:56,840 ProgressMeter - Starting 0.00e+00 30.0 s 49.8 w 100.0% 30.0 s 0.0 s
    INFO 11:21:26,842 ProgressMeter - Starting 0.00e+00 60.0 s 99.4 w 100.0% 60.0 s 0.0 s
    INFO 11:21:56,844 ProgressMeter - Starting 0.00e+00 90.0 s 149.0 w 100.0% 90.0 s 0.0 s
    INFO 11:22:26,846 ProgressMeter - Starting 0.00e+00 120.0 s 198.6 w 100.0% 120.0 s 0.0 s
    INFO 11:22:49,994 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO 11:22:49,997 BaseRecalibrator - Calculating quantized quality scores...
    INFO 11:22:50,101 BaseRecalibrator - Writing recalibration report...
    INFO 11:22:50,151 BaseRecalibrator - ...done!
    INFO 11:22:50,151 BaseRecalibrator - Processed: 0 reads
    INFO 11:22:50,153 ProgressMeter - done 0.00e+00 2.4 m 237.2 w 100.0% 2.4 m 0.0 s
    INFO 11:22:50,154 ProgressMeter - Total runtime 143.43 secs, 2.39 min, 0.04 hours
    INFO 11:22:51,209 GATKRunReport - Uploaded run statistics report to AWS S3

    Information in output file recal_data.grp:

    :GATKReport.v1.1:5

    :GATKTable:2:18:%s:%s:;

    :GATKTable:Arguments:Recalibration argument collection values used in this run

    Argument Value
    binary_tag_name null
    covariate ReadGroupCovariate,QualityScoreCovariate,ContextCovariate,CycleCovariate
    default_platform null
    deletions_default_quality 45
    force_platform null
    indels_context_size 3
    insertions_default_quality 45
    low_quality_tail 2
    maximum_cycle_value 500
    mismatches_context_size 2
    mismatches_default_quality -1
    no_standard_covs false
    plot_pdf_file null
    quantizing_levels 16
    recalibration_report null
    run_without_dbsnp false
    solid_nocall_strategy THROW_EXCEPTION
    solid_recal_mode SET_Q_ZERO

    :GATKTable:3:94:%s:%s:%s:;

    :GATKTable:Quantized:Quality quantization map

    QualityScore Count QuantizedScore
    0 0 93
    1 0 93
    2 0 93
    3 0 93
    4 0 93
    5 0 93
    6 0 93
    7 0 93
    8 0 93
    9 0 93
    10 0 93
    11 0 93
    12 0 93
    13 0 93
    14 0 93
    15 0 93
    16 0 93
    17 0 93
    18 0 93
    19 0 93
    20 0 93
    21 0 93
    22 0 93
    23 0 93
    24 0 93
    25 0 93
    26 0 93
    27 0 93
    28 0 93
    29 0 93
    30 0 93
    31 0 93
    32 0 93
    33 0 93
    34 0 93
    35 0 93
    36 0 93
    37 0 93
    38 0 93
    39 0 93
    40 0 93
    41 0 93
    42 0 93
    43 0 93
    44 0 93
    45 0 93
    46 0 93
    47 0 93
    48 0 93
    49 0 93
    50 0 93
    51 0 93
    52 0 93
    53 0 93
    54 0 93
    55 0 93
    56 0 93
    57 0 93
    58 0 93
    59 0 93
    60 0 93
    61 0 93
    62 0 93
    63 0 93
    64 0 93
    65 0 93
    66 0 93
    67 0 93
    68 0 93
    69 0 93
    70 0 93
    71 0 93
    72 0 93
    73 0 93
    74 0 93
    75 0 93
    76 0 93
    77 0 93
    78 0 93
    79 0 79
    80 0 80
    81 0 81
    82 0 82
    83 0 83
    84 0 84
    85 0 85
    86 0 86
    87 0 87
    88 0 88
    89 0 89
    90 0 90
    91 0 91
    92 0 92
    93 0 93

    :GATKTable:6:0:%s:%s:%.4f:%.4f:%d:%.2f:;

    :GATKTable:RecalTable0:

    ReadGroup EventType EmpiricalQuality EstimatedQReported Observations Errors

    :GATKTable:6:0:%s:%s:%s:%.4f:%d:%.2f:;

    :GATKTable:RecalTable1:

    ReadGroup QualityScore EventType EmpiricalQuality Observations Errors

    :GATKTable:8:0:%s:%s:%s:%s:%s:%.4f:%d:%.2f:;

    :GATKTable:RecalTable2:

    ReadGroup QualityScore CovariateValue CovariateName EventType EmpiricalQuality Observations Errors

    Thanks,
    Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    This is very strange. How big is your BAM file? Can you share it so that we can debug this?

  • sir2013sir2013 Posts: 17Member

    Sure, I can share the BAM file. The question is, how would I do that? I have used FileZilla to download your resource bundle. Is there a specific folder I should put the file in, and how would you like it named so you can identify it?

    Thanks,
    Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin
    edited May 2013

    You can upload it to our FTP server. Instructions are here. Just let me know when you have done so and we will start debugging it internally.

    Thank you very much.

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    If you can reproduce the error with a tiny version of your BAM file (which you can create with PrintReads, using -L to restrict the output to a small interval), then you can just attach that file to this thread, which is optimal.
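
    As a rough illustration of the PrintReads suggestion, the sketch below assembles a GATK 2.x-style command line that writes only the reads overlapping one interval. The jar path, file names, and the interval chr1:1-1000000 are hypothetical placeholders, not values taken from this thread; adjust them to your own data.

```python
from typing import List


def build_print_reads_command(gatk_jar: str, reference: str, input_bam: str,
                              interval: str, output_bam: str) -> List[str]:
    """Build a PrintReads command line restricted to a single interval."""
    return [
        "java", "-jar", gatk_jar,
        "-T", "PrintReads",
        "-R", reference,
        "-I", input_bam,
        "-L", interval,   # a contig name or contig:start-stop
        "-o", output_bam,
    ]


# Hypothetical file names and interval, for illustration only.
cmd = build_print_reads_command(
    "GenomeAnalysisTK.jar", "hg19.fa", "input.bam",
    "chr1:1-1000000", "small.bam")
print(" ".join(cmd))
```

    The printed line can be pasted into a shell, and the resulting small BAM can then be run through BaseRecalibrator to see whether it still reports 0 reads processed.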

  • sir2013sir2013 Posts: 17Member

    I am sorry, I have not gotten to the PrintReads step yet. When you say to use -L, is there an input for that argument? Could you give me an example so that I can attach the file?

    Thanks,
    Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    Never mind, just upload the whole file. 2.1G is fairly small.

  • sir2013sir2013 Posts: 17Member

    OK, the upload seems to be taking forever; it has been saying "uploading" for the past 4 hours. Is there another way to get this to you?

    Thanks,
    Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    If you have any place to put it, we can download it from our end. But the FTP is the preferred method.

  • sir2013sir2013 Posts: 17Member

    I created a folder under my name, "Sinan", on the FTP server for uploads. There you will see 1024_D_realigned.bam; this BAM file has already gone successfully through RealignerTargetCreator and IndelRealigner. I do hope to hear some good news, because I tried running other BAM files and they were unsuccessful as well: 0 reads processed again.

    Thanks,
    Sinan

  • sir2013sir2013 Posts: 17Member

    Hello, I was wondering if there was any update or if a solution has been found to my problem.

    Thanks,
    Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    It seems that your BAM file has MQ 255 reads. In the SAM format, a mapping quality of 255 means the value is unavailable, which is why the reads are all being filtered out.
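
    One way to check for this yourself is to tally the mapping qualities in the file. The following is a minimal sketch that assumes the alignments are available as SAM text lines (for example, from samtools view); it is not part of GATK, and the two records in the example are made up.

```python
from collections import Counter
from typing import Iterable


def mapq_counts(sam_lines: Iterable[str]) -> Counter:
    """Count reads per MAPQ value; MAPQ is the 5th tab-separated SAM field."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        counts[int(fields[4])] += 1
    return counts


# Made-up records, for illustration only.
example = [
    "@HD\tVN:1.4\tSO:coordinate",
    "read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII",
]
counts = mapq_counts(example)
print(counts)
```

    If nearly all reads fall under 255, that is consistent with every read being filtered before BaseRecalibrator sees it.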

  • Mark_DePristoMark_DePristo Posts: 153Administrator, GATK Developer admin

    Yes, the newest GATK will print a more informative message about this problem. It will also be possible to fix it by adding -rf ReassignMappingQuality to the command line. Note that this will only work in the nightly build and will come out with GATK 2.6.

    --
    Mark A. DePristo, Ph.D.
    Co-Director, Medical and Population Genetics
    Broad Institute of MIT and Harvard

  • sir2013sir2013 Posts: 17Member

    Should this have been fixed when I specified -fixMisencodedQuals while running IndelRealigner? I compared the output of the new BAM against the old BAM, and I could see that the adjustments had been made.

    Thanks,
    Sinan

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 5,832Administrator, GATK Developer admin

    Hi Sinan,

    -fixMisencodedQuals is meant to fix a different issue which concerns base qualities, not mapping qualities (see release highlights for 2.3 for more details). Are you still having problems?

    Geraldine Van der Auwera, PhD

  • sir2013sir2013 Posts: 17Member

    Hello,

    I thought I would bring this to your attention regarding MQ 255. I use STAR to run my alignments and, just like TopHat, STAR uses the following MQ annotation:

    255 = uniquely mapped
    3 = maps to 2 locations
    2 = maps to 3 locations
    1 = maps to 4-9 locations
    0 = 10 or more locations

    So, as you can see, no real score is being assigned for MQ, whereas bwa does give an actual score. I was wondering if there is a conversion for all five values, other than 255 being converted to 60, so that I can properly process my data through the GATK tools.

    Thanks,
    Sinan

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 5,832Administrator, GATK Developer admin

    Hi Sinan,

    The GATK will only consider uniquely mapped reads, so converting the MQ 255 values is the only step necessary. The other reads will be ignored.
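
    To make the conversion concrete, here is a minimal sketch of the remapping discussed above, operating on SAM text lines. It is only an illustration, not GATK's own filter; the target value 60 is the one mentioned in the question, and the function name is hypothetical.

```python
def remap_mapq(sam_line: str, old: int = 255, new: int = 60) -> str:
    """Return the SAM line with MAPQ changed from `old` to `new`.

    Header lines and reads with any other MAPQ are returned unchanged.
    """
    if sam_line.startswith("@"):
        return sam_line
    fields = sam_line.split("\t")
    # MAPQ is the 5th tab-separated field of an alignment record.
    if len(fields) > 4 and fields[4] == str(old):
        fields[4] = str(new)
    return "\t".join(fields)


# Made-up records, for illustration only.
unique = "read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII"
multi = "read2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII"
print(remap_mapq(unique))
print(remap_mapq(multi))
```

    Only the uniquely mapped read changes; the multi-mapped read keeps its original value.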

    Geraldine Van der Auwera, PhD

  • sir2013sir2013 Posts: 17Member

    OK, thank you very much for all the help. I have finally got the BaseRecalibration step to work. Cheers!

    Sinan
