BaseRecalibration: Recalibration table is empty

bcantarel Baylor Health Posts: 3 Member

When running GATK, I am getting "empty" results when running BaseRecalibrator. I didn't see a solution to this when searching.

java -Xmx4g -jar /seqprg/GenomeAnalysisTK-2.4-3-g2a7af43/GenomeAnalysisTK.jar -l INFO -R /Users/bcantarel/projects/refdb/human_g1k_v37.fasta --knownSites /Users/bcantarel/projects/refdb/00-All.vcf -I Sample_cDNA405.bam -T BaseRecalibrator -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o Sample_cDNA405.grp

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.IllegalStateException: recalibration tables list is empty
    at org.broadinstitute.sting.gatk.walkers.bqsr.RecalibrationEngine.mergeThreadLocalRecalibrationTables(RecalibrationEngine.java:209)
    at org.broadinstitute.sting.gatk.walkers.bqsr.RecalibrationEngine.finalizeData(RecalibrationEngine.java:175)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.onTraversalDone(BaseRecalibrator.java:508)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.onTraversalDone(BaseRecalibrator.java:131)
    at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:123)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.4-3-g2a7af43):
ERROR
ERROR Please visit the wiki to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: recalibration tables list is empty
ERROR ------------------------------------------------------------------------------------------

Answers

  • bcantarel Baylor Health Posts: 3 Member

    Hmm, ok, thanks for the hint. Maybe the error message should say "Low alignment rate" or "Too few reads aligning" or something like that... Thanks again!

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    OK, I'll see if we can make the error message clearer.

    Geraldine Van der Auwera, PhD

  • bcantarel Baylor Health Posts: 3 Member

    On a side note: the problem comes from a bug in BWA (i.e., it produces SAM files without mapped reads). So if anyone was unlucky enough to download BWA 0.7.0, it was only up for one week, and the issue is likely fixed in the newer version.
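One way to catch an aligner that emitted no mapped reads before running BaseRecalibrator is to count records whose FLAG has the 0x4 (unmapped) bit set. The following is only a minimal sketch, assuming SAM-format text lines (for example, what `samtools view -h` would print); the function name `count_mapped` and the sample records are hypothetical and not part of GATK or BWA.

```python
def count_mapped(sam_lines):
    """Return (mapped, unmapped) counts for an iterable of SAM text lines."""
    mapped = unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
        if flag & 0x4:  # 0x4 = "segment unmapped" in the SAM specification
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped


# Hypothetical records: one header line and two unmapped reads.
sample = [
    "@SQ\tSN:1\tLN:249250621",
    "r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
mapped, unmapped = count_mapped(sample)
if mapped == 0:
    print("No mapped reads: there is nothing for the recalibrator to tabulate.")
```

If the mapped count is zero (or very low), the aligner step is the place to look first, rather than the recalibration step.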

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Ah, the aligner taking a vacation would indeed be a problem! Thanks for letting us know what the real problem was -- it's very useful for us to hear about the range of issues that can lead to common symptoms like this.

    Geraldine Van der Auwera, PhD

  • flescai Posts: 53 Member

    I am running into the same problem, and yes I'm using exome data here.

    I was using a previous version of BWA (0.6.2) and updated to the most recent one (0.7.3a) but I still have the same problem.

    It happens in intervals like GL000200.1 1 187035 + interval_80

    The command that produces this error is actually:

    'java' '-Xmx4096m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/home/me/tests/gatk/.queue/tmp' '-cp' '/home/me/tools/gatk-protected/dist/Queue.jar' 'org.broadinstitute.sting.gatk.CommandLineGATK' '-T' 'BaseRecalibrator' '-I' '/home/me/tests/gatk/test04.FOSZAW_F2_1.fq.gz.clean.dedup.bam' '-L' '/home/me/tests/gatk/.queue/scatterGather/.qlog/test04.FOSZAW_F2_1.fq.gz.pre_recal.table.covariates-sg/temp_80_of_84/scatter.intervals' '-R' '/home/me/resources/GATKbundle/2.3/b37/human_g1k_v37.fasta' '-DIQ' '-knownSites' '/home/me/resources/GATKbundle/2.3/b37/dbsnp_137.b37.vcf' '-o' '/home/me/tests/gatk/.queue/scatterGather/.qlog/test04.FOSZAW_F2_1.fq.gz.pre_recal.table.covariates-sg/temp_80_of_84/test04.FOSZAW_F2_1.fq.gz.pre_recal.table' '-cov' 'ReadGroupCovariate' '-cov' 'QualityScoreCovariate' '-cov' 'CycleCovariate' '-cov' 'ContextCovariate' '-dP' 'Illumina'

    I therefore checked what's in the BAM file for that interval, and this is the output:

    samtools view /home/me/tests/gatk/test04.FOSZAW_F2_1.fq.gz.clean.dedup.bam GL000200.1:1-187035
    FCD03KHACXX:7:1101:5083:90023#GTTGCAAC  163 GL000200.1  20411   0   90M =   20588   267 CATAGGAAATAGTTACCAAGAAATGCAGCAGCTAAACTTGGAAGGAAAGAACTATTGCACAGCCAAAACATTGTACATATCTGATTTAGA  GGGEDFBDFDFDEFDI;GGGBGEG=D;@;DGBGGEFHBFFBDBF8D:AD?:<>=AGBGEGC8@DDE=EEE?BD<DEGD=DB@EBAD<DDC  X0:i:5  X1:i:0  MD:Z:13G76  PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:1  SM:i:0  XM:i:1  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:5083:90023#GTTGCAAC  83  GL000200.1  20588   0   90M =   20411   -267    TTCCAAAAAGAAGCAGTCATTGAAAAATGCTGACTTATGCATTGCCTCAGGAAAAAAGGTGGCTCTGTTTAATCGACTACTATCCCAGAC  ?E@FCFEFEEB@EDEFD?BFEEFGGFFCDFFBAECDFD>EFEGGGBCAB7DGBGGFFCFCGGFFEGEGGGFFFFBDEE6EFFF@FGGGFG  X0:i:4  X1:i:1  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:2018:151662#GTTGCAAC 163 GL000200.1  20636   0   90M =   20755   209 AGGAAAAAAGGTGGCTCTGTTTAATCGACTACTATCCCAGACAGTTAGTACCAGATACTTGCACGTAGAAAGAGGTAATTTTCATGCTAG  HHHHHHHHHGHFHHHHHHGHHHHHHHHHHHEHHHHHHHHHBEGHDHHHFHHHHHFHCHHGHFHHHFGGFGFGG@FCE5=EGFGEIHHFHE  X0:i:4  X1:i:1  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:2018:151662#GTTGCAAC 83  GL000200.1  20755   0   90M =   20636   -209    TTCTTGGATGATGATGGATCAGAAGGAGAAGAATTCACAGTCTGAGATGGCTACATTCATTATGGACAAACAGTCAAACTTGTGTGCTCA  H4HHHFHHHHHHHHHHHHHFHHHFHHGHFEHFHFFHHHHHHHHHHHHGHHHFHHHEBHHFHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH  X0:i:5  X1:i:0  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:9920:195302#GTTGCAAC 65  GL000200.1  29250   0   90M 6   49937260    0   AGACCAGAATTGCACCCATCAAATGCCTCACTCACCATATGTCAGCCCAGAAGACTCTTGCAGTGGTGAGCCAGTCTCTTTATCCACCAA  HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHFHHHHHHCHHHHHFFHHHHHHFHHHHHHFHFEFHHEHHHEFFE;GIFFFHBEHHBFHHD  X0:i:3  X1:i:3  XA:Z:9,+44481078,90M,0;9,+43322151,90M,0;9,-41877008,90M,1;9,+46921634,90M,1;9,+65982263,90M,1; MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  XT:A:R
    FCD03KHACXX:7:1101:18445:85126#GTTGCAAC 99  GL000200.1  64231   0   90M =   64292   151 CACTACGCCCGGCTAATTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCGCTTTAGCCGGGATGGTCTCGATCTCCTGACCTCGTGATC  EEEEEECDCECEEE@EDDDD96,60@@A><CA5DD2.)7/+;-(+,8<86=/=?=?)+:+*'1-9>197B<@3A/@@@8;=@########  X0:i:1  X1:i:1377   MD:Z:51T38  PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:1  SM:i:0  XM:i:1  XO:i:0  MQ:i:0  XT:A:U
    FCD03KHACXX:7:1101:18445:85126#GTTGCAAC 147 GL000200.1  64292   0   90M =   64231   -151    GATGGTCGCGATCGCCTGACCTCGTGATCCGCCCGCCTCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGTCACCGCGCCCGGCCGAG  #################################@ACDGDDB4A?DE?6:GA@@B;BFG;GCGCGGEB@E9B;;B7E=FF?GGEDEGBEGG  X0:i:72 MD:Z:7T5T58C17  PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:3  SM:i:0  XM:i:3  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:15528:62078#GTTGCAAC 99  GL000200.1  72388   0   90M =   72561   263 TAAGTTGTAATGTTTAATTCTTTGAATGTTTCAGTGGGAGCTAGAAATTGGTTTGATATACTTTTTAGTTCAGTTGGAATACTTAACACT  HHHHHHHFHHHGHHHHGGHHGHHFGFHGHHHHFGHHHHHHHFFG?HFHFHHEEHHFFDFHEHFGHFDEEEEFFBFGHFHHHHHFGHHHH=  X0:i:3  X1:i:0  XA:Z:9,+43365464,90M,0;9,+44524392,90M,0;   MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:15528:62078#GTTGCAAC 147 GL000200.1  72561   0   90M =   72388   -263    AAAGAATTGAAAAAAAAAGTGACACAAATTGATATATCACGCAAACTATGTGGTTTTGTATTTTCAACTAATTGCTGAAGAGCACTTATA  HGBHGGFGIGEHHGHHHGHHGHHFHHHHHHGHFHHFHHHHBHHHHHHHHHHHGHHGHHHHHHHFHGFHGHHHHEHHHHHHHHHHHHHHHH  X0:i:3  X1:i:0  XA:Z:9,-43365637,90M,0;9,-44524565,90M,0;   MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:3111:80943#GTTGCAAC  163 GL000200.1  107292  0   90M =   107352  150 TTGCTATTGACACAATCATTAACCAGAAATGTTTCAATGATGGATCTGATGAAAAGAAGAAGCTGTACTGTGTCTATGTTGTTATTGGTC  HEHHHHHHHHHGHHHGHHHHHHHHHHHHHHHCHHGHHHHHHHHHHHHHHHHHHHHHGHHGHHHHHHGHHHDFDGGFFFFBGEGFGBDEEE  X0:i:8  X1:i:0  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:3111:80943#GTTGCAAC  83  GL000200.1  107352  0   90M =   107292  -150    AGCTGTACTGTGTCTATGTTGTTATTGGTCAAAAGAGATCCACTGTTGCCCAGTTGGTGAAGAGACTTACGATGCAGATGCCATGAATTA  FFGAGHDHHHHHHFHFHHHHHFHHECFHHEHHHHHHHHHHEBEGFHH@DEHHHHHHHHHGHHHHBHHHHHHFFHHHHHHHHHHHHHHHHH  X0:i:8  X1:i:0  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:6010:24148#GTTGCAAC  99  GL000200.1  107805  0   90M =   107947  232 ATCTTCTTGGAAACAGAATTGTTCTACAAAGGTATCCACCCTGCCATTAATGTCGGTCTGTCTGTGTCTCGTGTCAGATCTGCTGCCCAA  HHHGHGHHFHBHHHHHEHHHHHHHCDFFGDE?CFDFDFDFEEFHHHH@HHDG;AFA6D>C9A?<<44<9>D?B>DFDD6F?4EDEECFEC  X0:i:6  X1:i:1  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1101:6010:24148#GTTGCAAC  147 GL000200.1  107947  0   90M =   107805  -232    ATCATGAGGTCACCACTTTTGCCCAGTTCAGTTCTGACCTCGATGCTGCCACTCAACAACTTTTGAGTTGTGGTGTGTGTCTAACTGAGT  DFFG8GEGGF@@@@<6/8DDHDHEGIGGGG@EEEEF<FBFHHBGHHGHGEFBGGDGFGEGHHHEHHHGCHGHHFHHHHHHHFGFGFHHHF  X0:i:7  X1:i:0  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1102:10841:11769#GTTGCAAC 99  GL000200.1  134050  0   90M =   134258  298 TTGATGACCTCCCCTTTTCCCAGGTCAAAGGAGAATTTGTCCTTGCGATCCACACTGGAGTCAAACTTTGTGCCCTCTAACAGCCAGCCA  HHGHHHHHGHHHHGHHHHGHHGHHHHGHHHGHHHEHHHHHHHHHHHHFBHHFFHFEGEFGHFHHEHBDEFCEGEEHH@GHGHHFFCHHHE  X0:i:4  X1:i:4  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1102:10841:11769#GTTGCAAC 147 GL000200.1  134258  0   90M =   134050  -298    GCAGCGGCACCGGCTGCGCCGCACTCTCGGTCGCCTTCATCTCCTTGGCTGTCATCTCTGCGTGGCGCGAAATTTTTCCGGGAGATGGCG  ;38<4F?FCFD7DD=DCA5D<C<<@E?GHFHHGHGGFGGGG;GGFGDDGAGFGIEECEECGFGGCG>HHHHHGEHFHHHGHGHHHHHHHH  X0:i:4  X1:i:3  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1102:6709:6487#GTTGCAAC   99  GL000200.1  173813  0   90M =   173961  238 TGGCACCCTGCAAATAAACACCTCTTTTCTCCTGCTGCAAACCTTGGTGTGGGTGTTTGGCCTGACTGCGCTGGGCAGGCAGACCCAGCT  FGGGE?GGBGGGGGGGGGGGGCGFGGGGGGGGGGFFCGGBGFGGGGGEGF>FFCFGFFGFEEGGEFGG?BFDDD3EE@6EEFFFEGDFGA  X0:i:6  X1:i:0  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R
    FCD03KHACXX:7:1102:6709:6487#GTTGCAAC   147 GL000200.1  173961  0   90M =   173813  -238    TATAAATTCCAGGCTGGGCAGAGTGGCTCACACCTGTAATCCTAGCACTTTGGGAGGCCGAAGCTGGTGGATCACCTGAGGTCAGGAAGT  @DA8DEEEG=DCGGGDDC88FGGFBEGGGHG:GGGEEDDBFFGGGBAGGG7HHGHHFHGFGEG@FGGGDEHHFEEHHGHHHDDFHHHHHH  X0:i:6  X1:i:0  MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R

    Is there any chance to avoid this error? Maybe by not processing extra-chromosomal regions?

    Thanks for any help you might offer!

    Francesco

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi Francesco,

    Are you passing the intervals list of capture targets? If you're working with exome data that is recommended. And if you find that this problem specifically occurs with non-chromosome contigs, then you just don't include those contigs in your intervals list.

    Geraldine Van der Auwera, PhD
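The suggestion above (leave non-chromosome contigs out of the intervals list) can be sketched as a small filter. This is a hedged illustration, not a GATK utility: it assumes whitespace-separated interval lines like the `GL000200.1 1 187035 + interval_80` example earlier in the thread, and b37-style contig names (1 to 22, X, Y, MT). The function name `keep_primary` and the first sample interval are hypothetical.

```python
# Assumed b37-style primary contig names; adjust for "chr"-prefixed references.
PRIMARY_CONTIGS = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def keep_primary(interval_lines, allowed=PRIMARY_CONTIGS):
    """Keep header lines and intervals whose contig is in `allowed`."""
    kept = []
    for line in interval_lines:
        if line.startswith("@"):
            kept.append(line)  # pass header lines through unchanged
            continue
        parts = line.split()
        if parts and parts[0] in allowed:
            kept.append(line)  # first column is the contig name
    return kept


lines = [
    "1\t69091\t70008\t+\tinterval_1",          # hypothetical target on chromosome 1
    "GL000200.1\t1\t187035\t+\tinterval_80",   # unplaced contig from the post above
]
filtered = keep_primary(lines)
print("\n".join(filtered))
```

The filtered list would then be passed with `-L` during the data processing steps as well as during calling, which is the change described in the reply below.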

  • flescai Posts: 53 Member

    Thanks Geraldine, that actually solved the problem :-) I was only passing the target intervals during the calling process, and not during the data processing pipeline.

    It completely makes sense, although I might lose just a bit of information in the recalibrated BAM file. Thanks for your help!

  • MinQiao Posts: 3 Member

    Geraldine, I am running into a similar issue, except that the recalibrator seems to run with proper screen output, but my table always remains zero bytes. My command is:

    java -Xmx5000m -jar GenomeAnalysisTK.jar -T BaseRecalibrator -I LS1.clean.dedup.bam -R ucsc.hg19.fasta -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -knownSites dbsnp_137.hg19.vcf -knownSites hapmap_3.3.hg19.vcf -knownSites 1000G_omni2.5.hg19.vcf -knownSites Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites 1000G_phase1.indels.hg19.vcf -o LS1.pre_recal.table

    The screen output seems right (below). However, nothing was written to the file "LS1.pre_recal.table" and no error message was reported. What could possibly be wrong with my alignment? I used BWA version 0.6.2, and I had already completed indel realignment and duplicate marking before I got here. Please help!

    Date/Time: 2013/05/08 17:57:32
    INFO 17:57:32,951 HelpFormatter - --------------------------------------------------------------------------------
    INFO 17:57:32,951 HelpFormatter - --------------------------------------------------------------------------------
    INFO 17:57:32,960 ArgumentTypeDescriptor - Dynamically determined type of dbsnp_137.hg19.vcf to be VCF
    INFO 17:57:32,961 ArgumentTypeDescriptor - Dynamically determined type of hapmap_3.3.hg19.vcf to be VCF
    INFO 17:57:32,962 ArgumentTypeDescriptor - Dynamically determined type of 1000G_omni2.5.hg19.vcf to be VCF
    INFO 17:57:32,963 ArgumentTypeDescriptor - Dynamically determined type of Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO 17:57:32,964 ArgumentTypeDescriptor - Dynamically determined type of 1000G_phase1.indels.hg19.vcf to be VCF
    INFO 17:57:32,997 GenomeAnalysisEngine - Strictness is SILENT
    INFO 17:57:33,061 GenomeAnalysisEngine - Downsampling Settings: No downsampling
    INFO 17:57:33,065 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 17:57:33,076 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
    INFO 17:57:33,084 RMDTrackBuilder - Loading Tribble index from disk for file dbsnp_137.hg19.vcf
    INFO 17:57:33,171 RMDTrackBuilder - Loading Tribble index from disk for file hapmap_3.3.hg19.vcf
    INFO 17:57:33,190 RMDTrackBuilder - Loading Tribble index from disk for file 1000G_omni2.5.hg19.vcf
    INFO 17:57:33,204 RMDTrackBuilder - Loading Tribble index from disk for file Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO 17:57:33,222 RMDTrackBuilder - Loading Tribble index from disk for file 1000G_phase1.indels.hg19.vcf
    INFO 17:57:33,271 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
    INFO 17:57:33,274 GenomeAnalysisEngine - Done creating shard strategy
    INFO 17:57:33,274 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 17:57:33,274 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
    INFO 17:57:33,328 BaseRecalibrator - The covariates being used here:
    INFO 17:57:33,328 BaseRecalibrator - ReadGroupCovariate
    INFO 17:57:33,328 BaseRecalibrator - QualityScoreCovariate
    INFO 17:57:33,328 BaseRecalibrator - ContextCovariate
    INFO 17:57:33,328 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
    INFO 17:57:33,329 BaseRecalibrator - CycleCovariate
    INFO 17:57:33,330 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO 17:57:33,331 ReadShardBalancer$1 - Done loading BAM index data for next contig
    INFO 17:58:03,384 ProgressMeter - chr1:1457505 2.86e+05 30.0 s 105.0 s 0.0% 17.7 h 17.7 h
    INFO 17:59:03,468 ProgressMeter - chr1:5271635 1.19e+06 90.0 s 76.0 s 0.2% 14.8 h 14.8 h
    INFO 18:00:03,469 ProgressMeter - chr1:8794195 2.19e+06 2.5 m 68.0 s 0.3% 14.8 h 14.8 h
    INFO 18:01:04,022 ProgressMeter - chr1:12483444 3.09e+06 3.5 m 68.0 s 0.4% 14.6 h 14.6 h
    INFO 18:02:04,031 ProgressMeter - chr1:16261146 3.89e+06 4.5 m 69.0 s 0.5% 14.5 h 14.4 h
    ....... .........

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi @MinQiao,

    What does the output say at the end of the run? Maybe a lot of reads are getting filtered out.

    Geraldine Van der Auwera, PhD

  • MinQiao Posts: 3 Member

    @Geraldine_VdAuwera

    Thank you so much for helping me out. I cannot reach the cluster for now, but I am sure the last line of output was similar to the rest of them, just a progress report, with no warning or error message. By what criteria will some reads (in my case, seemingly all reads) be filtered out? Is it an indication of very bad quality in my data or of completely wrong mapping? Thank you!

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    There are many different things that can cause reads to be filtered out, e.g. if they are badly mapped or badly formatted. At the end of a run the GATK always prints out a summary of the filtering results; it says how many reads were seen in total, how many filtered out, and the breakdown of numbers between the different reasons why they were filtered. That can be very helpful to understand what quality issues may affect your data.

    Geraldine Van der Auwera, PhD

  • MinQiao Posts: 3 Member

    Thank you @Geraldine_VdAuwera, for the information about the expected output! I just double-checked mine: instead of the summary, the last line is a progress meter entry at chromosome 22. I will find another computer to give it another go. Sorry for having asked a probably dumb question.

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Ah, it sounds like your run failed without exiting cleanly. If you try again, hopefully it will complete normally.

    Don't worry, there are no dumb questions.

    Geraldine Van der Auwera, PhD
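A log that stops at a progress line, as described above, can be told apart from a clean finish with a small check. The sketch below is an assumption-laden illustration: it looks for the `ProgressMeter - Total runtime` and `BaseRecalibrator - Processed: N reads` lines exactly as they appear in the GATK 2.5 log posted further down in this thread; other GATK versions may word these lines differently. The function name `summarize_log` is hypothetical.

```python
import re

# Pattern modeled on the "Processed: 0 reads" line shown later in this thread.
PROCESSED_RE = re.compile(r"BaseRecalibrator - Processed: (\d+) reads")


def summarize_log(log_text):
    """Return (finished, processed_reads) for the text of a GATK run log.

    finished        -- True if a "Total runtime" line was written
    processed_reads -- int from the "Processed: N reads" line, or None if absent
    """
    finished = "ProgressMeter - Total runtime" in log_text
    match = PROCESSED_RE.search(log_text)
    processed = int(match.group(1)) if match else None
    return finished, processed


log = (
    "INFO  15:14:46,200 BaseRecalibrator - Processed: 0 reads\n"
    "INFO  15:14:46,216 ProgressMeter - Total runtime 241.62 secs, 4.03 min, 0.07 hours\n"
)
finished, processed = summarize_log(log)
print(finished, processed)
```

A result of `finished == False` points to a run that was interrupted, while a finished run with zero processed reads points to the input (empty BAM, all reads filtered, or intervals with no coverage).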

  • sir2013 Posts: 17 Member

    When running GATK tools I am getting the same error. I have read through the forum but could not find a solution to my problem.

    This is my script:

    java -Xmx4g -jar /home/sir2013/GATK/GenomeAnalysisTK.jar -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -o recal_data.grp

    This is my error:

    INFO 11:41:10,566 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.IllegalStateException: recalibration tables list is empty
        at org.broadinstitute.sting.gatk.walkers.bqsr.RecalibrationEngine.mergeThreadLocalRecalibrationTables(RecalibrationEngine.java:209)
        at org.broadinstitute.sting.gatk.walkers.bqsr.RecalibrationEngine.finalizeData(RecalibrationEngine.java:175)
        at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.onTraversalDone(BaseRecalibrator.java:508)
        at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.onTraversalDone(BaseRecalibrator.java:131)
        at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:123)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.4-9-g532efad):
    ERROR
    ERROR Please visit the wiki to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: recalibration tables list is empty
    ERROR ------------------------------------------------------------------------------------------

    I have used RealignerTargetCreator and IndelRealigner without any issues and got the output I need. But for some reason I am getting this error at the BaseRecalibrator step. Could someone please help me troubleshoot this?

    Thanks, Sinan

  • Geraldine_VdAuwera Posts: 6,423 Administrator, GATK Developer

    Hi @Sinan, can you please try again with the latest version of GATK?

    Geraldine Van der Auwera, PhD

  • sir2013 Posts: 17 Member

    Hello again, for some reason when running BaseRecalibrator I am getting zero processed reads, which is quite interesting. Do you have any idea as to why this is occurring? I do get an output file, but it contains zero recalibration information.

    Here is my command: java -Xmx8g -jar /home/sir2013/GATK/GenomeAnalysisTK.jar -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf --validation_strictness STRICT -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o recal_data.grp

    
    running screen:
    INFO  15:10:40,075 HelpFormatter - --------------------------------------------------------------------------------
    INFO  15:10:40,077 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
    INFO  15:10:40,077 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO  15:10:40,077 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO  15:10:40,082 HelpFormatter - Program Args: -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o recal_data.grp
    INFO  15:10:40,082 HelpFormatter - Date/Time: 2013/05/14 15:10:40
    INFO  15:10:40,082 HelpFormatter - --------------------------------------------------------------------------------
    INFO  15:10:40,082 HelpFormatter - --------------------------------------------------------------------------------
    INFO  15:10:40,104 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf to be VCF
    INFO  15:10:40,115 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO  15:10:40,127 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf to be VCF
    INFO  15:10:42,750 GenomeAnalysisEngine - Strictness is SILENT
    INFO  15:10:43,025 GenomeAnalysisEngine - Downsampling Settings: No downsampling
    INFO  15:10:43,033 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO  15:10:43,051 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
    INFO  15:10:43,093 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf
    INFO  15:10:43,407 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO  15:10:44,262 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf
    INFO  15:10:44,588 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
    INFO  15:10:44,599 GenomeAnalysisEngine - Done creating shard strategy
    INFO  15:10:44,600 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO  15:10:44,600 ProgressMeter -        Location processed.reads  runtime per.1M.reads completed total.runtime remaining
    INFO  15:10:44,730 BaseRecalibrator - The covariates being used here:
    INFO  15:10:44,731 BaseRecalibrator -  ReadGroupCovariate
    INFO  15:10:44,731 BaseRecalibrator -  QualityScoreCovariate
    INFO  15:10:44,732 BaseRecalibrator -  ContextCovariate
    INFO  15:10:44,732 ContextCovariate -   Context sizes: base substitution model 2, indel substitution model 3
    INFO  15:10:44,733 BaseRecalibrator -  CycleCovariate
    INFO  15:10:44,738 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO  15:10:44,741 ReadShardBalancer$1 - Done loading BAM index data for next contig
    INFO  15:11:14,658 ProgressMeter -        Starting        0.00e+00   30.0 s       49.7 w    100.0%        30.0 s     0.0 s
    INFO  15:11:44,662 ProgressMeter -        Starting        0.00e+00   60.0 s       99.3 w    100.0%        60.0 s     0.0 s
    INFO  15:12:15,185 ProgressMeter -        Starting        0.00e+00   90.0 s      149.8 w    100.0%        90.0 s     0.0 s
    INFO  15:12:45,186 ProgressMeter -        Starting        0.00e+00  120.0 s      199.4 w    100.0%       120.0 s     0.0 s
    INFO  15:13:15,189 ProgressMeter -        Starting        0.00e+00    2.5 m      249.0 w    100.0%         2.5 m     0.0 s
    INFO  15:13:45,191 ProgressMeter -        Starting        0.00e+00    3.0 m      298.6 w    100.0%         3.0 m     0.0 s
    INFO  15:14:15,193 ProgressMeter -        Starting        0.00e+00    3.5 m      348.2 w    100.0%         3.5 m     0.0 s
    INFO  15:14:44,687 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO  15:14:44,692 BaseRecalibrator - Calculating quantized quality scores...
    INFO  15:14:45,195 ProgressMeter -        Starting        0.00e+00    4.0 m      397.8 w    100.0%         4.0 m     0.0 s
    INFO  15:14:45,576 BaseRecalibrator - Writing recalibration report...
    INFO  15:14:46,197 BaseRecalibrator - ...done!
    **INFO  15:14:46,200 BaseRecalibrator - Processed: 0 reads**
    INFO  15:14:46,209 ProgressMeter -            done        0.00e+00    4.0 m      399.5 w    100.0%         4.0 m     0.0 s
    INFO  15:14:46,216 ProgressMeter - Total runtime 241.62 secs, 4.03 min, 0.07 hours
    INFO  15:14:47,683 GATKRunReport - Uploaded run statistics report to AWS S3
    

    and output file information in it:

    :GATKReport.v1.1:5

    :GATKTable:2:18:%s:%s:;

    :GATKTable:Arguments:Recalibration argument collection values used in this run

    Argument                    Value
    binary_tag_name             null
    covariate                   ReadGroupCovariate,QualityScoreCovariate,ContextCovariate,CycleCovariate
    default_platform            null
    deletions_default_quality   45
    force_platform              null
    indels_context_size         3
    insertions_default_quality  45
    low_quality_tail            2
    maximum_cycle_value         500
    mismatches_context_size     2
    mismatches_default_quality  -1
    no_standard_covs            false
    plot_pdf_file               null
    quantizing_levels           16
    recalibration_report        null
    run_without_dbsnp           false
    solid_nocall_strategy       THROW_EXCEPTION
    solid_recal_mode            SET_Q_ZERO

    :GATKTable:3:94:%s:%s:%s:;

    :GATKTable:Quantized:Quality quantization map

    QualityScore  Count  QuantizedScore
    0             0      93
    1             0      93
    2             0      93
    3             0      93
    4             0      93
    5             0      93
    6             0      93
    7             0      93
    8             0      93
    9             0      93
    10            0      93
    11            0      93
    12            0      93
    13            0      93
    14            0      93
    15            0      93
    16            0      93
    17            0      93
    18            0      93
    19            0      93
    20            0      93
    21            0      93
    22            0      93
    23            0      93
    24            0      93
    25            0      93
    26            0      93
    27            0      93
    28            0      93
    29            0      93
    30            0      93
    31            0      93
    32            0      93
    33            0      93
    34            0      93
    35            0      93
    36            0      93
    37            0      93
    38            0      93
    39            0      93
    40            0      93
    41            0      93
    42            0      93
    43            0      93
    44            0      93
    45            0      93
    46            0      93
    47            0      93
    48            0      93
    49            0      93
    50            0      93
    51            0      93
    52            0      93
    53            0      93
    54            0      93
    55            0      93
    56            0      93
    57            0      93
    58            0      93
    59            0      93
    60            0      93
    61            0      93
    62            0      93
    63            0      93
    64            0      93
    65            0      93
    66            0      93
    67            0      93
    68            0      93
    69            0      93
    70            0      93
    71            0      93
    72            0      93
    73            0      93
    74            0      93
    75            0      93
    76            0      93
    77            0      93
    78            0      93
    79            0      79
    80            0      80
    81            0      81
    82            0      82
    83            0      83
    84            0      84
    85            0      85
    86            0      86
    87            0      87
    88            0      88
    89            0      89
    90            0      90
    91            0      91
    92            0      92
    93            0      93

    :GATKTable:6:0:%s:%s:%.4f:%.4f:%d:%.2f:;

    :GATKTable:RecalTable0:

    ReadGroup EventType EmpiricalQuality EstimatedQReported Observations Errors

    :GATKTable:6:0:%s:%s:%s:%.4f:%d:%.2f:;

    :GATKTable:RecalTable1:

    ReadGroup QualityScore EventType EmpiricalQuality Observations Errors

    :GATKTable:8:0:%s:%s:%s:%s:%s:%.4f:%d:%.2f:;

    :GATKTable:RecalTable2:

    ReadGroup QualityScore CovariateValue CovariateName EventType EmpiricalQuality Observations Errors

    I do apologize for the long post in the forum. I just don't understand why no errors are reported and yet no recalibration data is being produced.

    Thanks, Sinan
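The report above can be screened automatically for empty recalibration tables. The sketch below rests on an assumption read off the posted report itself: in a `:GATKTable:<columns>:<rows>:<formats>` header line, the second number is the row count (it matches the 18 argument rows and 94 quantization rows shown). The function name `empty_tables` is hypothetical, and the parser tolerates an optional leading `#` on header lines.

```python
def empty_tables(report_lines):
    """Return names of report tables whose header declares zero rows."""
    empty = []
    pending_rows = None  # row count from the most recent dimension header
    for raw in report_lines:
        line = raw.strip().lstrip("#")
        if not line.startswith(":GATKTable:"):
            continue  # data rows and other lines are ignored
        fields = line.split(":")  # ['', 'GATKTable', <a>, <b>, ...]
        if len(fields) > 3 and fields[2].isdigit() and fields[3].isdigit():
            pending_rows = int(fields[3])  # dimension header: columns, rows
        else:
            if pending_rows == 0:
                empty.append(fields[2])  # name header follows its dimensions
            pending_rows = None
    return empty


# Header lines copied from the report posted above.
report = [
    ":GATKTable:2:18:%s:%s:;",
    ":GATKTable:Arguments:Recalibration argument collection values used in this run",
    ":GATKTable:3:94:%s:%s:%s:;",
    ":GATKTable:Quantized:Quality quantization map",
    ":GATKTable:6:0:%s:%s:%.4f:%.4f:%d:%.2f:;",
    ":GATKTable:RecalTable0:",
    ":GATKTable:6:0:%s:%s:%s:%.4f:%d:%.2f:;",
    ":GATKTable:RecalTable1:",
    ":GATKTable:8:0:%s:%s:%s:%s:%s:%.4f:%d:%.2f:;",
    ":GATKTable:RecalTable2:",
]
print(empty_tables(report))
```

Under that assumption, all three RecalTable sections in this report are empty, which is consistent with the "Processed: 0 reads" line in the log.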

  • Carneiro Posts: 274 Administrator, GATK Developer

    Can you check whether your BAM file has any reads? It sounds silly, but it could be something as simple as that.

    Also you don't need to specify the -cov parameters. Those are the default covariates and if you specify them like that, I am afraid it may be confusing the tool. Can you remove those parameters and check if it works? (I'll issue a bug report if that's the case)

  • sir2013 Posts: 17 Member

    I know the BAM file is not empty because for the IndelRealigner step I had to have the quality scores fixed using -fixMisencodedQuals, and I cross-checked against the original BAM file to confirm that the scores were actually adjusted accordingly. Unfortunately, I got the same output.

    Command Line Code:

    java -Xmx8g -jar /home/sir2013/GATK/GenomeAnalysisTK.jar -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -o recal_data.grp

    Running Script Output:

    INFO 11:20:21,953 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:20:21,964 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
    INFO 11:20:21,965 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 11:20:21,965 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 11:20:21,978 HelpFormatter - Program Args: -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -o recal_data.grp
    INFO 11:20:21,979 HelpFormatter - Date/Time: 2013/05/16 11:20:21
    INFO 11:20:21,980 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:20:21,980 HelpFormatter - --------------------------------------------------------------------------------
    INFO 11:20:22,072 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf to be VCF
    INFO 11:20:22,090 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
    INFO 11:20:22,115 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf to be VCF
    INFO 11:20:23,555 GenomeAnalysisEngine - Strictness is SILENT
    INFO 11:20:23,845 GenomeAnalysisEngine - Downsampling Settings: No downsampling
    INFO 11:20:23,852 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 11:20:23,899 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04
    INFO 11:20:23,947 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf
    INFO 11:20:24,363 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf
    INFO 11:20:24,653 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf
    INFO 11:20:26,702 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
    INFO 11:20:26,721 GenomeAnalysisEngine - Done creating shard strategy
    INFO 11:20:26,722 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 11:20:26,723 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
    INFO 11:20:27,050 BaseRecalibrator - The covariates being used here:
    INFO 11:20:27,051 BaseRecalibrator - ReadGroupCovariate
    INFO 11:20:27,052 BaseRecalibrator - QualityScoreCovariate
    INFO 11:20:27,052 BaseRecalibrator - ContextCovariate
    INFO 11:20:27,053 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
    INFO 11:20:27,054 BaseRecalibrator - CycleCovariate
    INFO 11:20:27,064 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO 11:20:27,068 ReadShardBalancer$1 - Done loading BAM index data for next contig
    INFO 11:20:56,840 ProgressMeter - Starting 0.00e+00 30.0 s 49.8 w 100.0% 30.0 s 0.0 s
    INFO 11:21:26,842 ProgressMeter - Starting 0.00e+00 60.0 s 99.4 w 100.0% 60.0 s 0.0 s
    INFO 11:21:56,844 ProgressMeter - Starting 0.00e+00 90.0 s 149.0 w 100.0% 90.0 s 0.0 s
    INFO 11:22:26,846 ProgressMeter - Starting 0.00e+00 120.0 s 198.6 w 100.0% 120.0 s 0.0 s
    INFO 11:22:49,994 ReadShardBalancer$1 - Loading BAM index data for next contig
    INFO 11:22:49,997 BaseRecalibrator - Calculating quantized quality scores...
    INFO 11:22:50,101 BaseRecalibrator - Writing recalibration report...
    INFO 11:22:50,151 BaseRecalibrator - ...done!
    INFO 11:22:50,151 BaseRecalibrator - Processed: 0 reads
    INFO 11:22:50,153 ProgressMeter - done 0.00e+00 2.4 m 237.2 w 100.0% 2.4 m 0.0 s
    INFO 11:22:50,154 ProgressMeter - Total runtime 143.43 secs, 2.39 min, 0.04 hours
    INFO 11:22:51,209 GATKRunReport - Uploaded run statistics report to AWS S3

    Information in output file recal_data.grp:

    :GATKReport.v1.1:5

    :GATKTable:2:18:%s:%s:;

    :GATKTable:Arguments:Recalibration argument collection values used in this run

    Argument                    Value
    binary_tag_name             null
    covariate                   ReadGroupCovariate,QualityScoreCovariate,ContextCovariate,CycleCovariate
    default_platform            null
    deletions_default_quality   45
    force_platform              null
    indels_context_size         3
    insertions_default_quality  45
    low_quality_tail            2
    maximum_cycle_value         500
    mismatches_context_size     2
    mismatches_default_quality  -1
    no_standard_covs            false
    plot_pdf_file               null
    quantizing_levels           16
    recalibration_report        null
    run_without_dbsnp           false
    solid_nocall_strategy       THROW_EXCEPTION
    solid_recal_mode            SET_Q_ZERO

    :GATKTable:3:94:%s:%s:%s:;

    :GATKTable:Quantized:Quality quantization map

    QualityScore  Count  QuantizedScore
    0   0  93
    1   0  93
    2   0  93
    3   0  93
    4   0  93
    5   0  93
    6   0  93
    7   0  93
    8   0  93
    9   0  93
    10  0  93
    11  0  93
    12  0  93
    13  0  93
    14  0  93
    15  0  93
    16  0  93
    17  0  93
    18  0  93
    19  0  93
    20  0  93
    21  0  93
    22  0  93
    23  0  93
    24  0  93
    25  0  93
    26  0  93
    27  0  93
    28  0  93
    29  0  93
    30  0  93
    31  0  93
    32  0  93
    33  0  93
    34  0  93
    35  0  93
    36  0  93
    37  0  93
    38  0  93
    39  0  93
    40  0  93
    41  0  93
    42  0  93
    43  0  93
    44  0  93
    45  0  93
    46  0  93
    47  0  93
    48  0  93
    49  0  93
    50  0  93
    51  0  93
    52  0  93
    53  0  93
    54  0  93
    55  0  93
    56  0  93
    57  0  93
    58  0  93
    59  0  93
    60  0  93
    61  0  93
    62  0  93
    63  0  93
    64  0  93
    65  0  93
    66  0  93
    67  0  93
    68  0  93
    69  0  93
    70  0  93
    71  0  93
    72  0  93
    73  0  93
    74  0  93
    75  0  93
    76  0  93
    77  0  93
    78  0  93
    79  0  79
    80  0  80
    81  0  81
    82  0  82
    83  0  83
    84  0  84
    85  0  85
    86  0  86
    87  0  87
    88  0  88
    89  0  89
    90  0  90
    91  0  91
    92  0  92
    93  0  93

    :GATKTable:6:0:%s:%s:%.4f:%.4f:%d:%.2f:;

    :GATKTable:RecalTable0:

    ReadGroup EventType EmpiricalQuality EstimatedQReported Observations Errors

    :GATKTable:6:0:%s:%s:%s:%.4f:%d:%.2f:;

    :GATKTable:RecalTable1:

    ReadGroup QualityScore EventType EmpiricalQuality Observations Errors

    :GATKTable:8:0:%s:%s:%s:%s:%s:%.4f:%d:%.2f:;

    :GATKTable:RecalTable2:

    ReadGroup QualityScore CovariateValue CovariateName EventType EmpiricalQuality Observations Errors

    Thanks, Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    This is very strange. How big is your BAM file? Can you share it so we can debug this?

  • sir2013sir2013 Posts: 17Member

    Sure, I can share the BAM file. The question is, how would I do that? I have used FileZilla to download the bundle pack you have. Is there a specific folder I should put the file in, and how would you like it named so you can identify it?

    Thanks, Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin
    edited May 2013

    You can upload it to our FTP server. Instructions are here. Just let me know when you have done so and we will start debugging it internally.

    Thank you very much.

  • sir2013sir2013 Posts: 17Member
  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    If you can reproduce the error with a tiny version of your BAM file (which you can create with PrintReads using -L ) then you can just attach your file to this thread, which is optimal.
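
For readers wondering what the -L suggestion looks like in practice, here is a minimal Python sketch that assembles a PrintReads command restricted to a single interval. The interval chr1:1-1000000, the shortened jar and reference paths, and the output name small.bam are placeholders chosen for illustration, not values given in this thread; -L also accepts an interval file.

```python
import shlex


def subset_bam_command(gatk_jar, reference, in_bam, interval, out_bam):
    """Build a GATK 2.x PrintReads command limited to one interval (-L)."""
    return [
        "java", "-jar", gatk_jar,
        "-T", "PrintReads",
        "-R", reference,
        "-I", in_bam,
        "-L", interval,   # hypothetical interval, pick one that reproduces the problem
        "-o", out_bam,
    ]


# All paths below are shortened placeholders, not the real locations.
cmd = subset_bam_command(
    "GenomeAnalysisTK.jar",
    "hg19.fa",
    "1024_D_realignedBam.bam",
    "chr1:1-1000000",
    "small.bam",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell. If BaseRecalibrator still reports 0 reads on the small file, that file is enough to attach.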

  • sir2013sir2013 Posts: 17Member

    I am sorry, I have not gotten to the PrintReads step yet. When you say to use -L, is there an input for that argument? If you could give me an example, I can attach the file.

    Thanks, Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    Never mind, just upload the whole file. 2.1G is fairly small.

  • sir2013sir2013 Posts: 17Member

    OK, the upload seems to be taking forever; it has been saying "uploading" for the past 4 hours. Is there another way to get this to you?

    Thanks, Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    If you have any place to put it, we can download it from our end. But the FTP is the preferred method.

  • sir2013sir2013 Posts: 17Member

    I created a folder under my name, "Sinan", on the FTP server for uploads. There you will see 1024_D_realigned.bam; this BAM file has already gone successfully through RealignerTargetCreator and IndelRealigner. I do hope to hear some good news, because I tried running other BAM files and they were also unsuccessful: 0 reads processed again.

    Thanks, Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin
  • sir2013sir2013 Posts: 17Member

    Hello, I was wondering if there was any update or if a solution has been found to my problem.

    Thanks, Sinan

  • CarneiroCarneiro Posts: 274Administrator, GATK Developer admin

    It seems like your BAM file has MQ 255 reads, that's why they're all being filtered out.
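
One way to check for this yourself is to tally the MAPQ column, which is the fifth field of every SAM alignment record. The sketch below is a rough illustration that assumes the alignments are available as SAM text (for example from `samtools view -h`); the three example records are made up.

```python
from collections import Counter


def mapq_histogram(sam_lines):
    """Count alignment records per MAPQ value (SAM column 5)."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        counts[int(fields[4])] += 1
    return counts


# Made-up records for illustration only.
example = [
    "@HD\tVN:1.4\tSO:coordinate",
    "r1\t0\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t200\t255\t50M\t*\t0\t0\t*\t*",
    "r3\t256\tchr1\t300\t3\t50M\t*\t0\t0\t*\t*",
]
for mapq, n in sorted(mapq_histogram(example).items()):
    print(mapq, n)
```

If nearly every record lands in the 255 bin, the file matches the situation described here.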

  • Mark_DePristoMark_DePristo Posts: 153Administrator, GATK Developer admin

    Yes, the newest GATK will print a more informative message on this problem. It will also be possible to fix it by adding -rf ReassignMappingQuality to the command line. Note that this currently works only in the nightly build and will come out with GATK 2.6.

    -- Mark A. DePristo, Ph.D. Co-Director, Medical and Population Genetics Broad Institute of MIT and Harvard
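
As a small sketch of the suggestion above, the following Python snippet appends the read filter to an existing BaseRecalibrator argument list. The filter name is taken verbatim from the post and may differ in the release you have installed, so treat it as an assumption and check your version's documentation; the file paths are shortened placeholders.

```python
def add_read_filter(args, filter_name):
    """Return a copy of a GATK argument list with '-rf <filter_name>' appended."""
    return list(args) + ["-rf", filter_name]


# Shortened, hypothetical paths; the real command in this thread uses full paths
# and three -knownSites files.
base = [
    "java", "-Xmx8g", "-jar", "GenomeAnalysisTK.jar",
    "-T", "BaseRecalibrator",
    "-I", "1024_D_realignedBam.bam",
    "-R", "hg19.fa",
    "-knownSites", "dbsnp_137.hg19.vcf",
    "-o", "recal_data.grp",
]
# Filter name as quoted in the post above (assumed, not verified per release).
cmd = add_read_filter(base, "ReassignMappingQuality")
print(" ".join(cmd))
```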

  • sir2013sir2013 Posts: 17Member

    Should this have been fixed when I specified -fixMisencodedQuals while running IndelRealigner? I compared the new BAM with the old BAM and could see that the adjustments had been made.

    Thanks, Sinan

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,423Administrator, GATK Developer admin

    Hi Sinan,

    -fixMisencodedQuals is meant to fix a different issue which concerns base qualities, not mapping qualities (see release highlights for 2.3 for more details). Are you still having problems?

    Geraldine Van der Auwera, PhD

  • sir2013sir2013 Posts: 17Member

    Hello,

    I thought I would bring this to your attention regarding the MQ 255 reads. I use STAR to run my alignments, and STAR uses the same MQ annotation as TopHat:

    255 = uniquely mapped
    3 = maps to 2 locations
    2 = maps to 3 locations
    1 = maps to 4-9 locations
    0 = maps to 10 or more locations

    So, as you can see, no real score is being assigned for MQ, whereas bwa does give an actual score. I was wondering whether there is a conversion for all five values, rather than only 255 being converted to 60, so that I can properly process my data through the GATK tools.

    Thanks, Sinan

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,423Administrator, GATK Developer admin

    Hi Sinan,

    The GATK will only consider uniquely mapped reads, so converting the MQ 255 values is the only step necessary. The other reads will be ignored.

    Geraldine Van der Auwera, PhD
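
To make the conversion concrete, here is a minimal Python sketch of what reassigning MQ 255 to 60 amounts to on a single SAM text line. It illustrates the idea only and is not how the GATK filter is implemented; the function name and the example records are hypothetical, and the lookup table simply restates the annotation listed earlier in this thread.

```python
# MQ annotation as described in this thread for STAR and TopHat.
STAR_MAPQ_MEANING = {
    255: "uniquely mapped",
    3: "maps to 2 locations",
    2: "maps to 3 locations",
    1: "maps to 4-9 locations",
    0: "maps to 10 or more locations",
}


def reassign_unique_mapq(sam_line, old=255, new=60):
    """Rewrite MAPQ (SAM column 5) from `old` to `new`; leave everything else alone."""
    if sam_line.startswith("@"):
        return sam_line  # header lines carry no MAPQ
    fields = sam_line.split("\t")
    if len(fields) > 4 and fields[4] == str(old):
        fields[4] = str(new)
    return "\t".join(fields)


# Made-up records for illustration only.
unique = "r1\t0\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*"
multi = "r2\t256\tchr1\t300\t3\t50M\t*\t0\t0\t*\t*"
print(reassign_unique_mapq(unique))
print(reassign_unique_mapq(multi))
```

Only the uniquely mapped record changes. The multi-mapped record keeps its low MQ, consistent with the reply above that only the 255 values need converting.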

  • sir2013sir2013 Posts: 17Member

    OK, thank you very much for all the help. I have finally got the BaseRecalibration step to work. Cheers!

    Sinan
