The current GATK version is 3.4-0

# BaseRecalibration: Recalibration table is empty

Baylor HealthPosts: 3Member

When running GATK, I am getting "empty" results when running BaseRecalibrator. I didn't see a solution to this when searching.

java -Xmx4g -jar /seqprg/GenomeAnalysisTK-2.4-3-g2a7af43/GenomeAnalysisTK.jar -l INFO -R /Users/bcantarel/projects/refdb/human_g1k_v37.fasta --knownSites /Users/bcantarel/projects/refdb/00-All.vcf -I Sample_cDNA405.bam -T BaseRecalibrator -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o Sample_cDNA405.grp

##### ERROR stack trace

java.lang.IllegalStateException: recalibration tables list is empty
at org.broadinstitute.sting.gatk.executive.Accumulator$StandardAccumulator.finishTraversal(Accumulator.java:129)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:123)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:283)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 2.4-3-g2a7af43):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR MESSAGE: recalibration tables list is empty
##### ERROR ------------------------------------------------------------------------------------------

## Answers

• Baylor HealthPosts: 3Member

Hmm, ok, thanks for the hint. Maybe the error message should say "Low Alignment Rate" or "Too few reads aligning" or something like that... Thanks again!

• Posts: 8,171Administrator, GATK Dev admin

OK, I'll see if we can make the error message clearer.

Geraldine Van der Auwera, PhD

• Baylor HealthPosts: 3Member

On a side note: the problem comes from a bug in BWA (i.e. it produces SAMs without mapped reads). So if anyone was unlucky enough to download BWA 0.7.0, well, it was only up for 1 week, so the issues are likely fixed in the new version.

• Posts: 8,171Administrator, GATK Dev admin

Ah, the aligner taking a vacation would indeed be a problem! Thanks for letting us know what the real problem was. It's very useful for us to hear about the range of issues that can lead to common symptoms like this.

Geraldine Van der Auwera, PhD

• Posts: 61Member ✭✭

I am running into the same problem, and yes, I'm using exome data here. I was using a previous version of BWA (0.6.2) and updated to the most recent one (0.7.3a), but I still have the same problem. It happens in intervals like

GL000200.1 1 187035 + interval_80

The command that gives this error is

'java' '-Xmx4096m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/home/me/tests/gatk/.queue/tmp' '-cp' '/home/me/tools/gatk-protected/dist/Queue.jar' 'org.broadinstitute.sting.gatk.CommandLineGATK' '-T' 'BaseRecalibrator' '-I' '/home/me/tests/gatk/test04.FOSZAW_F2_1.fq.gz.clean.dedup.bam' '-L' '/home/me/tests/gatk/.queue/scatterGather/.qlog/test04.FOSZAW_F2_1.fq.gz.pre_recal.table.covariates-sg/temp_80_of_84/scatter.intervals' '-R' '/home/me/resources/GATKbundle/2.3/b37/human_g1k_v37.fasta' '-DIQ' '-knownSites' '/home/me/resources/GATKbundle/2.3/b37/dbsnp_137.b37.vcf' '-o' '/home/me/tests/gatk/.queue/scatterGather/.qlog/test04.FOSZAW_F2_1.fq.gz.pre_recal.table.covariates-sg/temp_80_of_84/test04.FOSZAW_F2_1.fq.gz.pre_recal.table' '-cov' 'ReadGroupCovariate' '-cov' 'QualityScoreCovariate' '-cov' 'CycleCovariate' '-cov' 'ContextCovariate' '-dP' 'Illumina'

I therefore checked what's in the BAM file, and this is the output:

samtools view /home/me/tests/gatk/test04.FOSZAW_F2_1.fq.gz.clean.dedup.bam GL000200.1:1-187035

FCD03KHACXX:7:1101:5083:90023#GTTGCAAC 163 GL000200.1 20411 0 90M = 20588 267 CATAGGAAATAGTTACCAAGAAATGCAGCAGCTAAACTTGGAAGGAAAGAACTATTGCACAGCCAAAACATTGTACATATCTGATTTAGA GGGEDFBDFDFDEFDI;GGGBGEG=D;@;DGBGGEFHBFFBDBF8D:AD?:<>=AGBGEGC8@DDE=EEE?BD<DEGD=DB@EBAD<DDC
X0:i:5 X1:i:0 MD:Z:13G76 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:1 SM:i:0 XM:i:1 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:5083:90023#GTTGCAAC 83 GL000200.1 20588 0 90M = 20411 -267 TTCCAAAAAGAAGCAGTCATTGAAAAATGCTGACTTATGCATTGCCTCAGGAAAAAAGGTGGCTCTGTTTAATCGACTACTATCCCAGAC ?E@FCFEFEEB@EDEFD?BFEEFGGFFCDFFBAECDFD>EFEGGGBCAB7DGBGGFFCFCGGFFEGEGGGFFFFBDEE6EFFF@FGGGFG X0:i:4 X1:i:1 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:2018:151662#GTTGCAAC 163 GL000200.1 20636 0 90M = 20755 209 AGGAAAAAAGGTGGCTCTGTTTAATCGACTACTATCCCAGACAGTTAGTACCAGATACTTGCACGTAGAAAGAGGTAATTTTCATGCTAG HHHHHHHHHGHFHHHHHHGHHHHHHHHHHHEHHHHHHHHHBEGHDHHHFHHHHHFHCHHGHFHHHFGGFGFGG@FCE5=EGFGEIHHFHE X0:i:4 X1:i:1 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:2018:151662#GTTGCAAC 83 GL000200.1 20755 0 90M = 20636 -209 TTCTTGGATGATGATGGATCAGAAGGAGAAGAATTCACAGTCTGAGATGGCTACATTCATTATGGACAAACAGTCAAACTTGTGTGCTCA H4HHHFHHHHHHHHHHHHHFHHHFHHGHFEHFHFFHHHHHHHHHHHHGHHHFHHHEBHHFHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH X0:i:5 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:9920:195302#GTTGCAAC 65 GL000200.1 29250 0 90M 6 49937260 0 AGACCAGAATTGCACCCATCAAATGCCTCACTCACCATATGTCAGCCCAGAAGACTCTTGCAGTGGTGAGCCAGTCTCTTTATCCACCAA HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHFHHHHHHCHHHHHFFHHHHHHFHHHHHHFHFEFHHEHHHEFFE;GIFFFHBEHHBFHHD X0:i:3 X1:i:3 XA:Z:9,+44481078,90M,0;9,+43322151,90M,0;9,-41877008,90M,1;9,+46921634,90M,1;9,+65982263,90M,1; MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 XT:A:R FCD03KHACXX:7:1101:18445:85126#GTTGCAAC 99 GL000200.1 64231 0 90M = 64292 151 CACTACGCCCGGCTAATTTTTTGTATTTTTAGTAGAGACGGGGTTTCACCGCTTTAGCCGGGATGGTCTCGATCTCCTGACCTCGTGATC 
EEEEEECDCECEEE@EDDDD96,60@@A><CA5DD2.)7/+;-(+,8<86=/=?=?)+:+*'1-9>197B<@3A/@@@8;=@######## X0:i:1 X1:i:1377 MD:Z:51T38 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:1 SM:i:0 XM:i:1 XO:i:0 MQ:i:0 XT:A:U FCD03KHACXX:7:1101:18445:85126#GTTGCAAC 147 GL000200.1 64292 0 90M = 64231 -151 GATGGTCGCGATCGCCTGACCTCGTGATCCGCCCGCCTCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGTCACCGCGCCCGGCCGAG #################################@ACDGDDB4A?DE?6:GA@@B;BFG;GCGCGGEB@E9B;;B7E=FF?GGEDEGBEGG X0:i:72 MD:Z:7T5T58C17 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:3 SM:i:0 XM:i:3 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:15528:62078#GTTGCAAC 99 GL000200.1 72388 0 90M = 72561 263 TAAGTTGTAATGTTTAATTCTTTGAATGTTTCAGTGGGAGCTAGAAATTGGTTTGATATACTTTTTAGTTCAGTTGGAATACTTAACACT HHHHHHHFHHHGHHHHGGHHGHHFGFHGHHHHFGHHHHHHHFFG?HFHFHHEEHHFFDFHEHFGHFDEEEEFFBFGHFHHHHHFGHHHH= X0:i:3 X1:i:0 XA:Z:9,+43365464,90M,0;9,+44524392,90M,0; MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:15528:62078#GTTGCAAC 147 GL000200.1 72561 0 90M = 72388 -263 AAAGAATTGAAAAAAAAAGTGACACAAATTGATATATCACGCAAACTATGTGGTTTTGTATTTTCAACTAATTGCTGAAGAGCACTTATA HGBHGGFGIGEHHGHHHGHHGHHFHHHHHHGHFHHFHHHHBHHHHHHHHHHHGHHGHHHHHHHFHGFHGHHHHEHHHHHHHHHHHHHHHH X0:i:3 X1:i:0 XA:Z:9,-43365637,90M,0;9,-44524565,90M,0; MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:3111:80943#GTTGCAAC 163 GL000200.1 107292 0 90M = 107352 150 TTGCTATTGACACAATCATTAACCAGAAATGTTTCAATGATGGATCTGATGAAAAGAAGAAGCTGTACTGTGTCTATGTTGTTATTGGTC HEHHHHHHHHHGHHHGHHHHHHHHHHHHHHHCHHGHHHHHHHHHHHHHHHHHHHHHGHHGHHHHHHGHHHDFDGGFFFFBGEGFGBDEEE X0:i:8 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:3111:80943#GTTGCAAC 83 GL000200.1 107352 0 90M = 107292 -150 
AGCTGTACTGTGTCTATGTTGTTATTGGTCAAAAGAGATCCACTGTTGCCCAGTTGGTGAAGAGACTTACGATGCAGATGCCATGAATTA FFGAGHDHHHHHHFHFHHHHHFHHECFHHEHHHHHHHHHHEBEGFHH@DEHHHHHHHHHGHHHHBHHHHHHFFHHHHHHHHHHHHHHHHH X0:i:8 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:6010:24148#GTTGCAAC 99 GL000200.1 107805 0 90M = 107947 232 ATCTTCTTGGAAACAGAATTGTTCTACAAAGGTATCCACCCTGCCATTAATGTCGGTCTGTCTGTGTCTCGTGTCAGATCTGCTGCCCAA HHHGHGHHFHBHHHHHEHHHHHHHCDFFGDE?CFDFDFDFEEFHHHH@HHDG;AFA6D>C9A?<<44<9>D?B>DFDD6F?4EDEECFEC X0:i:6 X1:i:1 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1101:6010:24148#GTTGCAAC 147 GL000200.1 107947 0 90M = 107805 -232 ATCATGAGGTCACCACTTTTGCCCAGTTCAGTTCTGACCTCGATGCTGCCACTCAACAACTTTTGAGTTGTGGTGTGTGTCTAACTGAGT DFFG8GEGGF@@@@<6/8DDHDHEGIGGGG@EEEEF<FBFHHBGHHGHGEFBGGDGFGEGHHHEHHHGCHGHHFHHHHHHHFGFGFHHHF X0:i:7 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1102:10841:11769#GTTGCAAC 99 GL000200.1 134050 0 90M = 134258 298 TTGATGACCTCCCCTTTTCCCAGGTCAAAGGAGAATTTGTCCTTGCGATCCACACTGGAGTCAAACTTTGTGCCCTCTAACAGCCAGCCA HHGHHHHHGHHHHGHHHHGHHGHHHHGHHHGHHHEHHHHHHHHHHHHFBHHFFHFEGEFGHFHHEHBDEFCEGEEHH@GHGHHFFCHHHE X0:i:4 X1:i:4 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1102:10841:11769#GTTGCAAC 147 GL000200.1 134258 0 90M = 134050 -298 GCAGCGGCACCGGCTGCGCCGCACTCTCGGTCGCCTTCATCTCCTTGGCTGTCATCTCTGCGTGGCGCGAAATTTTTCCGGGAGATGGCG ;38<4F?FCFD7DD=DCA5D<C<<@E?GHFHHGHGGFGGGG;GGFGDDGAGFGIEECEECGFGGCG>HHHHHGEHFHHHGHGHHHHHHHH X0:i:4 X1:i:3 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R FCD03KHACXX:7:1102:6709:6487#GTTGCAAC 99 GL000200.1 173813 0 90M = 173961 238 
TGGCACCCTGCAAATAAACACCTCTTTTCTCCTGCTGCAAACCTTGGTGTGGGTGTTTGGCCTGACTGCGCTGGGCAGGCAGACCCAGCT FGGGE?GGBGGGGGGGGGGGGCGFGGGGGGGGGGFFCGGBGFGGGGGEGF>FFCFGFFGFEEGGEFGG?BFDDD3EE@6EEFFFEGDFGA X0:i:6 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R
FCD03KHACXX:7:1102:6709:6487#GTTGCAAC 147 GL000200.1 173961 0 90M = 173813 -238 TATAAATTCCAGGCTGGGCAGAGTGGCTCACACCTGTAATCCTAGCACTTTGGGAGGCCGAAGCTGGTGGATCACCTGAGGTCAGGAAGT @DA8DEEEG=DCGGGDDC88FGGFBEGGGHG:GGGEEDDBFFGGGBAGGG7HHGHHFHGFGEG@FGGGDEHHFEEHHGHHHDDFHHHHHH X0:i:6 X1:i:0 MD:Z:90 PG:Z:MarkDuplicates RG:Z:FOSZAW_F2_1.fq.gz_bwa XG:i:0 AM:i:0 NM:i:0 SM:i:0 XM:i:0 XO:i:0 MQ:i:0 XT:A:R

Is there any chance of avoiding this error? Maybe by not processing extra-chromosomal regions? Thanks for any help you might offer!

Francesco

• Posts: 8,171Administrator, GATK Dev admin

Hi Francesco,

Are you passing the intervals list of capture targets? If you're working with exome data, that is recommended. And if you find that this problem specifically occurs with non-chromosome contigs, then you just don't include those contigs in your intervals list.

Geraldine Van der Auwera, PhD

• Posts: 61Member ✭✭

Thanks Geraldine, that actually solved the problem :-) I was only passing the target intervals during the calling process, and not during the data processing pipeline. It completely makes sense, although I might lose just a bit of information in the recalibrated BAM file. Thanks for your help!

• Posts: 3Member

Geraldine, I am running into a similar issue, except that the recalibrator seems to run with proper screen output, but my table always remains zero bytes.
My command is

java -Xmx5000m -jar GenomeAnalysisTK.jar -T BaseRecalibrator -I LS1.clean.dedup.bam -R ucsc.hg19.fasta -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -knownSites dbsnp_137.hg19.vcf -knownSites hapmap_3.3.hg19.vcf -knownSites 1000G_omni2.5.hg19.vcf -knownSites Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites 1000G_phase1.indels.hg19.vcf -o LS1.pre_recal.table

and the screen output seems right (below). However, nothing was written to the file "LS1.pre_recal.table", and no error message was reported. What could possibly have gone wrong with my alignment? I used bwa version 0.6.2, and I had already completed indel realignment and duplicate marking before I got here. Please help!

Date/Time: 2013/05/08 17:57:32
INFO 17:57:32,951 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:57:32,951 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:57:32,960 ArgumentTypeDescriptor - Dynamically determined type of dbsnp_137.hg19.vcf to be VCF
INFO 17:57:32,961 ArgumentTypeDescriptor - Dynamically determined type of hapmap_3.3.hg19.vcf to be VCF
INFO 17:57:32,962 ArgumentTypeDescriptor - Dynamically determined type of 1000G_omni2.5.hg19.vcf to be VCF
INFO 17:57:32,963 ArgumentTypeDescriptor - Dynamically determined type of Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
INFO 17:57:32,964 ArgumentTypeDescriptor - Dynamically determined type of 1000G_phase1.indels.hg19.vcf to be VCF
INFO 17:57:32,997 GenomeAnalysisEngine - Strictness is SILENT
INFO 17:57:33,061 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 17:57:33,065 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 17:57:33,076 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 17:57:33,084 RMDTrackBuilder - Loading Tribble index from disk for file dbsnp_137.hg19.vcf
INFO 17:57:33,171 RMDTrackBuilder - Loading Tribble index from disk for file hapmap_3.3.hg19.vcf
INFO 17:57:33,190 RMDTrackBuilder - Loading Tribble index from disk for file 1000G_omni2.5.hg19.vcf
INFO 17:57:33,204 RMDTrackBuilder - Loading Tribble index from disk for file Mills_and_1000G_gold_standard.indels.hg19.vcf
INFO 17:57:33,222 RMDTrackBuilder - Loading Tribble index from disk for file 1000G_phase1.indels.hg19.vcf
INFO 17:57:33,271 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO 17:57:33,274 GenomeAnalysisEngine - Done creating shard strategy
INFO 17:57:33,274 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 17:57:33,274 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
INFO 17:57:33,328 BaseRecalibrator - The covariates being used here:
INFO 17:57:33,328 BaseRecalibrator - ReadGroupCovariate
INFO 17:57:33,328 BaseRecalibrator - QualityScoreCovariate
INFO 17:57:33,328 BaseRecalibrator - ContextCovariate
INFO 17:57:33,328 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
INFO 17:57:33,329 BaseRecalibrator - CycleCovariate
INFO 17:57:33,330 ReadShardBalancer$1 - Loading BAM index data for next contig

##### ERROR ------------------------------------------------------------------------------------------

I have used RealignerTargetCreator and IndelRealigner without any issues and have gotten the correct output I need. But for some reason, at the BaseRecalibrator step, I am getting this error. Could someone please help me troubleshoot this?

Thanks,
Sinan

Geraldine Van der Auwera, PhD

• Posts: 17Member

@Geraldine_VdAuwera It is working properly now.

Thanks,
Sinan

• Posts: 17Member
edited May 2013

Hello again. For some reason, when running the BaseRecalibration step I am getting zero processed reads, which is quite puzzling. Do you have any idea why this is occurring? I do get an output file, but it contains no recalibration information.

Here is my command:
java -Xmx8g -jar /home/sir2013/GATK/GenomeAnalysisTK.jar -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf --validation_strictness STRICT -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o recal_data.grp


running screen:
INFO  15:10:40,075 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:10:40,077 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO  15:10:40,077 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO  15:10:40,082 HelpFormatter - Program Args: -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov ContextCovariate -o recal_data.grp
INFO  15:10:40,082 HelpFormatter - Date/Time: 2013/05/14 15:10:40
INFO  15:10:40,082 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:10:40,082 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:10:40,104 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf to be VCF
INFO  15:10:40,115 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
INFO  15:10:40,127 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf to be VCF
INFO  15:10:42,750 GenomeAnalysisEngine - Strictness is SILENT
INFO  15:10:43,025 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO  15:10:43,033 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO  15:10:43,051 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO  15:10:43,093 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf
INFO  15:10:43,407 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf
INFO  15:10:44,262 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf
INFO  15:10:44,588 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO  15:10:44,599 GenomeAnalysisEngine - Done creating shard strategy
INFO  15:10:44,600 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO  15:10:44,730 BaseRecalibrator - The covariates being used here:
INFO  15:10:44,731 BaseRecalibrator -  QualityScoreCovariate
INFO  15:10:44,732 BaseRecalibrator -  ContextCovariate
INFO  15:10:44,732 ContextCovariate -   Context sizes: base substitution model 2, indel substitution model 3
INFO  15:10:44,733 BaseRecalibrator -  CycleCovariate
INFO  15:10:44,738 ReadShardBalancer$1 - Loading BAM index data for next contig
INFO  15:10:44,741 ReadShardBalancer$1 - Done loading BAM index data for next contig
INFO  15:11:14,658 ProgressMeter -        Starting        0.00e+00   30.0 s       49.7 w    100.0%        30.0 s     0.0 s
INFO  15:11:44,662 ProgressMeter -        Starting        0.00e+00   60.0 s       99.3 w    100.0%        60.0 s     0.0 s
INFO  15:12:15,185 ProgressMeter -        Starting        0.00e+00   90.0 s      149.8 w    100.0%        90.0 s     0.0 s
INFO  15:12:45,186 ProgressMeter -        Starting        0.00e+00  120.0 s      199.4 w    100.0%       120.0 s     0.0 s
INFO  15:13:15,189 ProgressMeter -        Starting        0.00e+00    2.5 m      249.0 w    100.0%         2.5 m     0.0 s
INFO  15:13:45,191 ProgressMeter -        Starting        0.00e+00    3.0 m      298.6 w    100.0%         3.0 m     0.0 s
INFO  15:14:15,193 ProgressMeter -        Starting        0.00e+00    3.5 m      348.2 w    100.0%         3.5 m     0.0 s
INFO  15:14:44,687 ReadShardBalancer$1 - Loading BAM index data for next contig
INFO  15:14:44,692 BaseRecalibrator - Calculating quantized quality scores...
INFO  15:14:45,195 ProgressMeter -        Starting        0.00e+00    4.0 m      397.8 w    100.0%         4.0 m     0.0 s
INFO  15:14:45,576 BaseRecalibrator - Writing recalibration report...
INFO  15:14:46,197 BaseRecalibrator - ...done!
**INFO  15:14:46,200 BaseRecalibrator - Processed: 0 reads**
INFO  15:14:46,209 ProgressMeter -            done        0.00e+00    4.0 m      399.5 w    100.0%         4.0 m     0.0 s
INFO  15:14:46,216 ProgressMeter - Total runtime 241.62 secs, 4.03 min, 0.07 hours
INFO  15:14:47,683 GATKRunReport - Uploaded run statistics report to AWS S3

The output file contains:

# :GATKReport.v1.1:5

# :GATKTable:2:18:%s:%s:;

# :GATKTable:Arguments:Recalibration argument collection values used in this run

Argument Value
binary_tag_name null
covariate ReadGroupCovariate,QualityScoreCovariate,ContextCovariate,CycleCovariate
default_platform null
deletions_default_quality 45
force_platform null
indels_context_size 3
insertions_default_quality 45
low_quality_tail 2
maximum_cycle_value 500
mismatches_context_size 2
mismatches_default_quality -1
no_standard_covs false
plot_pdf_file null
quantizing_levels 16
recalibration_report null
run_without_dbsnp false
solid_nocall_strategy THROW_EXCEPTION
solid_recal_mode SET_Q_ZERO

# :GATKTable:3:94:%s:%s:%s:;

# :GATKTable:Quantized:Quality quantization map

QualityScore Count QuantizedScore
0 0 93
1 0 93
2 0 93
3 0 93
4 0 93
5 0 93
6 0 93
7 0 93
8 0 93
9 0 93
10 0 93
11 0 93
12 0 93
13 0 93
14 0 93
15 0 93
16 0 93
17 0 93
18 0 93
19 0 93
20 0 93
21 0 93
22 0 93
23 0 93
24 0 93
25 0 93
26 0 93
27 0 93
28 0 93
29 0 93
30 0 93
31 0 93
32 0 93
33 0 93
34 0 93
35 0 93
36 0 93
37 0 93
38 0 93
39 0 93
40 0 93
41 0 93
42 0 93
43 0 93
44 0 93
45 0 93
46 0 93
47 0 93
48 0 93
49 0 93
50 0 93
51 0 93
52 0 93
53 0 93
54 0 93
55 0 93
56 0 93
57 0 93
58 0 93
59 0 93
60 0 93
61 0 93
62 0 93
63 0 93
64 0 93
65 0 93
66 0 93
67 0 93
68 0 93
69 0 93
70 0 93
71 0 93
72 0 93
73 0 93
74 0 93
75 0 93
76 0 93
77 0 93
78 0 93
79 0 79
80 0 80
81 0 81
82 0 82
83 0 83
84 0 84
85 0 85
86 0 86
87 0 87
88 0 88
89 0 89
90 0 90
91 0 91
92 0 92
93 0 93

# :GATKTable:6:0:%s:%s:%.4f:%.4f:%d:%.2f:;

# :GATKTable:RecalTable0:

ReadGroup EventType EmpiricalQuality EstimatedQReported Observations Errors

# :GATKTable:6:0:%s:%s:%s:%.4f:%d:%.2f:;

# :GATKTable:RecalTable1:

ReadGroup QualityScore EventType EmpiricalQuality Observations Errors

# :GATKTable:8:0:%s:%s:%s:%s:%s:%.4f:%d:%.2f:;

# :GATKTable:RecalTable2:

ReadGroup QualityScore CovariateValue CovariateName EventType EmpiricalQuality Observations Errors

I do apologize for the long post in the forum. I just don't understand why no errors are being reported and yet no recalibration is being performed.

Thanks,
Sinan

Post edited by Mark_DePristo on

• Posts: 274Administrator, GATK Dev admin

Can you check if your BAM file has any reads? Sounds silly, but it could be something as simple as that.

Also, you don't need to specify the -cov parameters. Those are the default covariates, and if you specify them like that, I am afraid it may be confusing the tool. Can you remove those parameters and check if it works? (I'll issue a bug report if that's the case.)

• Posts: 17Member

I know the BAM file is not empty because for the IndelRealigner step I had to have the quality scores fixed by using -fixMisencodedQuals, and I cross-checked the result against the original BAM file to see that the scores were actually adjusted accordingly.
Unfortunately, I got the same output.

Command Line Code:
java -Xmx8g -jar /home/sir2013/GATK/GenomeAnalysisTK.jar -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -o recal_data.grp

Running Script Output:
INFO 11:20:21,953 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:20:21,964 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO 11:20:21,965 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 11:20:21,965 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 11:20:21,978 HelpFormatter - Program Args: -T BaseRecalibrator -I 1024_D_realignedBam.bam -R /pbtech_mounts/fdlab_store003/fdlab/genomes/human/hg19/indexes/star/hg19.fa -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf -knownSites /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf -o recal_data.grp
INFO 11:20:21,979 HelpFormatter - Date/Time: 2013/05/16 11:20:21
INFO 11:20:21,980 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:20:21,980 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:20:22,072 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf to be VCF
INFO 11:20:22,090 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF
INFO 11:20:22,115 ArgumentTypeDescriptor - Dynamically determined type of /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf to be VCF
INFO 11:20:23,555 GenomeAnalysisEngine - Strictness is SILENT
INFO 11:20:23,845 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 11:20:23,852 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 11:20:23,899 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.04
INFO 11:20:23,947 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/dbsnp_137.hg19.vcf
INFO 11:20:24,363 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/Mills_and_1000G_gold_standard.indels.hg19.vcf
INFO 11:20:24,653 RMDTrackBuilder - Loading Tribble index from disk for file /pbtech_mounts/homesA/asboner/asboner_scratch/hg19/prostate_samples/resources/1000G_phase1.indels.hg19.vcf
INFO 11:20:26,702 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO 11:20:26,721 GenomeAnalysisEngine - Done creating shard strategy
INFO 11:20:26,722 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 11:20:26,723 ProgressMeter - Location processed.reads runtime per.1M.reads completed total.runtime remaining
INFO 11:20:27,050 BaseRecalibrator - The covariates being used here:
INFO 11:20:27,051 BaseRecalibrator - ReadGroupCovariate
INFO 11:20:27,052 BaseRecalibrator - QualityScoreCovariate
INFO 11:20:27,052 BaseRecalibrator - ContextCovariate
INFO 11:20:27,053 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
INFO 11:20:27,054 BaseRecalibrator - CycleCovariate
INFO 11:20:27,064 ReadShardBalancer$1 - Loading BAM index data for next contig
INFO 11:20:27,068 ReadShardBalancer$1 - Done loading BAM index data for next contig
INFO 11:20:56,840 ProgressMeter - Starting 0.00e+00 30.0 s 49.8 w 100.0% 30.0 s 0.0 s
INFO 11:21:26,842 ProgressMeter - Starting 0.00e+00 60.0 s 99.4 w 100.0% 60.0 s 0.0 s
INFO 11:21:56,844 ProgressMeter - Starting 0.00e+00 90.0 s 149.0 w 100.0% 90.0 s 0.0 s
INFO 11:22:26,846 ProgressMeter - Starting 0.00e+00 120.0 s 198.6 w 100.0% 120.0 s 0.0 s
INFO 11:22:49,994 ReadShardBalancer$1 - Loading BAM index data for next contig
INFO 11:22:49,997 BaseRecalibrator - Calculating quantized quality scores...
INFO 11:22:50,101 BaseRecalibrator - Writing recalibration report...
INFO 11:22:50,151 BaseRecalibrator - ...done!
INFO 11:22:50,151 BaseRecalibrator - Processed: 0 reads
INFO 11:22:50,153 ProgressMeter - done 0.00e+00 2.4 m 237.2 w 100.0% 2.4 m 0.0 s
INFO 11:22:50,154 ProgressMeter - Total runtime 143.43 secs, 2.39 min, 0.04 hours
INFO 11:22:51,209 GATKRunReport - Uploaded run statistics report to AWS S3

Information in output file recal_data.grp:

# :GATKTable:Arguments:Recalibration argument collection values used in this run

Argument Value
binary_tag_name null
default_platform null
deletions_default_quality 45
force_platform null
indels_context_size 3
insertions_default_quality 45
low_quality_tail 2
maximum_cycle_value 500
mismatches_context_size 2
mismatches_default_quality -1
no_standard_covs false
plot_pdf_file null
quantizing_levels 16
recalibration_report null
run_without_dbsnp false
solid_nocall_strategy THROW_EXCEPTION
solid_recal_mode SET_Q_ZERO

# :GATKTable:Quantized:Quality quantization map

QualityScore Count QuantizedScore
0 0 93
1 0 93
2 0 93
3 0 93
4 0 93
5 0 93
6 0 93
7 0 93
8 0 93
9 0 93
10 0 93
11 0 93
12 0 93
13 0 93
14 0 93
15 0 93
16 0 93
17 0 93
18 0 93
19 0 93
20 0 93
21 0 93
22 0 93
23 0 93
24 0 93
25 0 93
26 0 93
27 0 93
28 0 93
29 0 93
30 0 93
31 0 93
32 0 93
33 0 93
34 0 93
35 0 93
36 0 93
37 0 93
38 0 93
39 0 93
40 0 93
41 0 93
42 0 93
43 0 93
44 0 93
45 0 93
46 0 93
47 0 93
48 0 93
49 0 93
50 0 93
51 0 93
52 0 93
53 0 93
54 0 93
55 0 93
56 0 93
57 0 93
58 0 93
59 0 93
60 0 93
61 0 93
62 0 93
63 0 93
64 0 93
65 0 93
66 0 93
67 0 93
68 0 93
69 0 93
70 0 93
71 0 93
72 0 93
73 0 93
74 0 93
75 0 93
76 0 93
77 0 93
78 0 93
79 0 79
80 0 80
81 0 81
82 0 82
83 0 83
84 0 84
85 0 85
86 0 86
87 0 87
88 0 88
89 0 89
90 0 90
91 0 91
92 0 92
93 0 93

# :GATKTable:RecalTable0:

ReadGroup EventType EmpiricalQuality EstimatedQReported Observations Errors

# :GATKTable:RecalTable1:

ReadGroup QualityScore EventType EmpiricalQuality Observations Errors

# :GATKTable:RecalTable2:

ReadGroup QualityScore CovariateValue CovariateName EventType EmpiricalQuality Observations Errors

Thanks,
Sinan

This is very strange. How big is your BAM file? Can you share it so we can debug this?

• Posts: 17Member

Sure, I can share the BAM file. The question is, how would I do that? I have used FileZilla to download your bundle. Is there a specific folder I should put the file in, and how would you like it named so that you can identify it?

Thanks,
Sinan

edited May 2013

You can upload it to our FTP server. Instructions are here. Just let me know when you have done so and we will start debugging it internally.

Thank you very much.

Post edited by Carneiro on
• Posts: 17Member

Sorry the bam file is about 2.1G

If you can reproduce the error with a tiny version of your BAM file (which you can create with PrintReads using -L ) then you can just attach your file to this thread, which is optimal.

• Posts: 17Member

I am sorry, I have not gotten to the PrintReads step yet. When you say to use -L, what input does that argument take? If you could give me an example, I can create and attach the file.

Thanks,
Sinan

Never mind, just upload the whole file. 2.1G is fairly small.

• Posts: 17Member

OK, the upload seems to be taking forever; it has been saying "uploading" for the past 4 hours. Is there another way to get this to you?

Thanks,
Sinan

If you have any place to put it, we can download it from our end. But the FTP is the preferred method.

• Posts: 17Member

I created a folder under my name, "Sinan", on the FTP server for uploads. There you will see 1024_D_realigned.bam; this BAM file has already successfully gone through RealignerTargetCreator and IndelRealigner. I do hope to hear some good news, because I tried running other BAM files and they were also unsuccessful, again with 0 reads processed.

Thanks,
Sinan

Thanks we will take a look.

• Posts: 17Member

Hello, I was wondering if there was any update or if a solution has been found to my problem.

Thanks,
Sinan

It seems like your BAM file has MQ 255 reads, that's why they're all being filtered out.
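One way to check for this yourself is to tally the mapping quality column of the text output of `samtools view`. The sketch below is only an illustration and is not part of GATK: the function name `mapq_histogram` is made up here, and it assumes tab-separated SAM records in which MAPQ is the fifth column. If nearly every read lands in the 255 bucket, the situation described above is the likely explanation.

```python
from collections import Counter
from typing import Iterable


def mapq_histogram(sam_lines: Iterable[str]) -> Counter:
    """Count MAPQ values (SAM column 5) across alignment records.

    Hypothetical helper for illustration; expects tab-separated SAM text.
    """
    counts: Counter = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        counts[int(fields[4])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Example: samtools view -h input.bam | python mapq_histogram.py
    for mapq, n in sorted(mapq_histogram(sys.stdin).items()):
        print(f"MAPQ {mapq}: {n} reads")
```

Reading from standard input keeps the script independent of any BAM-parsing library; the file name in the usage comment is a placeholder.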

Yes, the newest GATK will print a more informative message about this problem. It will also be possible to fix it by adding -rf ReassignMappingQuality to the command line. Note that this will only work in the nightly build for now and will come out with GATK 2.6.

--
Mark A. DePristo, Ph.D.
Co-Director, Medical and Population Genetics
Broad Institute of MIT and Harvard

• Posts: 17Member

Should this have been fixed when I specified -fixMisencodedQuals while running IndelRealigner? I compared the new BAM with the old BAM and could see that the adjustments had been made.

Thanks,
Sinan

Hi Sinan,

-fixMisencodedQuals is meant to fix a different issue which concerns base qualities, not mapping qualities (see release highlights for 2.3 for more details). Are you still having problems?

Geraldine Van der Auwera, PhD
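To make the distinction concrete: base qualities are the per-base characters in the QUAL column of each read, while mapping quality is a single number per alignment. The sketch below shows the kind of base-quality re-encoding involved when qualities are stored with an offset of 64 instead of the standard 33. It is a generic illustration under that assumption, not GATK's implementation, and the function name `phred64_to_phred33` is hypothetical.

```python
def phred64_to_phred33(quals: str) -> str:
    """Re-encode a base-quality string from offset 64 to offset 33.

    Hypothetical helper for illustration. It only touches base qualities
    (the QUAL column) and has no effect on mapping quality.
    """
    converted = []
    for ch in quals:
        score = ord(ch) - 64  # decode assuming the offset-64 encoding
        if score < 0:
            # A character below '@' cannot be a valid offset-64 quality,
            # so the input is probably already encoded with offset 33.
            raise ValueError(f"{ch!r} is not a valid offset-64 quality character")
        converted.append(chr(score + 33))  # re-encode with offset 33
    return "".join(converted)


if __name__ == "__main__":
    print(phred64_to_phred33("hhhh"))
```

Because the conversion only shifts the characters in the QUAL column, it leaves a MAPQ of 255 untouched, which is why the two problems need separate fixes.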

• Posts: 17Member

Hello,

I thought I would bring this to your attention regarding MQ 255. I use STAR to run my alignments, and just like TopHat, STAR uses the following MQ annotation:

255 = uniquely mapped
3 = maps to 2 locations
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = 10 or more locations

So, as you can see, these are not true mapping quality scores, whereas bwa does assign an actual score. I was wondering whether there is a conversion for the other values as well, not just 255 being converted to 60, so that I can properly process my data through the GATK tools.

Thanks,
Sinan

Hi Sinan,

The GATK will only consider uniquely mapped reads, so converting the MQ 255 values is the only step necessary. The other reads will be ignored.

Geraldine Van der Auwera, PhD
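The conversion discussed here can be sketched on plain SAM text as follows. This is an illustration only and not how GATK implements the read filter: the function name `reassign_mapq` is hypothetical, and the default values 255 and 60 are taken from this thread. Records with any other MAPQ, such as the 0 to 3 values used for multi-mapped reads, are passed through unchanged.

```python
def reassign_mapq(sam_line: str, from_mapq: int = 255, to_mapq: int = 60) -> str:
    """Return the SAM record with MAPQ rewritten when it equals from_mapq.

    Hypothetical helper for illustration. Header lines and records with a
    different MAPQ are returned as-is, minus any trailing newline.
    """
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line  # header line, nothing to change
    fields = line.split("\t")
    # MAPQ is the fifth column of a SAM alignment record.
    if len(fields) >= 11 and int(fields[4]) == from_mapq:
        fields[4] = str(to_mapq)
    return "\t".join(fields)


if __name__ == "__main__":
    import sys

    # Example: samtools view -h input.bam | python reassign_mapq.py
    for record in sys.stdin:
        print(reassign_mapq(record))
```

In practice the -rf ReassignMappingQuality option mentioned earlier in the thread makes a separate script unnecessary; the sketch is only meant to show how small the change to each record is.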

• Posts: 17Member

OK, thank you very much for all the help. I have finally got the BaseRecalibration step to work. Cheers!

Sinan