I'm trying to run the BaseRecalibrator tool on my data and am getting the following error:
INFO 14:58:17,399 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:58:17,400 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.1-13-g1706365, Compiled 2012/10/12 19:21:06
INFO 14:58:17,400 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 14:58:17,400 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 14:58:17,401 HelpFormatter - Program Args: -T BaseRecalibrator -I /home/sheenams/gatk_test/LMG-206.GATKinitialrmdup.srt.bam -R /home/genetics/Genomes/gatk-bundle/human_g1k_v37.fasta -knownSites /home/genetics/Genomes/gatk-bundle/dbsnp_135.b37.vcf -knownSites /home/genetics/Genomes/gatk-bundle/Mills_and_1000G_gold_standard.indels.b37.sites.vcf -knownSites /home/genetics/Genomes/gatk-bundle/1000G_phase1.indels.b37.vcf -o /home/sheenams/gatk_test/LMG-206.recal_data.csv -log /home/sheenams/gatk_test/LMG-206.gatk_log
INFO 14:58:17,401 HelpFormatter - Date/Time: 2012/10/17 14:58:17
INFO 14:58:17,401 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:58:17,401 HelpFormatter - ---------------------------------------------------------------------------------
INFO 14:58:17,407 ArgumentTypeDescriptor - Dynamically determined type of /home/genetics/Genomes/gatk-bundle/dbsnp_135.b37.vcf to be VCF
INFO 14:58:17,409 ArgumentTypeDescriptor - Dynamically determined type of /home/genetics/Genomes/gatk-bundle/Mills_and_1000G_gold_standard.indels.b37.sites.vcf to be VCF
INFO 14:58:17,410 ArgumentTypeDescriptor - Dynamically determined type of /home/genetics/Genomes/gatk-bundle/1000G_phase1.indels.b37.vcf to be VCF
INFO 14:58:17,414 GenomeAnalysisEngine - Strictness is SILENT
INFO 14:58:17,463 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 14:58:17,479 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 14:58:17,487 RMDTrackBuilder - Loading Tribble index from disk for file /home/genetics/Genomes/gatk-bundle/dbsnp_135.b37.vcf
WARN 14:58:17,574 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has UNBOUNDED but standard is A
INFO 14:58:17,575 RMDTrackBuilder - Loading Tribble index from disk for file /home/genetics/Genomes/gatk-bundle/Mills_and_1000G_gold_standard.indels.b37.sites.vcf
WARN 14:58:17,589 VCFStandardHeaderLines$Standards - Repairing standard header line for field GQ because -- type disagree; header has Float but standard is Integer
INFO 14:58:17,590 RMDTrackBuilder - Loading Tribble index from disk for file /home/genetics/Genomes/gatk-bundle/1000G_phase1.indels.b37.vcf
WARN 14:58:17,603 VCFHeader - Found GL format, but no PL field. As the GATK now only manages PL fields internally automatically adding a corresponding PL field to your VCF header
WARN 14:58:17,603 VCFStandardHeaderLines$Standards - Repairing standard header line for field AC because -- count types disagree; header has UNBOUNDED but standard is A -- descriptions disagree; header has 'Alternate Allele Count' but standard is 'Allele count in genotypes, for each ALT allele, in the same order as listed'
WARN 14:58:17,603 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has INTEGER but standard is A -- descriptions disagree; header has 'Global Allele Frequency based on AC/AN' but standard is 'Allele Frequency, for each ALT allele, in the same order as listed'
INFO 14:58:18,093 BaseRecalibrator - The covariates being used here:
INFO 14:58:18,093 BaseRecalibrator - ReadGroupCovariate
INFO 14:58:18,093 BaseRecalibrator - QualityScoreCovariate
INFO 14:58:18,094 BaseRecalibrator - ContextCovariate
INFO 14:58:18,094 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
INFO 14:58:18,094 BaseRecalibrator - CycleCovariate
INFO 14:58:18,136 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL STARTING]
INFO 14:58:18,137 TraversalEngine - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 14:58:35,886 GATKRunReport - Uploaded run statistics report to AWS S3
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Key 2002 is too large for dimension 2 (max is 2001)
    at org.broadinstitute.sting.utils.collections.NestedIntegerArray.put(NestedIntegerArray.java:77)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AdvancedRecalibrationEngine.updateDataForPileupElement(AdvancedRecalibrationEngine.java:97)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:244)
    at org.broadinstitute.sting.gatk.walkers.bqsr.BaseRecalibrator.map(BaseRecalibrator.java:106)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:65)
    at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:18)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:62)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:265)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:93)
I didn't see any other questions in the forum that address this. Can you please guide me on how to fix this error? I'm running GATK 2.1-13.
Thanks,
Sheena
ebanks
Answers
I think I found the mistake causing this error. I had run ReduceReads before the realigner, so BaseRecalibrator was being given a reduced BAM. Moving ReduceReads later in the pipeline got rid of the error. Thanks.
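For anyone who wants to check their own step order, here is a minimal sketch of one ordering consistent with the fix above. It only builds and prints the command lines; it does not run GATK. The assumption that ReduceReads goes last (after PrintReads applies the recalibration) is mine, since the post only says "later in the pipeline". The jar path, file names, and the `gatk` and `pipeline` helpers are all hypothetical; the tool names and flags are standard GATK 2.x options.

```python
# Sketch only: builds GATK 2.x command lines in an order where ReduceReads
# runs after realignment and base recalibration. File names are hypothetical.
GATK = ["java", "-jar", "GenomeAnalysisTK.jar"]  # assumed jar location
REF = "human_g1k_v37.fasta"
KNOWN_SITES = "dbsnp_135.b37.vcf"


def gatk(tool, *args):
    """Return the argument list for one GATK walker invocation."""
    return GATK + ["-T", tool, "-R", REF] + list(args)


def pipeline(sample):
    """Return the commands in the order they would be run for one sample."""
    bam = f"{sample}.bam"
    intervals = f"{sample}.intervals"
    realigned = f"{sample}.realigned.bam"
    recal_table = f"{sample}.recal_data.csv"
    recal_bam = f"{sample}.recal.bam"
    reduced = f"{sample}.reduced.bam"
    return [
        gatk("RealignerTargetCreator", "-I", bam, "-o", intervals),
        gatk("IndelRealigner", "-I", bam, "-targetIntervals", intervals, "-o", realigned),
        gatk("BaseRecalibrator", "-I", realigned, "-knownSites", KNOWN_SITES, "-o", recal_table),
        gatk("PrintReads", "-I", realigned, "-BQSR", recal_table, "-o", recal_bam),
        # Assumption: reduction happens last, on the recalibrated BAM.
        gatk("ReduceReads", "-I", recal_bam, "-o", reduced),
    ]


if __name__ == "__main__":
    for cmd in pipeline("LMG-206"):
        print(" ".join(cmd))
```

The point of the sketch is just that BaseRecalibrator reads the full realigned BAM, never the reduced one.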
Hi Eric
I got a similar error with BaseRecalibrator in GenomeAnalysisTK-2.2-2-gf44cc4e. I also ran Picard's ValidateSamFile, and it reported no errors. The GATK error was:
##### ERROR MESSAGE: Key 2006 is too large for dimension 2 (max is 2001)
What exactly does this error mean? What key is it talking about? And how can I fix it?
Ashu
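On what the message is saying: the stack trace in the first post shows the exception coming from `NestedIntegerArray.put`, called by the recalibration engine. A rough reading, and only an assumption on my part, is that the recalibration table has a fixed size per dimension and one of the integer keys computed for a read fell outside that range. The toy class below is not GATK's code. It is a sketch of a fixed-size multi-dimensional table whose bounds check produces a message of the same shape. The class name and the first two dimension sizes are made up.

```python
class NestedIntArraySketch:
    """Toy fixed-size multi-dimensional table. NOT GATK's implementation."""

    def __init__(self, *dimensions):
        # dimensions[i] is the number of valid keys (0 .. dimensions[i] - 1)
        self.dimensions = dimensions
        self.data = {}

    def put(self, value, *keys):
        # Reject any key that falls outside its preallocated dimension.
        for dim, (key, size) in enumerate(zip(keys, self.dimensions)):
            if key >= size:
                raise IndexError(
                    f"Key {key} is too large for dimension {dim} (max is {size - 1})"
                )
        self.data[keys] = value


if __name__ == "__main__":
    # Sizes 4 and 94 are hypothetical; 2002 gives a max key of 2001 in dimension 2.
    table = NestedIntArraySketch(4, 94, 2002)
    table.put(1, 0, 30, 2001)      # fits
    try:
        table.put(1, 0, 30, 2002)  # one past the largest allowed key
    except IndexError as err:
        print(err)
```

Under that reading, the input BAM contains reads that yield a key value the table was not sized for, which would fit the earlier report that feeding reduced reads to BaseRecalibrator triggered it.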