
BaseRecalibrator Error

bruce01 Posts: 16 Member
edited September 2012 in Ask the GATK team

Hi, I am getting an error (unfortunately with no information about the cause) after running BaseRecalibrator:

java -Xmx4g -jar $tool/GenomeAnalysisTK.jar \
    -T BaseRecalibrator \
    -I $bwa/BAM/s_1.rmdup_readgps.bam \
    -R $bin/Bos_taurus.UMD3.1.66.fa \
    -knownSites $bin/Bos_taurus_UMD_3.1.DBSNP.zero.ordered.bed \
    -o $gatk/recal_rea/recal_data1.grp

I get output to screen of all chromosomes, positions etc followed by the error

chrX_dna:chromosome_chromosome:UMD3.1:X:1:148823899:1, chr1_dna:chromosome_chromosome:UMD3.1:1:1:158337067:1 #####ERROR------------------------------

Can you suggest any reasons for BaseRecalibrator giving up here? I understand the BED file is 0-based, but it was used successfully with the previous incarnation of BaseRecalibrator. I have also tried knownSites:mask,BED, and it has no effect. I have all the necessary read group information, an index for the BAM, and an indexed BED.

Your help is much appreciated.


Best Answer

  • ebanks Posts: 678 mod
    Answer ✓

    Hi Bruce,

    What happens when you run with the test files from our bundle? Basically, I think you are encountering a UserError, but somehow the error messages are not being printed correctly on your machine. I'm not sure how or why this could be, but it's almost certainly an issue on your end that, unfortunately, we won't be able to help you through.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

Answers

  • Mark_DePristo Posts: 153 Administrator, GATK Developer admin

    What's the full error message?

    -- Mark A. DePristo, Ph.D. Co-Director, Medical and Population Genetics Broad Institute of MIT and Harvard

  • bruce01 Posts: 16 Member
    edited September 2012

    Hi Mark,

    it is literally as I have it above: .#####ERROR------------------------------
    and then it quits to the command line.

    Bruce.

  • ebanks Posts: 678 GATK Developer mod

    Hey Bruce,

    Can you please post the entire output (not just the error) of running this command (including the startup logging)?

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT
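One way to collect the entire output for a post like this is to send both standard output and standard error to a single file. The sketch below is only an illustration: `run_logged` is a hypothetical helper name, not part of GATK, and it assumes a POSIX-compatible shell.

```shell
# Run a command and save stdout and stderr together in one log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
# (run_logged is a hypothetical helper, not a GATK tool.)
run_logged() {
    log_file=$1
    shift
    status=0
    # "2>&1" points stderr at the same file as stdout, so startup logging
    # and any error text end up in the log in the order they were written.
    "$@" > "$log_file" 2>&1 || status=$?
    # Hand the command's own exit status back to the caller.
    return "$status"
}
```

For example, `run_logged recal.log java -Xmx4g -jar $tool/GenomeAnalysisTK.jar -T BaseRecalibrator ...` (with the remaining arguments as in the original command) would leave the full run in `recal.log`, which can then be pasted or attached.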

  • bruce01 Posts: 16 Member
    edited September 2012

    Hi Eric,

    the start log:
    INFO 07:49:19,854 HelpFormatter - --------------------------------------------------------------------------------
    INFO 07:49:19,857 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.1-8-g5efb575, Compiled 2012/08/30 14:22:17
    INFO 07:49:19,857 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 07:49:19,858 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 07:49:19,859 HelpFormatter - Program Args: -T BaseRecalibrator -I /home/bmoran/data/endo/BWA/BAM/s_1.rmdup_readgps.bam -R /home/bmoran/bin/Bos_taurus.UMD3.1.66.fa -knownSites /home/bmoran/bin/Bos_taurus_UMD_3.1.DBSNP.zero.ordered.bed -o /home/bmoran/data/endo/GATK/recal_rea/recal_data1.grp
    INFO 07:49:19,859 HelpFormatter - Date/Time: 2012/09/07 07:49:19
    INFO 07:49:19,859 HelpFormatter - --------------------------------------------------------------------------------
    INFO 07:49:19,859 HelpFormatter - --------------------------------------------------------------------------------
    INFO 07:49:19,885 ArgumentTypeDescriptor - Dynamically determined type of /home/bmoran/bin/Bos_taurus_UMD_3.1.DBSNP.zero.ordered.bed to be BED
    INFO 07:49:19,898 GenomeAnalysisEngine - Strictness is SILENT
    INFO 07:49:20,563 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 07:49:20,708 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.14
    INFO 07:49:20,738 RMDTrackBuilder - Loading Tribble index from disk for file /home/bmoran/bin/Bos_taurus_UMD_3.1.DBSNP.zero.ordered.bed

    Then it spits out a lot of positions from the BED file I have (this is a very small sample):

    chrGJ060140.1_dna:scaffold_scaffold:UMD3.1:GJ060140.1:1:10738:1, chrGJ059471.1_dna:scaffold_scaffold:UMD3.1:GJ059471.1:1:10900:1, chrGJ059939.1_dna:scaffold_scaffold:UMD3.1:GJ059939.1:1:10915:1, chrGJ059912.1_dna:scaffold_scaffold:UMD3.1:GJ059912.1:1:10974:1, chrGJ059893.1_dna:scaffold_scaffold:UMD3.1:GJ059893.1:1:11012:1, chrGJ059957.1_dna:scaffold_scaffold:UMD3.1:GJ059957.1:1:11037:1



    then:
    .##### ERROR --------------------------------------------------

    I have run other commands in GATK and errors are usually quite helpful but this has me stumped. I am trying to use a homemade VCF converted from flat files from Ensembl so fingers crossed I get it working.

    Thanks for your help,

    Bruce.

  • bruce01 Posts: 16 Member
    edited September 2012

    It is now doing the same thing from my homemade VCF (parsed from ds_flat files). I used vcf-concat to concatenate sorted per-chromosome VCFs. I index with IGVtools but I then get the output

    INFO 13:45:33,064 HelpFormatter - --------------------------------------------------------------------------------
    INFO 13:45:33,067 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.1-8-g5efb575, Compiled 2012/08/30 14:22:17
    INFO 13:45:33,067 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 13:45:33,067 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 13:45:33,068 HelpFormatter - Program Args: -I /home/bmoran/data/endo/BWA/BAM/s_1.rmdup_readgps.bam -R /home/bmoran/bin/Bos_taurus.UMD3.1.66.fa -T RealignerTargetCreator -o /home/bmoran/data/endo/GATK/rea_recal/s_1.forIndelRealigner.intervals --known /home/bmoran/bin/snps/flat.all.sorted.vcf
    INFO 13:45:33,068 HelpFormatter - Date/Time: 2012/09/07 13:45:33
    INFO 13:45:33,069 HelpFormatter - --------------------------------------------------------------------------------
    INFO 13:45:33,069 HelpFormatter - --------------------------------------------------------------------------------
    INFO 13:45:33,081 ArgumentTypeDescriptor - Dynamically determined type of /home/bmoran/bin/snps/flat.all.sorted.vcf to be VCF
    INFO 13:45:33,085 GenomeAnalysisEngine - Strictness is SILENT
    INFO 13:45:33,524 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO 13:45:33,657 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.13
    INFO 13:45:33,683 RMDTrackBuilder - Loading Tribble index from disk for file /home/bmoran/bin/snps/flat.all.sorted.vcf
    WARN 13:45:33,750 RMDTrackBuilder - Index file /home/bmoran/bin/snps/flat.all.sorted.vcf.idx is out of date (old version), deleting and updating the index file
    INFO 13:45:33,751 RMDTrackBuilder - Creating Tribble index in memory for file /home/bmoran/bin/snps/flat.all.sorted.vcf

    following which it spits out positions and ends with the error as above.

    Bruce.

  • bruce01 Posts: 16 Member

    Hi Eric,

    undoubtedly it is an error on my end, since I had to make my own VCF file after the failure of a BED file that was previously used successfully with another version of your software. I will look at the VCFs in your bundle and determine what my issue might be.

    Many thanks for the support already,

    Bruce.

  • bruce01 Posts: 16 Member

    Just as a follow-up: I had an errant contig in my reference FASTA that was not in my VCF, so that is what the error was trying to tell me.
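A mismatch like the one described in this follow-up can be checked before running GATK by comparing the contig names in the reference with those in the known-sites file. The following is a minimal sketch, not anything GATK provides: the function names are hypothetical, and it assumes a contig name is the FASTA header text between ">" and the first whitespace, and that the VCF or BED file is tab-delimited with the contig in the first column.

```shell
# Contig names from a FASTA file, one per line, sorted.
# Assumption: the name is the header text after ">" up to the first whitespace.
fasta_contigs() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | LC_ALL=C sort -u
}

# Contig names used by the records of a tab-delimited VCF or BED file.
# Lines starting with "#" and lines without a tab (e.g. BED "track" lines)
# are skipped.
variant_contigs() {
    awk -F '\t' '!/^#/ && NF > 1 { print $1 }' "$1" | LC_ALL=C sort -u
}

# Shared helper: run comm with the given column flag on the two name lists.
compare_contigs() {
    ref_names=$(mktemp)
    var_names=$(mktemp)
    fasta_contigs "$2" > "$ref_names"
    variant_contigs "$3" > "$var_names"
    LC_ALL=C comm "$1" "$ref_names" "$var_names"
    rm -f "$ref_names" "$var_names"
}

# Contigs that appear in the reference but in no variant record.
contigs_only_in_reference() {
    compare_contigs -23 "$1" "$2"
}

# Contigs that appear in variant records but not in the reference.
contigs_only_in_variants() {
    compare_contigs -13 "$1" "$2"
}
```

Calling `contigs_only_in_reference /home/bmoran/bin/Bos_taurus.UMD3.1.66.fa /home/bmoran/bin/snps/flat.all.sorted.vcf` would list any reference contigs with no records in the VCF; an empty result from both functions means the two files use the same set of contig names. This compares names only and does not check contig order or lengths.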
