What should I do about this error message from HaplotypeCaller?

Dear Sir,

Below is the output I got when I ran HaplotypeCaller. Could you give me some suggestions? Thank you for your kind help.

INFO 16:39:53,927 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:39:53,931 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-0-g7e26428, Compiled 2015/05/15 03:25:41
INFO 16:39:53,931 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 16:39:53,932 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 16:39:53,938 HelpFormatter - Program Args: -T HaplotypeCaller -R /home/analysis/RNA-SEQ/DATABASE/Human/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -I HP32N-sort-RG.bam -fixMisencodedQuals -stand_call_conf 20 -stand_emit_conf 20 -o HP32N-GATK.vcf
INFO 16:39:53,943 HelpFormatter - Executing as [email protected] on Linux 3.8.0-44-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_80-b15.
INFO 16:39:53,943 HelpFormatter - Date/Time: 2015/06/17 16:39:53
INFO 16:39:53,944 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:39:53,945 HelpFormatter - --------------------------------------------------------------------------------
INFO 16:39:54,895 GenomeAnalysisEngine - Strictness is SILENT
INFO 16:39:55,092 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
INFO 16:39:55,105 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 16:39:55,204 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.10
INFO 16:39:55,214 HCMappingQualityFilter - Filtering out reads with MAPQ < 20
INFO 16:39:55,349 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 16:39:56,406 GenomeAnalysisEngine - Done preparing for traversal
INFO 16:39:56,410 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 16:39:56,411 ProgressMeter - Location | active regions | elapsed | active regions | completed | runtime | runtime
INFO 16:39:56,412 HaplotypeCaller - Disabling physical phasing, which is supported only for reference-model confidence output
INFO 16:39:56,655 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
INFO 16:39:56,656 StrandBiasTest - SAM/BAM data was found. Attempting to use read data to calculate strand bias annotations values.
INFO 16:39:56,682 HaplotypeCaller - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
INFO 16:40:02,485 GATKRunReport - Uploaded run statistics report to AWS S3
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.IllegalStateException: Rod span chr1:1-249250621 isn't contained within the data shard chr1:1-249250621, meaning we wouldn't get all of the data we need
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions$ActiveRegionIterator.<init>(TraverseActiveRegions.java:307)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:271)
at org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions.traverse(TraverseActiveRegions.java:78)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:99)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:315)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 3.4-0-g7e26428):
##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR MESSAGE: Rod span chr1:1-249250621 isn't contained within the data shard chr1:1-249250621, meaning we wouldn't get all of the data we need


  • Sheila, Broad Institute (Member, Broadie admin)


    Can you please post your bam header? I am wondering if your error is related to this thread: http://gatkforums.broadinstitute.org/discussion/5067/haplotypecaller-rod-span-isnt-contained-within-the-data-shard-without-nct-option


  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Hang on -- @motloe, why are you running HC with -fixMisencodedQuals ? This suggests to me you didn't apply our Best Practices for pre-processing your data and there might be something wrong with it. Run Picard ValidateSamFile on your bam to check that it's valid.
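
For context on the question above: in GATK 3, `-fixMisencodedQuals` subtracts 31 from every base quality, which is only appropriate when the qualities are stored with the older Illumina offset of 64 instead of the standard 33. The sketch below is one rough way to guess the offset from a sample of quality strings (for example, column 11 of `samtools view` output). The function name and thresholds are assumptions made for this illustration, and the result is a heuristic, not a validation.

```python
def guess_phred_offset(quality_strings):
    """Guess the ASCII offset of base qualities from sample quality strings.

    Heuristic only (hypothetical helper, not part of GATK or Picard):
      - any character below ';' (ASCII 59) can only occur with offset 33
      - if every character is '@' (ASCII 64) or higher, offset 64 is likely
      - anything in between is ambiguous, so None is returned
    """
    lowest = min(
        (ord(ch) for qual in quality_strings for ch in qual),
        default=None,  # no characters at all: nothing to decide on
    )
    if lowest is None:
        return None
    if lowest < 59:
        return 33
    if lowest >= 64:
        return 64
    return None
```

If a reasonably large sample of reads points to offset 33, the flag is probably not needed; a small sample of uniformly high-quality reads can look like offset 64 by chance, so treat a result of 64 with caution.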

  • motloe (Member)

    Dear Sir,

    Here is the BAM header:
    @HD VN:1.4 SO:coordinate
    @SQ SN:chr1 LN:249250621
    @SQ SN:chr2 LN:243199373
    @SQ SN:chr3 LN:198022430
    @SQ SN:chr4 LN:191154276
    @SQ SN:chr5 LN:180915260
    @SQ SN:chr6 LN:171115067
    @SQ SN:chr7 LN:159138663
    @SQ SN:chr8 LN:146364022
    @SQ SN:chr9 LN:141213431
    @SQ SN:chr10 LN:135534747
    @SQ SN:chr11 LN:135006516
    @SQ SN:chr12 LN:133851895
    @SQ SN:chr13 LN:115169878
    @SQ SN:chr14 LN:107349540
    @SQ SN:chr15 LN:102531392
    @SQ SN:chr16 LN:90354753
    @SQ SN:chr17 LN:81195210
    @SQ SN:chr18 LN:78077248
    @SQ SN:chr19 LN:59128983
    @SQ SN:chr20 LN:63025520
    @SQ SN:chr21 LN:48129895
    @SQ SN:chr22 LN:51304566
    @SQ SN:chrX LN:155270560
    @SQ SN:chrY LN:59373566
    @RG ID:1 PL:illumina PU:HP32N LB:HP32N SM:HP32N
    @PG ID:bowtie2 PN:bowtie2 VN:2.1.0

    I also ran ValidateSamFile, and it found no errors. The output is listed below:

    [Thu Jun 18 06:51:20 CST 2015] Executing as [email protected] on Linux 3.8.0-44-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_80-b15; Picard version: 1.119(d44cdb51745f5e8075c826430a39d8a61f1dd832_1408991805) JdkDeflater
    INFO 2015-06-18 06:52:32 SamFileValidator Validated Read 10,000,000 records. Elapsed time: 00:01:12s. Time for last 10,000,000: 72s. Last read position: chr3:182,638,456
    INFO 2015-06-18 06:53:43 SamFileValidator Validated Read 20,000,000 records. Elapsed time: 00:02:23s. Time for last 10,000,000: 70s. Last read position: chr10:29,933,843
    INFO 2015-06-18 06:54:51 SamFileValidator Validated Read 30,000,000 records. Elapsed time: 00:03:31s. Time for last 10,000,000: 67s. Last read position: chr18:29,058,248
    [Thu Jun 18 06:56:03 CST 2015] picard.sam.ValidateSamFile done. Elapsed time: 4.72 minutes.

    Thank you for your kind help.

  • Sheila, Broad Institute (Member, Broadie admin)


    You can try adding chrM to your BAM file header to see if that works. It worked for this user: http://gatkforums.broadinstitute.org/discussion/5067/haplotypecaller-rod-span-isnt-contained-within-the-data-shard-without-nct-option
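
One way to see whether the BAM header and the reference disagree on contigs, as the chrM suggestion implies, is to compare the `@SQ` names in the header with the names in the reference index. The sketch below is illustrative only: it assumes the header text comes from something like `samtools view -H HP32N-sort-RG.bam` and the index text from a `.fai` file for `genome.fa`. The function names are made up for this example, and the chrM line in the sample index is an assumption about what the reference contains.

```python
def sq_names(sam_header_text):
    """Return contig names from the @SQ lines of a SAM/BAM header, in order."""
    names = []
    for line in sam_header_text.splitlines():
        fields = line.split()  # tolerates tabs or spaces between fields
        if fields and fields[0] == "@SQ":
            for field in fields[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def fai_names(fai_text):
    """Return contig names from a .fai index (first column of each line)."""
    return [line.split()[0] for line in fai_text.splitlines() if line.strip()]


def missing_from_bam(sam_header_text, fai_text):
    """List reference contigs that have no @SQ entry in the BAM header."""
    in_bam = set(sq_names(sam_header_text))
    return [name for name in fai_names(fai_text) if name not in in_bam]


if __name__ == "__main__":
    # Excerpt of the header posted in this thread.
    header = "@HD VN:1.4 SO:coordinate\n@SQ SN:chr1 LN:249250621\n@SQ SN:chr2 LN:243199373\n"
    # Hypothetical index excerpt (name and length columns only; a real .fai has more).
    fai = "chrM\t16571\nchr1\t249250621\nchr2\t243199373\n"
    print(missing_from_bam(header, fai))  # prints ['chrM']
```

Any contig reported here is present in the reference but absent from the BAM header, which is the kind of mismatch the linked thread describes.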


