RNA-SeQC error: Parameters to GenomeLocParser are incorrect

Hi,

I am trying to run RNA-SeQC on reads aligned against the zebrafish reference, using this command:

java -Xmx24g -jar /share/pkg/RNA-SeQC/1.1.7/RNA-SeQC_v1.1.7.jar -r danRer7.fa -t danRer7/ucsc_new.gtf
-n 1000 -s 'sample1|tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam|RNASEQC analysis' -o metrics

After running for a while I get this error:
Running GATK Depth of Coverage Analysis ....
Arguments: -T DepthOfCoverage -R /home/sg15w/WES/Coel/BWAIndex/danRer7.fa -I /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam -o _metrics/sample1/highexpr//perBaseDoC.out -L _metrics/sample1/highexpr/intervals.list -l ERROR
Arguments Array: [-T, DepthOfCoverage, -R, /home/sg15w/WES/Coel/BWAIndex/danRer7.fa, -I, /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam, -o, _metrics/sample1/highexpr//perBaseDoC.out, -L, _metrics/sample1/highexpr/intervals.list, -l, ERROR]
org.broadinstitute.sting.utils.exceptions.UserException$MalformedGenomeLoc: Badly formed genome loc: Parameters to GenomeLocParser are incorrect:The genome loc coordinates 62098192-62098475 exceed the contig size (59938731)
at org.broadinstitute.sting.utils.GenomeLocParser.vglHelper(GenomeLocParser.java:324)
at org.broadinstitute.sting.utils.GenomeLocParser.validateGenomeLoc(GenomeLocParser.java:307)
at org.broadinstitute.sting.utils.GenomeLocParser.createGenomeLoc(GenomeLocParser.java:265)
at org.broadinstitute.sting.utils.GenomeLocParser.parseGenomeLoc(GenomeLocParser.java:389)
at org.broadinstitute.sting.utils.interval.IntervalUtils.intervalFileToList(IntervalUtils.java:139)
at org.broadinstitute.sting.utils.interval.IntervalUtils.parseIntervalArguments(IntervalUtils.java:71)
at org.broadinstitute.sting.commandline.IntervalBinding.getIntervals(IntervalBinding.java:106)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.loadIntervals(GenomeAnalysisEngine.java:616)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.initializeIntervals(GenomeAnalysisEngine.java:583)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:233)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
at org.broadinstitute.cga.rnaseq.gatk.GATKTools.runDoC(GATKTools.java:59)
at org.broadinstitute.cga.rnaseq.PerBaseDoC.runDoC(PerBaseDoC.java:888)
at org.broadinstitute.cga.rnaseq.PerBaseDoC.runDoC(PerBaseDoC.java:858)
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.runMetrics(RNASeqMetrics.java:264)
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.execute(RNASeqMetrics.java:166)
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.main(RNASeqMetrics.java:135)
RNA-SeQC Total Runtime: 115 min

Could you please help me fix this error?

Thanks
Sharvari

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Can you please post the full command line and GATK log output? This is truncated and doesn't include all the information we need.

  • sgujja (Member)

    Hello Geraldine,

    Here's the complete LSF output:

    The output (if any) follows:

    RNA-SeQC 1.1.7 is located as /share/pkg/RNA-SeQC/1.1.7/RNA-SeQC_v1.1.7.jar
    RNA-SeQC v1.1.7 05/14/12
    Creating rRNA Interval List based on given GTF annotations
    Retriving contig names from reference
    contig names in reference: 1133
    Loading GTF for Read Counting
    Converting to refGene
    Transcript objects to RefGen format: 1 s
    java.lang.RuntimeException: No rRNA found in GTF transcript_type field
    at org.broadinstitute.cga.rnaseq.TranscriptList.toRRNAIntervalList(TranscriptList.java:414)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.createRefGeneAndRRNAFiles(RNASeqMetrics.java:1288)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.prepareFiles(RNASeqMetrics.java:191)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.execute(RNASeqMetrics.java:165)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.main(RNASeqMetrics.java:135)
    No information for rRNA available. Continuing without rRNA calculations. (Using the -BWArRNA flag for best results)
    Running IntronicExpressionReadBlock Walker ....
    Arguments: [-T, IntronicExpressionReadBlock, --outfile_metrics, _metrics/sample1/sample1.metrics.tmp.txt, -R, /home/sg15w/WES/Coel/BWAIndex/danRer7.fa, -I, /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam, -refseq, _metrics/refGene.txt, -l, ERROR]
    Warning, bad boundary: NM_001009916
    Finished writing _metrics/sample1/sample1.metrics.tmp.txt.intronReport.txt
    Finished writing _metrics/sample1/sample1.metrics.tmp.txt.intronReport.txt_intronOnly.txt, now creating RPKM values for introns ..
    GATK command result code: 0
    ... GATK CoutReadMetrics Analysis DONE
    CountReadMetricsWalker Runtime: 80 min
    Calculating library complexity for sample1
    Libary Complexity Calculation Time: 1212 s
    Stratifying Transcripts By Expression
    Number of expressed transcripts at this cuttoff: 13163
    Expression file for DoC: _metrics/sample1/lowexpr/sample1.transcripts.list
    Writing DoC per gene into: /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/_metrics/sample1/lowexpr
    Loading transcripts
    Preparing intervals for 1000 transcripts
    Interval Loading: 2 s
    Creating interval list
    Writing intervals from transcript objects
    Transcript objects to interval list conversion: 0 s
    Expression file for DoC: _metrics/sample1/medexpr/sample1.transcripts.list
    Writing DoC per gene into: /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/_metrics/sample1/medexpr
    Loading transcripts
    Preparing intervals for 1000 transcripts
    Interval Loading: 1 s
    Creating interval list
    Writing intervals from transcript objects
    Transcript objects to interval list conversion: 0 s
    Expression file for DoC: _metrics/sample1/highexpr/sample1.transcripts.list
    Writing DoC per gene into: /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/_metrics/sample1/highexpr
    Loading transcripts
    Preparing intervals for 1000 transcripts
    Interval Loading: 1 s
    Creating interval list
    Writing intervals from transcript objects
    Transcript objects to interval list conversion: 0 s
    Running GATK Depth of Coverage Analysis ....
    Arguments: -T DepthOfCoverage -R /home/sg15w/WES/Coel/BWAIndex/danRer7.fa -I /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam -o _metrics/sample1/lowexpr//perBaseDoC.out -L _metrics/sample1/lowexpr/intervals.list -l ERROR
    Arguments Array: [-T, DepthOfCoverage, -R, /home/sg15w/WES/Coel/BWAIndex/danRer7.fa, -I, /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam, -o, _metrics/sample1/lowexpr//perBaseDoC.out, -L, _metrics/sample1/lowexpr/intervals.list, -l, ERROR]
    GATK command result code: 0
    Depth of Coverage run time: 5 min
    ... GATK Depth of Coverage Analysis DONE
    Running GATK Depth of Coverage Analysis ....
    Arguments: -T DepthOfCoverage -R /home/sg15w/WES/Coel/BWAIndex/danRer7.fa -I /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam -o _metrics/sample1/medexpr//perBaseDoC.out -L _metrics/sample1/medexpr/intervals.list -l ERROR
    Arguments Array: [-T, DepthOfCoverage, -R, /home/sg15w/WES/Coel/BWAIndex/danRer7.fa, -I, /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam, -o, _metrics/sample1/medexpr//perBaseDoC.out, -L, _metrics/sample1/medexpr/intervals.list, -l, ERROR]
    GATK command result code: 0
    Depth of Coverage run time: 8 min
    ... GATK Depth of Coverage Analysis DONE
    Running GATK Depth of Coverage Analysis ....
    Arguments: -T DepthOfCoverage -R /home/sg15w/WES/Coel/BWAIndex/danRer7.fa -I /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam -o _metrics/sample1/highexpr//perBaseDoC.out -L _metrics/sample1/highexpr/intervals.list -l ERROR
    Arguments Array: [-T, DepthOfCoverage, -R, /home/sg15w/WES/Coel/BWAIndex/danRer7.fa, -I, /project/umw_michael_czech/BIOIFX-032/alignment/BWA/RealignedBamDir/tumor_aln_Filtered_SortedFixed_Reorder_Clean_MarkDup_AddRG.bam, -o, _metrics/sample1/highexpr//perBaseDoC.out, -L, _metrics/sample1/highexpr/intervals.list, -l, ERROR]
    org.broadinstitute.sting.utils.exceptions.UserException$MalformedGenomeLoc: Badly formed genome loc: Parameters to GenomeLocParser are incorrect:The genome loc coordinates 62098192-62098475 exceed the contig size (59938731)
    at org.broadinstitute.sting.utils.GenomeLocParser.vglHelper(GenomeLocParser.java:324)
    at org.broadinstitute.sting.utils.GenomeLocParser.validateGenomeLoc(GenomeLocParser.java:307)
    at org.broadinstitute.sting.utils.GenomeLocParser.createGenomeLoc(GenomeLocParser.java:265)
    at org.broadinstitute.sting.utils.GenomeLocParser.parseGenomeLoc(GenomeLocParser.java:389)
    at org.broadinstitute.sting.utils.interval.IntervalUtils.intervalFileToList(IntervalUtils.java:139)
    at org.broadinstitute.sting.utils.interval.IntervalUtils.parseIntervalArguments(IntervalUtils.java:71)
    at org.broadinstitute.sting.commandline.IntervalBinding.getIntervals(IntervalBinding.java:106)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.loadIntervals(GenomeAnalysisEngine.java:616)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.initializeIntervals(GenomeAnalysisEngine.java:583)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:233)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
    at org.broadinstitute.cga.rnaseq.gatk.GATKTools.runDoC(GATKTools.java:59)
    at org.broadinstitute.cga.rnaseq.PerBaseDoC.runDoC(PerBaseDoC.java:888)
    at org.broadinstitute.cga.rnaseq.PerBaseDoC.runDoC(PerBaseDoC.java:858)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.runMetrics(RNASeqMetrics.java:264)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.execute(RNASeqMetrics.java:166)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.main(RNASeqMetrics.java:135)
    RNA-SeQC Total Runtime: 115 min

    Thanks for helping
    Sharvari

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Hi there,

    It looks like the package you're using runs an older version of GATK, which unfortunately we can't support. If you can, you should upgrade to the latest version of GATK.

    Aside from that, the error indicates a problem in your intervals file: it contains an interval (62098192-62098475) that lies beyond the end of its contig (59938731 bases) in the reference. You should check that the intervals used at that step are appropriate for the data being processed in that step.
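
One way to check an intervals file against the reference is sketched below. It is a minimal sketch under stated assumptions, not part of RNA-SeQC or GATK: it assumes the intervals file lists one `contig:start-end` entry per line, and that a `.fai` index for the reference is available (for example from `samtools faidx danRer7.fa`). The function names and the contig name `chr1` in the usage note are hypothetical; the log above does not say which contig was involved.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Assumed interval syntax: contig:start-end, one interval per line.
INTERVAL_RE = re.compile(r"^(?P<contig>.+):(?P<start>\d+)-(?P<end>\d+)$")


def read_contig_sizes(fai_lines: Iterable[str]) -> Dict[str, int]:
    """Map contig name to length from the lines of a FASTA index (.fai).

    The first two tab-separated columns of a .fai file are the contig
    name and its length in bases.
    """
    sizes: Dict[str, int] = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            sizes[fields[0]] = int(fields[1])
    return sizes


def find_bad_intervals(
    interval_lines: Iterable[str], sizes: Dict[str, int]
) -> List[Tuple[str, str]]:
    """Return (interval, reason) pairs for intervals that do not fit the reference."""
    problems: List[Tuple[str, str]] = []
    for raw in interval_lines:
        line = raw.strip()
        # Skip blank lines and SAM-style header lines, if any are present.
        if not line or line.startswith("@"):
            continue
        match = INTERVAL_RE.match(line)
        if match is None:
            problems.append((line, "unrecognized interval format"))
            continue
        contig = match.group("contig")
        start = int(match.group("start"))
        end = int(match.group("end"))
        if contig not in sizes:
            problems.append((line, "contig not found in reference"))
        elif start < 1 or start > end or end > sizes[contig]:
            problems.append((line, f"outside contig bounds 1-{sizes[contig]}"))
    return problems


def check_files(fai_path: str, intervals_path: str) -> List[Tuple[str, str]]:
    """Convenience wrapper that reads both files from disk."""
    with open(fai_path) as fai, open(intervals_path) as intervals:
        return find_bad_intervals(intervals, read_contig_sizes(fai))
```

For example, with a contig hypothetically named `chr1` of length 59938731, `find_bad_intervals(["chr1:62098192-62098475"], {"chr1": 59938731})` flags that interval as out of bounds. If many intervals fail this check, a likely cause (though not confirmed in this thread) is that the GTF annotation and the reference FASTA come from different assembly builds.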
