Base Recalibration plots: RScript exited with 1

rb2905 · New York · Posts: 13 · Member

I want to create plots before and after recalibration, but I am getting the error below.

I have checked that the "ggplot2" package, which is required for generating the graphs, is installed, and I have added the path of Rscript to my environment, as confirmed by:

$ which Rscript

/nfs/apps/R/2.15.1/bin/Rscript

ERROR stack trace

org.broadinstitute.sting.utils.R.RScriptExecutorException: RScript exited with 1. Run with -l DEBUG for more info.
    at org.broadinstitute.sting.utils.R.RScriptExecutor.exec(RScriptExecutor.java:174)
    at org.broadinstitute.sting.utils.recalibration.RecalUtils.generatePlots(RecalUtils.java:550)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.generatePlots(AnalyzeCovariates.java:380)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.initialize(AnalyzeCovariates.java:394)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.7-4-g6f46d11):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: RScript exited with 1. Run with -l DEBUG for more info.

Below is the command I am running:

/shares/jre1.7.0_40/bin/java -jar /shares/GenomeAnalysisTK-2.7-4-g6f46d11/GenomeAnalysisTK.jar -T AnalyzeCovariates -R /shares/dbdata/human_g1k_v37.fasta -before /shares/bam_base_recalib/recal_data.table -after /shares/bam_base_recalib/post_recal_data.table -plots /shares/bam_base_recalib/BQSR.pdf

Answers

  • rb2905 · New York · Posts: 13 · Member

    The other required packages, gplots and reshape, were missing from my R installation. After installing them, the script runs from GATK.

  • Geraldine_VdAuwera · Posts: 6,295 · Administrator, GATK Developer

    Great, glad to hear it.

    Geraldine Van der Auwera, PhD

  • rb2905 · New York · Posts: 13 · Member

    Hi Geraldine,

    I wanted to know the purpose of the second pass in recalibration, since it does not recalibrate again, yet when generating the plots we compare both recalibration passes.

    The first step is termed "before recalibration" and the second "after recalibration", but the first step itself is doing the recalibration. So why not compare the data before the first pass with the data after the first pass, which is the one that recalibrates?

  • rb2905 · New York · Posts: 13 · Member

    Yes, it definitely makes sense, but the commands as listed make it a bit hard to understand. In the second pass we are not running any recalibration on a recalibrated BAM file; we are just giving the recalibration table created in the first pass as input, to generate a table of post-recalibration values.

  • Geraldine_VdAuwera · Posts: 6,295 · Administrator, GATK Developer

    Hmm, I see your point. It is actually functionally equivalent to running BaseRecalibrator on the recalibrated file without the BQSR argument. But ideally you should make the plots before generating the recalibrated file, which is why we give this version of the commands. I'll see if I can make this a little clearer in the explanation.

    Geraldine Van der Auwera, PhD
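For readers following along, the "on-the-fly" second pass described above can be sketched as a BaseRecalibrator run on the original BAM with the first-pass table supplied through -BQSR. This is only an illustration: the helper name `second_pass_cmd` is hypothetical, and the reference, known-sites, and interval values are placeholders borrowed from the workshop commands quoted later in this thread. The function prints the command rather than running it, so it can be inspected before use.

```shell
# Hypothetical helper: prints (does not run) a second-pass BaseRecalibrator
# command that applies the first-pass table on the fly with -BQSR while
# reading the ORIGINAL BAM. Reference, known-sites and -L values are
# placeholders taken from the workshop commands in this thread.
second_pass_cmd() {
  bam=$1      # original (not yet recalibrated) BAM
  table=$2    # table written by the first BaseRecalibrator pass
  out=$3      # post-recalibration table to write
  printf 'java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R human_b37_20.fasta -I %s -knownSites dbsnp_b37_20.vcf -knownSites indels_b37_20.vcf -BQSR %s -o %s -L 20\n' "$bam" "$table" "$out"
}
```

Calling `second_pass_cmd realigned_20.bam recal_20.table post_recal_20.table` prints a command whose output table can then be passed to AnalyzeCovariates as -after, without first writing a recalibrated BAM.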

  • rb2905 · New York · Posts: 13 · Member

    The steps for base recalibration are clearer in this link (walkthrough of the Oct 2013 GATK workshop): http://www.broadinstitute.org/gatk/guide/topic?name=tutorials

    There, the second pass uses the recalibrated BAM file produced by PrintReads (recalibrated_20.bam) as its input:

    java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R human_b37_20.fasta -I realigned_20.bam -knownSites dbsnp_b37_20.vcf -knownSites indels_b37_20.vcf -o recal_20.table -L 20

    java -jar GenomeAnalysisTK.jar -T PrintReads -R human_b37_20.fasta -I realigned_20.bam -BQSR recal_20.table -o recalibrated_20.bam -L 20

    java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R human_b37_20.fasta -I recalibrated_20.bam -knownSites dbsnp_b37_20.vcf -knownSites indels_b37_20.vcf -o post_recal_20.table -L 20

    java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R human_b37_20.fasta -before recal_20.table -after post_recal_20.table -plots recalibration_plots.pdf -L 20

    Thanks, Rohan

  • ChrisPatterson · Epilepsy Genomics Center · Posts: 7 · Member

    I'm getting the same error as above, but I'm not clear what you did to fix it exactly. Which R packages does GATK need to run AnalyzeCovariates?

  • Geraldine_VdAuwera · Posts: 6,295 · Administrator, GATK Developer

    The gsalib package from CRAN, along with all its dependencies (ggplot2, gplots, etc.).

    Geraldine Van der Auwera, PhD
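A quick way to check whether the packages named in this thread (gsalib, ggplot2, gplots, reshape) load under the same Rscript that GATK will call is to try loading each one from the command line. The sketch below is an illustration only: the helper name `check_r_packages` and the `RSCRIPT` override variable are hypothetical, and it assumes that `Rscript -e` exits with a non-zero status when `library()` fails.

```shell
# Hypothetical helper: reports which R packages fail to load.
# RSCRIPT may be set to a specific Rscript binary; otherwise the one
# found on PATH is used (the same one `which Rscript` reports).
check_r_packages() {
  rscript=${RSCRIPT:-Rscript}
  missing=""
  for pkg in "$@"; do
    # library() raises an error, and Rscript exits non-zero, if the
    # package cannot be found in this R installation's library paths.
    if ! "$rscript" -e "suppressPackageStartupMessages(library($pkg))" >/dev/null 2>&1; then
      missing="$missing $pkg"
    fi
  done
  echo "missing:$missing"
}
```

For example, `check_r_packages gsalib ggplot2 gplots reshape` prints `missing:` followed by any package names that could not be loaded; an empty list suggests the packages are visible to that Rscript.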

  • FabriceBesnard · Paris · Posts: 17 · Member
    edited April 3

    Hi Geraldine,

    I'm having the same issue:

    I run:

    java -Xmx4g -jar $GATKDir/GenomeAnalysisTK.jar -T AnalyzeCovariates -R /media/sf_SequencingData/Genomes/Briggsae_Genomes/Cbriggsae_WS238/briggsae.WS238.genome_masked.fa -before HK104_1-recal_data-1.table -after HK104_1-post_recal_data-1.table -plots HK104_1-recalibration_plots1.pdf

    I get the following error message:

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    org.broadinstitute.sting.utils.R.RScriptExecutorException: RScript exited with 1. Run with -l DEBUG for more info.
        at org.broadinstitute.sting.utils.R.RScriptExecutor.exec(RScriptExecutor.java:174)
        at org.broadinstitute.sting.utils.recalibration.RecalUtils.generatePlots(RecalUtils.java:548)
        at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.generatePlots(AnalyzeCovariates.java:380)
        at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.initialize(AnalyzeCovariates.java:394)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:121)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: RScript exited with 1. Run with -l DEBUG for more info.
    ERROR ------------------------------------------------------------------------------------------

    I have R installed, as confirmed by running: $ which Rscript, which gives: /usr/bin/Rscript

    Within RStudio, I verified the installed packages by running:

    installed.packages()

    And I see, among all the many packages there, with a correct LibPath:

    ggplot2 "/home/fabrice/R/x86_64-pc-linux-gnu-library/3.0"
    gplots "/home/fabrice/R/x86_64-pc-linux-gnu-library/3.0"
    gsalib "/home/fabrice/R/x86_64-pc-linux-gnu-library/3.0"

    Would you know what's going wrong?

    Finally, I don't get what you mean by saying:

    "You can try to run the BQSR Rscript directly"

    Do you mean, within RStudio, launching a command with recal_data.table & post_recal_data.table to write the plots?

    Thank you ! Fabrice

  • Geraldine_VdAuwera · Posts: 6,295 · Administrator, GATK Developer

    "Do you mean, within RStudio, launching a command with recal_data.table & post_recal_data.table to write the plots?"

    Yes, exactly that. You will get more immediate information about what's going wrong.

    Geraldine Van der Auwera, PhD

  • FabriceBesnard · Paris · Posts: 17 · Member

    Hi,

    Thank you again for your quick reply! I went to GitHub and looked around gatk-protected / protected / gatk-protected / src / main / java / org / broadinstitute / sting / for the Rscript of "analyze covariates" from BQSR, but I did not find what I expected.

    In gatk-protected / protected / gatk-protected / src / main / java / org / broadinstitute / sting / gatk / walkers / bqsr /, I found only .java files. Same thing in: gatk-protected / protected / gatk-protected / src / main / java / org / broadinstitute / sting / utils / recalibration /

    Also, I'm not really sure what to do with R once I get the script. Tell me if I am wrong:
    - Can I save the AnalyzeCovariates script anywhere on my computer, or does it need to go in a dedicated R directory?
    - I start RStudio and source the script using: source("/path/to/the/Rscript")?
    - And then? What arguments should I use? Will the script ask me where my input files are?

    Fabrice

  • Geraldine_VdAuwera · Posts: 6,295 · Administrator, GATK Developer
    edited April 4

    Hi Fabrice, the script you want is here:

    https://github.com/broadgsa/gatk-protected/blob/master/public/gatk-framework/src/main/resources/org/broadinstitute/sting/utils/recalibration/BQSR.R

    You'll need to make AnalyzeCovariates save the intermediate CSV file to your working directory using the --intermediateCsvFile argument. The command you need to run should be in the log of your AnalyzeCovariates run; you may need to run with -l DEBUG (I forget if it's logged by default).

    Geraldine Van der Auwera, PhD
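As a rough sketch of the advice above, the AnalyzeCovariates command can be rerun with --intermediateCsvFile and -l DEBUG so that the CSV is kept and the Rscript invocation appears in the log. The helper name `analyze_covariates_debug_cmd` and the file name reference.fasta are hypothetical placeholders; the function only prints the command. The exact Rscript command line for BQSR.R is not reproduced here and should be copied from the DEBUG log itself.

```shell
# Hypothetical helper: prints (does not run) an AnalyzeCovariates command
# that keeps the intermediate CSV and turns on DEBUG logging, so the
# Rscript call that GATK makes shows up in the log.
# reference.fasta is a placeholder for your own reference file.
analyze_covariates_debug_cmd() {
  before=$1   # first-pass recalibration table
  after=$2    # post-recalibration table
  csv=$3      # where to save the intermediate CSV
  pdf=$4      # plots output file
  printf 'java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R reference.fasta -before %s -after %s -plots %s --intermediateCsvFile %s -l DEBUG\n' "$before" "$after" "$pdf" "$csv"
}
```

After running the printed command, look in the log for the Rscript line that references BQSR.R, then run that line by hand in a terminal to see R's own error message.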