Base Recalibration plots: RScript exited with 1

I want to create plots before and after recalibration, but I am getting the error below.
I have checked that the package "ggplot2", which is required for generating the graphs, is installed, and I have also added the path of Rscript to my environment,
which is confirmed by:
$ which Rscript
/nfs/apps/R/2.15.1/bin/Rscript
ERROR stack trace
org.broadinstitute.sting.utils.R.RScriptExecutorException: RScript exited with 1. Run with -l DEBUG for more info.
at org.broadinstitute.sting.utils.R.RScriptExecutor.exec(RScriptExecutor.java:174)
at org.broadinstitute.sting.utils.recalibration.RecalUtils.generatePlots(RecalUtils.java:550)
at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.generatePlots(AnalyzeCovariates.java:380)
at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.initialize(AnalyzeCovariates.java:394)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)
ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.7-4-g6f46d11):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: RScript exited with 1. Run with -l DEBUG for more info.
Below is the command I am running:
/shares/jre1.7.0_40/bin/java -jar /shares/GenomeAnalysisTK-2.7-4-g6f46d11/GenomeAnalysisTK.jar -T AnalyzeCovariates -R /shares/dbdata/human_g1k_v37.fasta -before /shares/bam_base_recalib/recal_data.table -after /shares/bam_base_recalib/post_recal_data.table -plots /shares/bam_base_recalib/BQSR.pdf
Answers
You can try to run the BQSR Rscript directly, which will help you troubleshoot. To get the Rscript, you'll need to get the source code from our GitHub repository here: https://github.com/broadgsa/gatk-protected
The remaining packages, gplots and reshape, were missing in R. Now the script works from GATK.
Great, glad to hear it.
Hi Geraldine,
I wanted to know the purpose of the second pass in recalibration, since it does not recalibrate again, yet when generating the plots we compare both recalibration passes.
The first step is termed "before recalibration" and the second "after recalibration", but the first step is itself doing the recalibration. So why not compare the state before the first pass with the state after the first pass, which is the one that recalibrates?
@rb2905,
The first step of a recalibration run is the process of calculating the empirical qualities and contrasting them to the existing (reported) qualities in the file, in order to predict what adjustments need to be made. So in order to do the before/after plots, you need to run this on your recalibrated file to find out if the recalibration brought your base qualities closer to reality or not. Make sense?
Yes, it definitely makes sense, but the commands listed make it a bit hard to understand.
In the second pass we are not doing any recalibration on a recalibrated BAM file; we are just giving the recalibration table created in the first pass as input, to generate a table of post-recalibration values.
Hmm, I see your point. It is actually functionally equivalent to running BaseRecalibrator on the recalibrated file without the BQSR argument. But ideally you should make the plots before generating the recalibrated file, which is why we give this version of the commands. I'll see if I can make this a little clearer in the explanation.
That would be great.
Thanks again!
The steps for base recalibration are clearer at this link (walkthrough of the Oct 2013 GATK workshop):
http://www.broadinstitute.org/gatk/guide/topic?name=tutorials
since there the second BaseRecalibrator pass takes the recalibrated BAM file (the output of PrintReads, recalibrated_20.bam) as its input.
java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R human_b37_20.fasta -I realigned_20.bam -knownSites dbsnp_b37_20.vcf -knownSites indels_b37_20.vcf -o recal_20.table -L 20
java -jar GenomeAnalysisTK.jar -T PrintReads -R human_b37_20.fasta -I realigned_20.bam -BQSR recal_20.table -o recalibrated_20.bam -L 20
java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R human_b37_20.fasta -I recalibrated_20.bam -knownSites dbsnp_b37_20.vcf -knownSites indels_b37_20.vcf -o post_recal_20.table -L 20
java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R human_b37_20.fasta -before recal_20.table -after post_recal_20.table -plots recalibration_plots.pdf -L 20
Thanks,
Rohan
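The four steps quoted above can be sketched as a small dry-run script, so it is easier to see which file feeds which step. This is only an illustration: the file names come from the workshop commands, one BAM name (recalibrated_20.bam) is used for both the PrintReads output and the second-pass input, and the variables, the helper functions and the RUN switch are made up for this sketch.

```shell
#!/bin/sh
# Dry-run sketch of the four steps quoted above. File names come from the
# workshop commands; the variables and the RUN switch are illustrative only.
REF=human_b37_20.fasta
KNOWN="-knownSites dbsnp_b37_20.vcf -knownSites indels_b37_20.vcf"
IN_BAM=realigned_20.bam
RECAL_BAM=recalibrated_20.bam
BEFORE=recal_20.table
AFTER=post_recal_20.table
PLOTS=recalibration_plots.pdf

# Print the command; execute it only when RUN=1.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Hypothetical helper: adds the arguments shared by every step.
gatk3() {
    run java -jar GenomeAnalysisTK.jar "$@" -R "$REF" -L 20
}

bqsr_workflow() {
    # 1. First pass: build the recalibration table from the realigned BAM.
    gatk3 -T BaseRecalibrator -I "$IN_BAM" $KNOWN -o "$BEFORE" || return 1
    # 2. Write the recalibrated BAM using that table.
    gatk3 -T PrintReads -I "$IN_BAM" -BQSR "$BEFORE" -o "$RECAL_BAM" || return 1
    # 3. Second pass: same analysis, now on the recalibrated BAM.
    gatk3 -T BaseRecalibrator -I "$RECAL_BAM" $KNOWN -o "$AFTER" || return 1
    # 4. Plot the first-pass table against the second-pass table.
    gatk3 -T AnalyzeCovariates -before "$BEFORE" -after "$AFTER" -plots "$PLOTS"
}

bqsr_workflow
```

Run as is, it only prints the four commands; setting RUN=1 would execute them, which assumes java and GenomeAnalysisTK.jar are available in the working directory.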
I'm getting the same error as above, but I'm not clear what you did to fix it exactly. Which R packages does GATK need to run AnalyzeCovariates?
The gsalib package from CRAN along with all dependencies (ggplot2, gplots, etc.).
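A quick way to see which of the packages named in this thread are visible to the Rscript on your PATH is sketched below. The package list is taken from the posts above and may not be complete for every GATK version; the function name and the output format are made up for illustration. The membership test is the same installed.packages() check suggested elsewhere in this thread.

```shell
#!/bin/sh
# Packages named in this thread; the exact set may differ between GATK versions.
PACKAGES="ggplot2 gplots reshape gsalib"

# Report each package as ok/MISSING according to the Rscript found on PATH.
# Returns 0 if all are present, 1 if any is missing, 2 if Rscript is absent.
check_r_packages() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH" >&2
        return 2
    fi
    missing=0
    for pkg in $PACKAGES; do
        if Rscript -e "quit(save = 'no', status = if ('$pkg' %in% rownames(installed.packages())) 0 else 1)" >/dev/null 2>&1; then
            echo "ok       $pkg"
        else
            echo "MISSING  $pkg"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_r_packages from the same shell you launch GATK from prints one line per package. Checking through Rscript rather than through an interactive R session means the answer comes from the interpreter found on your PATH, whose library paths may differ from those of another R installation.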
Hi Geraldine,
I'm having the same issue:
I run:
java -Xmx4g -jar $GATKDir/GenomeAnalysisTK.jar -T AnalyzeCovariates -R /media/sf_SequencingData/Genomes/Briggsae_Genomes/Cbriggsae_WS238/briggsae.WS238.genome_masked.fa -before HK104_1-recal_data-1.table -after HK104_1-post_recal_data-1.table -plots HK104_1-recalibration_plots1.pdf
I get the following error message:
ERROR ------------------------------------------------------------------------------------------
ERROR stack trace
org.broadinstitute.sting.utils.R.RScriptExecutorException: RScript exited with 1. Run with -l DEBUG for more info.
at org.broadinstitute.sting.utils.R.RScriptExecutor.exec(RScriptExecutor.java:174)
at org.broadinstitute.sting.utils.recalibration.RecalUtils.generatePlots(RecalUtils.java:548)
at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.generatePlots(AnalyzeCovariates.java:380)
at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.initialize(AnalyzeCovariates.java:394)
at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)
ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: RScript exited with 1. Run with -l DEBUG for more info.
ERROR ------------------------------------------------------------------------------------------
I have R installed, as confirmed by running: $ which Rscript, which gives:
/usr/bin/Rscript
Within RStudio, I verified the installed packages by running:
And I see, among all the many packages there, with a correct LibPath:
gplots "/home/fabrice/R/x86_64-pc-linux-gnu-library/3.0"
gsalib "/home/fabrice/R/x86_64-pc-linux-gnu-library/3.0"
Would you know what's going wrong?
Finally, I don't get what you mean by saying:
Do you mean launching a command within RStudio, with recal_data.table and post_recal_data.table, to write the plots?
Thank you!
Fabrice
Yes, exactly that. You will get more immediate information about what's going wrong.
Hi,
Thank you again for your quick reply!
I went to GitHub at this location: gatk-protected / protected / gatk-protected / src / main / java / org / broadinstitute / sting /
and looked around for the Rscript of "analyze covariates" from BQSR, but I did not find what I expected.
I looked in gatk-protected / protected / gatk-protected / src / main / java / org / broadinstitute / sting / gatk / walkers / bqsr /,
but I found only .java files.
Same thing in: gatk-protected / protected / gatk-protected / src / main / java / org / broadinstitute / sting / utils / recalibration /
Also, I'm not really sure what to do with R once I get the script. Tell me if I am wrong:
- Can I save the AnalyzeCovariates script anywhere on my computer, or does it need to go in a dedicated R directory?
- Do I start RStudio and source the script using: > source("/path/to/the/Rscript") ?
- And then? What arguments should I use? Will the script ask me where my input files are?
Fabrice
Hi Fabrice, the script you want is here:
https://github.com/broadgsa/gatk-protected/blob/master/public/gatk-framework/src/main/resources/org/broadinstitute/sting/utils/recalibration/BQSR.R
You'll need to make AnalyzeCovariates save the intermediate CSV file to your working directory using the
--intermediateCsvFile
argument. The command you need to run should be in the log of your AnalyzeCovariates run; you may need to run with -l DEBUG
(I forget if it's logged by default).
I had the same problem and installing the R package reshape worked.
I ran the GATK command normally after installing reshape and it worked flawlessly.
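For the suggestion a couple of posts up (keep the intermediate CSV and look in the log for the Rscript command), a minimal sketch could look like the following. The --intermediateCsvFile and -l DEBUG arguments are the ones named in this thread; the function name, the argument order and the log handling are assumptions made for illustration, and the exact wording of the log line is not known here, so the search is deliberately loose.

```shell
#!/bin/sh
# Sketch: rerun AnalyzeCovariates with DEBUG logging, keep the intermediate
# CSV, and look for the Rscript invocation in the captured log.
# Arguments: reference, before table, after table, output PDF,
#            CSV file to keep, log file to write.
analyze_covariates_debug() {
    status=0
    java -jar "${GATK_JAR:-GenomeAnalysisTK.jar}" -T AnalyzeCovariates \
        -R "$1" -before "$2" -after "$3" -plots "$4" \
        --intermediateCsvFile "$5" -l DEBUG >"$6" 2>&1 || status=$?
    # The log format is not documented in this thread, so search loosely.
    grep -i 'rscript' "$6" || echo "no line mentioning Rscript found in $6"
    return "$status"
}
```

For example, analyze_covariates_debug ref.fasta recal_data.table post_recal_data.table BQSR.pdf BQSR.csv analyze.log would leave BQSR.csv and analyze.log in the working directory. Any Rscript command line that shows up in the log can then be rerun by hand to see R's own error message.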
I can't find BQSR.R at https://github.com/broadgsa/gatk-protected/blob/master/public/gatk-framework/src/main/resources/org/broadinstitute/sting/utils/recalibration/BQSR.R
@xcq
Hi,
This article should help: http://gatkforums.broadinstitute.org/discussion/4294/analyzecovariates-fails-with-error-message-rscript-exited-with-1
-Sheila
I found the R script here - https://github.com/broadgsa/gatk-protected/blob/e91472ddc7d58ace52db0cab4d70a072a918d64c/public/gatk-engine/src/main/resources/org/broadinstitute/gatk/engine/recalibration/BQSR.R
Why don't you start your R scripts with something like this for the required libraries?
if ( ! "" %in% rownames(installed.packages()) ) {
install.packages("")
}
This might make it more comfortable for some users (myself included). I spent some time reading source code before I found this discussion. But now that I have, it works. Thanks!
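Until something like that is built into the script, the same check can be run from the shell before launching GATK. The sketch below wraps the snippet above in a loop; the function name and the CRAN mirror URL are assumptions for illustration, and the package names you pass are up to you (this thread mentions ggplot2, gplots, reshape and gsalib).

```shell
#!/bin/sh
# Assumed CRAN mirror; substitute your own.
CRAN_MIRROR=${CRAN_MIRROR:-https://cloud.r-project.org}

# Install each named package only if installed.packages() does not list it.
install_missing_r_packages() {
    for pkg in "$@"; do
        Rscript -e "if (!('$pkg' %in% rownames(installed.packages()))) install.packages('$pkg', repos = '$CRAN_MIRROR')" || return 1
    done
}
```

For example: install_missing_r_packages ggplot2 gplots reshape gsalib. This needs write access to an R library directory and network access to the mirror.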