# GATK RUNTIME ERROR during AnalyzeCovariates

I am getting the following error while trying to recalibrate base quality scores at the AnalyzeCovariates step:

##### ERROR stack trace

org.broadinstitute.sting.utils.R.RScriptExecutorException: RScript exited with 1. Run with -l DEBUG for more info.

##### ERROR ------------------------------------------------------------------------------------------

Thanks,
Katie

Hi Katie,

Have you installed gsalib and set up your R environment as instructed in the documentation?

Geraldine Van der Auwera, PhD

• UW Madison · Posts: 14 · Member

Is RStudio only needed for installing the packages? The packages are installed, but I do not have RStudio.

RStudio is essentially just for installing the packages, yes. We recommend that because we know it will install them in the right place to be available to R calls from the shell.

Geraldine Van der Auwera, PhD

• UW Madison · Posts: 14 · Member

Do you suspect the packages were installed incorrectly on my server given my error above? If so, I can have our server administrator try to install them again.

Yes, that's a common cause of problems. One way to find out whether that's the case is to run the BQSR.R script directly from the command line. That's the script that AnalyzeCovariates calls internally to produce the plots; you need to run it on the csv file that AnalyzeCovariates produces in the first part of its run.

Geraldine Van der Auwera, PhD

• UW Madison · Posts: 14 · Member
edited September 2013

I am unsure how to run it on a specific file, but here is what I did:

Yes, Katie? ./BQSR.R -KCY30.sorted.RecalData.table
Output:
Error in library(gplots) : there is no package called 'gplots'
Execution halted

That's great -- it's telling you that one of the dependencies (the gplots package) is not installed. I'm not sure why it wasn't installed along with the others but you can have your sysadmin install that for you. Then try running the script again and see if it still complains about any missing packages.

Geraldine Van der Auwera, PhD
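Rather than discovering missing packages one error at a time, it may help to test all of them in a single pass. The following is a minimal sketch, not an official GATK utility: `missing_r_packages` and the `RSCRIPT` variable are hypothetical names, and the package list passed to it has to come from you (the names mentioned in this thread are gsalib, ggplot2, gplots and reshape; the full list of what BQSR.R needs is not confirmed here). It assumes the `Rscript` it calls is the same one GATK finds on the PATH.

```shell
#!/bin/sh
# Print the name of every R package that cannot be loaded, one per line.
# RSCRIPT is a hypothetical override for pointing at a specific R installation;
# by default the Rscript found on PATH is used.
missing_r_packages() {
    rscript=${RSCRIPT:-Rscript}
    if ! command -v "$rscript" >/dev/null 2>&1; then
        echo "error: $rscript not found on PATH" >&2
        return 2
    fi
    for pkg in "$@"; do
        # requireNamespace() returns FALSE (rather than stopping) when a
        # package is absent, so a non-zero exit status marks it as missing.
        if ! "$rscript" -e "if (!requireNamespace('$pkg', quietly = TRUE)) quit(status = 1)" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}
```

For example, `missing_r_packages gsalib ggplot2 gplots reshape` prints nothing when all four packages load. If it prints names even though your sysadmin installed them, the packages may be in a library directory that this R installation does not search.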

• UW Madison · Posts: 14 · Member

Yep, another package is missing: reshape.

Here is my output:
gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.

Attaching package: 'gdata'

The following object is masked from 'package:stats':

nobs


The following object is masked from 'package:utils':

object.size


Copyright M. P. Wand 1997-2009

Attaching package: 'gplots'

The following object is masked from 'package:stats':

lowess


Error in library("reshape") : there is no package called 'reshape'
Execution halted

• UW Madison · Posts: 14 · Member

I think all of the packages are installed now, and I'm still getting the original error when trying to run AnalyzeCovariates. When I try to run the BQSR.R script, I now get an error saying that my file (the csv file from before recalibration) doesn't exist. I'm guessing I am not passing it correctly on the command line. The script and the table are both in the same directory, so I don't understand why it thinks the file doesn't exist.

./BQSR.R -KCY30.sorted.RecalData.table
Output:
gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.

Attaching package: 'gdata'

The following object is masked from 'package:stats':

nobs


The following object is masked from 'package:utils':

object.size


Copyright M. P. Wand 1997-2009

Attaching package: 'gplots'

The following object is masked from 'package:stats':

lowess


Attaching package: 'reshape'

The following object is masked from 'package:plyr':

rename, round_any


Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file '-KCY30.sorted.RecalData.table': No such file or directory
Execution halted

Hmm, make sure you're specifying the path to the file relative to the R script; it might not be looking in the working directory proper.

Geraldine Van der Auwera, PhD
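One detail in the error above is worth noting: R tried to open a file literally named '-KCY30.sorted.RecalData.table', so the leading hyphen was passed through as part of the file name. A quick pre-flight check on the arguments can catch this kind of problem before R is involved at all. This is only a sketch; `check_inputs` is a hypothetical helper, not part of GATK or BQSR.R.

```shell
#!/bin/sh
# Report whether each argument names a readable file.
# Returns 1 if any argument is missing or unreadable, 0 otherwise.
check_inputs() {
    status=0
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "ok: $f"
        else
            echo "missing: $f"
            status=1
            # If the name only fails because of a leading hyphen, say so.
            stripped=${f#-}
            if [ "$stripped" != "$f" ] && [ -r "$stripped" ]; then
                echo "hint: '$stripped' exists; drop the leading hyphen"
            fi
        fi
    done
    return $status
}
```

Calling `check_inputs -KCY30.sorted.RecalData.table` in the directory that holds the table would report the argument as missing and print the hint, which points at the command line rather than at the working directory.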

• UW Madison · Posts: 14 · Member

I tried specifying the path to the file and got the same error.

• UW Madison · Posts: 14 · Member

Do you have any reason to believe that the error I am getting with AnalyzeCovariates is because we didn't use RStudio to install the packages? I'm trying to convince my server admin to try it.

I would focus on getting BQSR.R to run; there's no reason why it shouldn't work if it can load all the packages.

Make sure you're running it on the intermediate file produced by AC, not the recalibration table, btw. You may need to specify the -csv argument to keep the file; we've changed the behavior a couple of times and I forget if we're keeping it or deleting it by default. Have a look here: http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_bqsr_AnalyzeCovariates.html

Geraldine Van der Auwera, PhD
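To make the advice about `-csv` concrete, here is a hedged sketch of an AnalyzeCovariates invocation that asks for the intermediate csv file explicitly and turns on the debug logging suggested by the original error message. The helper `ac_command` is hypothetical and only prints the command so you can inspect it first. The jar name and all file names are placeholders, and the exact arguments should be checked against the tool documentation for your GATK version.

```shell
#!/bin/sh
# Print (do not run) an AnalyzeCovariates command line.
# Arguments: reference fasta, recalibration table, csv output, plots output.
# All four are placeholders supplied by the caller.
ac_command() {
    ref=$1
    before=$2
    csv=$3
    plots=$4
    # -csv names the intermediate file that BQSR.R consumes;
    # -l DEBUG surfaces the Rscript output that the default log level hides.
    printf 'java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R %s -before %s -csv %s -plots %s -l DEBUG\n' \
        "$ref" "$before" "$csv" "$plots"
}
```

For instance, `ac_command reference.fasta KCY30.sorted.RecalData.table report.csv plots.pdf` prints one line that can be copied into the shell. If the run fails at the plotting step, the csv named after `-csv` is the file to hand to BQSR.R, not the recalibration table.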

• UW Madison · Posts: 14 · Member

I still cannot get BQSR.R to run, but I found some errors in my input .bam files, and my original AnalyzeCovariates error is now gone. I must not be invoking BQSR.R correctly, or I am still using the wrong file. Here is my output from BQSR.R:
./BQSR.R ~/Sequencing_Projects/BulkSeg/TestingGATK/report.csv
gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.

Attaching package: 'gdata'

The following object is masked from 'package:stats':

nobs


The following object is masked from 'package:utils':

object.size


Copyright M. P. Wand 1997-2009

Attaching package: 'gplots'

The following object is masked from 'package:stats':

lowess


Attaching package: 'reshape'

The following object is masked from 'package:plyr':

rename, round_any


Error in file(filename, "r", blocking = TRUE) :
cannot open the connection
Calls: gsa.read.gatkreport -> file
In addition: Warning message:
In file(filename, "r", blocking = TRUE) :
cannot open file 'NA': No such file or directory
Execution halted

Oh, are you saying that AnalyzeCovariates now runs successfully? Do you get plots then, from the AC run?

Geraldine Van der Auwera, PhD

• UW Madison · Posts: 14 · Member

Yes, I get plots now.

Well, that's great.

There may be a subtlety to running the BQSR.R script independently -- I must confess I only ever run it via RStudio if I need to do this, so maybe from command-line there is an issue.

But hey, if you've got AC working properly, you should be ok to continue with your work.

Geraldine Van der Auwera, PhD

• UW Madison · Posts: 14 · Member

Thank you very much for your help.

• Ann Arbor, Michigan, USA · Posts: 8 · Member

I'm also having trouble running AnalyzeCovariates. I am running this on a cluster, so installing RStudio isn't possible, but I've installed the appropriate packages "ggplot2" and "gsalib" and their dependencies. Where is the location of the BQSR.R file? I think the problem is that this file was not made universally executable on the cluster, so AnalyzeCovariates is having trouble finding it. Where can I find BQSR.R?

Geraldine Van der Auwera, PhD

• Ann Arbor, Michigan, USA · Posts: 8 · Member

Thank you very much! Do you think that if I made this script universally executable, the AnalyzeCovariates walker would function normally, or would you recommend simply running this script separately from AnalyzeCovariates?

Typically, when people have trouble getting the plots, I recommend running the script manually, because then you get to the R debugging info more easily. I would be surprised if this was actually a permissions problem, but hey, try it and see.

Geraldine Van der Auwera, PhD

• Ann Arbor, Michigan, USA · Posts: 8 · Member

Works perfect, just needed the BQSR.R script, thank you very much!

• Posts: 6 · Member

I'm having a similar issue with getting the AnalyzeCovariates to make the plots. I wanted to try running the BQSR.R script, but I don't think those github pages exist any more. Is there anywhere else I can find that script? Thanks!