GATK RUNTIME ERROR during AnalyzeCovariates

kjclowers (UW Madison; Posts: 14; Member)

I am getting the following error while trying to recalibrate base quality scores at the AnalyzeCovariates step:

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

org.broadinstitute.sting.utils.R.RScriptExecutorException: RScript exited with 1. Run with -l DEBUG for more info.
    at org.broadinstitute.sting.utils.R.RScriptExecutor.exec(RScriptExecutor.java:174)
    at org.broadinstitute.sting.utils.recalibration.RecalUtils.generatePlots(RecalUtils.java:550)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.generatePlots(AnalyzeCovariates.java:380)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.initialize(AnalyzeCovariates.java:394)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:311)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 2.6-5-gba531bd):
ERROR
ERROR Please check the documentation guide to see if this is a known problem
ERROR If not, please post the error, with stack trace, to the GATK forum
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: RScript exited with 1. Run with -l DEBUG for more info.
ERROR ------------------------------------------------------------------------------------------

Thanks, Katie

Comments

  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    Hi Katie,

    Have you installed gsalib and set up your R environment as instructed in the documentation?

    Geraldine Van der Auwera, PhD

  • kjclowers (UW Madison; Posts: 14; Member)

    Is RStudio only needed for installing the packages? The packages are installed, but I do not have RStudio.

  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    RStudio is essentially just for installing the packages, yes. We recommend that because we know it will install them in the right place to be available to R calls from the shell.

    Geraldine Van der Auwera, PhD

  • kjclowers (UW Madison; Posts: 14; Member)

    Do you suspect the packages were installed incorrectly on my server given my error above? If so, I can have our server administrator try to install them again.

  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    Yes, that's a common cause of problems. One way to know if that's the case is to run the BQSR.R script directly from command line. That's the script that AnalyzeCovariates calls internally to produce the plots; you need to run it on the csv file produced by AnalyzeCovariates in the first part of its run.

    Geraldine Van der Auwera, PhD

  • kjclowers (UW Madison; Posts: 14; Member)
    edited September 2013

    I am unsure how to run it on a specific file, but here is what I did:

    Yes, Katie? ./BQSR.R -KCY30.sorted.RecalData.table
    Output:
    Loading required package: methods
    Error in library(gplots) : there is no package called 'gplots'
    Execution halted
    
  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    That's great -- it's telling you that one of the dependencies (the gplots package) is not installed. I'm not sure why it wasn't installed along with the others but you can have your sysadmin install that for you. Then try running the script again and see if it still complains about any missing packages.

    Geraldine Van der Auwera, PhD
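
Since the script stops at the first missing dependency, it can help to pull the package name out of each failed run automatically. The following is a minimal shell sketch, not part of GATK: `missing_r_packages` is a hypothetical helper name, and the pattern relies only on the message format quoted in this thread ("there is no package called '...'"). In UTF-8 locales R may print typographic quotes around the name, so this assumes the plain quotes shown here (for example when R runs with LC_ALL=C).

```shell
# Sketch: list R packages that a script run reported as missing.
# Reads captured R output on stdin and prints one package name per line.
# Assumes the plain-quote message format shown in this thread:
#   there is no package called 'gplots'
missing_r_packages() {
    grep -o "there is no package called '[^']*'" \
        | sed "s/.*called '\([^']*\)'/\1/" \
        | sort -u
}
```

One way to use it is to redirect the script's output to a file (stderr included) and then run `missing_r_packages < that_file`; an empty result means no such message was found in the output.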

  • kjclowers (UW Madison; Posts: 14; Member)

    Yep, there are more packages missing: reshape.

    Here is my output:

    Loading required package: methods
    Loading required package: gtools
    Loading required package: gdata
    gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

    gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.

    Attaching package: 'gdata'

    The following object is masked from 'package:stats':

        nobs

    The following object is masked from 'package:utils':

        object.size

    Loading required package: caTools
    Loading required package: grid
    Loading required package: KernSmooth
    KernSmooth 2.23 loaded
    Copyright M. P. Wand 1997-2009
    Loading required package: MASS

    Attaching package: 'gplots'

    The following object is masked from 'package:stats':

        lowess

    Error in library("reshape") : there is no package called 'reshape'
    Execution halted

  • kjclowers (UW Madison; Posts: 14; Member)

    I think all of the packages are installed now, and I'm still getting the original error when trying to run Analyze Covariates. When trying to run the BQSR.R script, I now get an error saying my file (the csv file from before recalibration) doesn't exist. I'm guessing I am not passing it correctly in the command line. The script and the table are both in the same directory, so I don't understand why it thinks it doesn't exist.

    ./BQSR.R -KCY30.sorted.RecalData.table
    Output:
    Loading required package: methods
    Loading required package: gtools
    Loading required package: gdata
    gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

    gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.

    Attaching package: 'gdata'

    The following object is masked from 'package:stats':

        nobs

    The following object is masked from 'package:utils':

        object.size

    Loading required package: caTools
    Loading required package: grid
    Loading required package: KernSmooth
    KernSmooth 2.23 loaded
    Copyright M. P. Wand 1997-2009
    Loading required package: MASS

    Attaching package: 'gplots'

    The following object is masked from 'package:stats':

        lowess

    Loading required package: plyr

    Attaching package: 'reshape'

    The following object is masked from 'package:plyr':

        rename, round_any

    Error in file(file, "rt") : cannot open the connection
    Calls: read.csv -> read.table -> file
    In addition: Warning message:
    In file(file, "rt") : cannot open file '-KCY30.sorted.RecalData.table': No such file or directory
    Execution halted

  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    Hmm, make sure you're specifying the path to the file relative to the R script; it might not be looking in the working directory proper.

    Geraldine Van der Auwera, PhD
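
The warning in the failed run shows R trying to open '-KCY30.sorted.RecalData.table', leading hyphen included, which suggests the hyphen was passed as part of the file name rather than as an option marker. A minimal shell sketch of a check to run before invoking the script follows; `resolve_input` is a hypothetical helper name, not part of GATK or R.

```shell
# Sketch: confirm an input file exists before handing it to a script.
# Prints a path that is safe to pass as a positional argument,
# or fails with a message if the file does not exist as typed.
resolve_input() {
    if [ ! -e "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    case "$1" in
        -*) printf './%s\n' "$1" ;;  # keep a real leading '-' from looking like an option
        *)  printf '%s\n' "$1" ;;
    esac
}
```

If the file on disk is named without the hyphen, calling `resolve_input` with the hyphenated name fails immediately, which points at the typo before R is started.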

  • kjclowers (UW Madison; Posts: 14; Member)

    I tried specifying the path to the file and have the same error.

  • kjclowers (UW Madison; Posts: 14; Member)

    Do you have any reason to believe that the error I am getting with Analyze Covariates is because we didn't use RStudio to install the packages? I'm trying to convince my server admin to try it.

  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    I would focus on getting BQSR.R to run; there's no reason why it shouldn't work if it can load all the packages.

    Make sure you're running it on the intermediate file produced by AC, not the recalibration table, btw. You may need to specify the -csv argument to keep the file; we've changed the behavior a couple of times and I forget if we're keeping it or deleting it by default. Have a look here: http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_bqsr_AnalyzeCovariates.html

    Geraldine Van der Auwera, PhD

  • kjclowers (UW Madison; Posts: 14; Member)

    I still cannot get BQSR.R to run, but I found some errors in my input .bam files so that my original Analyze Covariates error is gone. I must not be invoking BQSR.R correctly or am still using the wrong file. Here is my output from BQSR.R:

    ./BQSR.R ~/Sequencing_Projects/BulkSeg/TestingGATK/report.csv
    Loading required package: methods
    Loading required package: gtools
    Loading required package: gdata
    gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.

    gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.

    Attaching package: 'gdata'

    The following object is masked from 'package:stats':

        nobs

    The following object is masked from 'package:utils':

        object.size

    Loading required package: caTools
    Loading required package: grid
    Loading required package: KernSmooth
    KernSmooth 2.23 loaded
    Copyright M. P. Wand 1997-2009
    Loading required package: MASS

    Attaching package: 'gplots'

    The following object is masked from 'package:stats':

        lowess

    Loading required package: plyr

    Attaching package: 'reshape'

    The following object is masked from 'package:plyr':

        rename, round_any

    Error in file(filename, "r", blocking = TRUE) : cannot open the connection
    Calls: gsa.read.gatkreport -> file
    In addition: Warning message:
    In file(filename, "r", blocking = TRUE) : cannot open file 'NA': No such file or directory
    Execution halted

  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    Oh, are you saying that AnalyzeCovariates now runs successfully? Do you get plots then, from the AC run?

    Geraldine Van der Auwera, PhD

  • kjclowers (UW Madison; Posts: 14; Member)
  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    Well that's great :)

    There may be a subtlety to running the BQSR.R script independently -- I must confess I only ever run it via RStudio if I need to do this, so maybe from command-line there is an issue.

    But hey, if you've got AC working properly, you should be ok to continue with your work.

    Geraldine Van der Auwera, PhD
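
Judging only from the traces in this thread, the script appears to read more than one positional argument: the first failure came from read.csv on the first argument, and the later failure came from gsa.read.gatkreport trying to open 'NA', which is consistent with an expected argument not being supplied. The sketch below is a hypothetical wrapper (`run_bqsr_script` is not a GATK tool) that refuses to start unless every argument is given and every input is readable. The argument order (CSV, recalibration table, output PDF) is an assumption and should be checked against the BQSR.R that ships with your GATK version.

```shell
# Sketch: validate arguments before running the plotting script by hand.
# Usage: run_bqsr_script SCRIPT CSV RECAL_TABLE OUT_PDF
# ASSUMPTION: the script takes CSV, recalibration table, and output PDF
# in that order; verify this against your copy of BQSR.R.
run_bqsr_script() {
    if [ "$#" -ne 4 ]; then
        echo "usage: run_bqsr_script SCRIPT CSV RECAL_TABLE OUT_PDF" >&2
        return 2
    fi
    # The first three arguments are inputs and must be readable;
    # the fourth is the output file and need not exist yet.
    for f in "$1" "$2" "$3"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input file: $f" >&2
            return 1
        fi
    done
    Rscript "$1" "$2" "$3" "$4"
}
```

Failing early with a usage message is easier to interpret than R's "cannot open file 'NA'" warning, which only appears after all the packages have loaded.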

  • kjclowers (UW Madison; Posts: 14; Member)

    Thank you very much for your help.

  • kraigrs (Ann Arbor, Michigan, USA; Posts: 8; Member)

    I'm also having trouble running AnalyzeCovariates. I am running this on a cluster, so installing RStudio isn't possible, but I've installed the appropriate packages "ggplot2" and "gsalib" and their dependencies. Where is the location of the BQSR.R file? I think the problem is that this file was not made universally executable on the cluster, so AnalyzeCovariates is having trouble finding it. Where can I find BQSR.R?

  • kraigrs (Ann Arbor, Michigan, USA; Posts: 8; Member)

    Thank you very much! Do you think that if I made this script universally executable, the AnalyzeCovariates walker would function normally, or would you recommend simply running this script separately from AnalyzeCovariates?

  • Geraldine_VdAuwera (Posts: 5,235; Administrator, GSA Member)

    Typically when people have trouble getting the plots I recommend running the script manually because then you get to the R debugging info more easily. I would be surprised if this was actually a permissions problem, but hey, try it and see :)

    Geraldine Van der Auwera, PhD

  • kraigrs (Ann Arbor, Michigan, USA; Posts: 8; Member)

    Works perfectly, I just needed the BQSR.R script. Thank you very much!
