AnalyzeCovariates error (R)

Rad, CA (Member)
edited May 2014 in Ask the GATK team

Hello

I am trying to generate base recalibration plots using AnalyzeCovariates.

My command is:

java -jar GenomeAnalysisTK.jar \
-T AnalyzeCovariates -R GRCh37-lite.fa \
-before test_data/realigned/SA495-Tumor.sorted.realigned.grp \
-after test_data/realigned/SA495-Tumor.sorted.post_recal.grp2 \
-plots recal_plots.pdf

and it gives me this error:

INFO  17:01:06,050 HelpFormatter - Date/Time: 2014/05/16 17:01:06
INFO  17:01:06,050 HelpFormatter - --------------------------------------------------------------------------------
INFO  17:01:06,050 HelpFormatter - --------------------------------------------------------------------------------
INFO  17:01:06,962 GenomeAnalysisEngine - Strictness is SILENT
INFO  17:01:07,193 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO  17:01:07,317 GenomeAnalysisEngine - Preparing for traversal
INFO  17:01:07,339 GenomeAnalysisEngine - Done preparing for traversal
INFO  17:01:07,340 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO  17:01:07,340 ProgressMeter -        Location processed.sites  runtime per.1M.sites completed total.runtime remaining
INFO  17:01:08,293 ContextCovariate -       Context sizes: base substitution model 2, indel substitution model 3
INFO  17:01:08,537 ContextCovariate -       Context sizes: base substitution model 2, indel substitution model 3
INFO  17:01:08,592 AnalyzeCovariates - Generating csv file '/tmp/AnalyzeCovariates3565832248324656361.csv'
INFO  17:01:09,077 AnalyzeCovariates - Generating plots file 'recal_plots.pdf'
INFO  17:01:18,598 GATKRunReport - Uploaded run statistics report to AWS S3
 ERROR ------------------------------------------------------------------------------------------
 ERROR stack trace
org.broadinstitute.sting.utils.R.RScriptExecutorException: RScript exited with 1. Run with -l DEBUG for more info.
    at org.broadinstitute.sting.utils.R.RScriptExecutor.exec(RScriptExecutor.java:174)
    at org.broadinstitute.sting.utils.recalibration.RecalUtils.generatePlots(RecalUtils.java:548)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.generatePlots(AnalyzeCovariates.java:380)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.initialize(AnalyzeCovariates.java:394)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)
 ERROR ------------------------------------------------------------------------------------------
 ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
 ERROR
 ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
 ERROR If not, please post the error message, with stack trace, to the GATK forum.
 ERROR Visit our website and forum for extensive documentation and answers to
 ERROR commonly asked questions http://www.broadinstitute.org/gatk
 ERROR
 ERROR MESSAGE: RScript exited with 1. Run with -l DEBUG for more info.
 ERROR ------------------------------------------------------------------------------------------

Any ideas?
Thanks

Answers

  • Rad, CA (Member)

    [UPDATE]

    I asked too fast.
    Before running AnalyzeCovariates, check that the required libraries are installed in R.

    Libraries to check:

    library("ggplot2")
    library(gplots)
    library("reshape")
    library("grid")
    library("tools") #For compactPDF in R 2.13+
    library(gsalib)
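
    The check can also be run from the shell without opening an R session. This is a minimal sketch, assuming a POSIX shell and that `Rscript` is the executable GATK will find on the PATH; the helper name `check_r_packages` is made up for this example and is not part of GATK or R.

    ```shell
    # check_r_packages (hypothetical helper): try to load each named R package
    # with the given Rscript executable and print the names that fail to load.
    check_r_packages() {
        rscript_bin="$1"
        shift
        missing=""
        for pkg in "$@"; do
            # library() raises an error when a package is absent, so Rscript exits non-zero
            if ! "$rscript_bin" -e "suppressPackageStartupMessages(library($pkg))" >/dev/null 2>&1; then
                missing="$missing $pkg"
            fi
        done
        # Strip the leading space; prints an empty line when every package loaded
        echo "${missing# }"
    }

    # Packages from the list above; assumes Rscript is on the PATH
    check_r_packages Rscript ggplot2 gplots reshape grid tools gsalib
    ```

    An empty line means every package loaded. If `Rscript` itself is missing, all packages are reported, so it is worth confirming the executable is found first.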
    
  • XBon, Switzerland (Member)

    Hello,

    I am having the same problem. I've installed all the libraries mentioned above by Rad, but I get the same error.

    java -jar GenomeAnalysisTK.jar \
    -T AnalyzeCovariates -R ucsc.hg19.fasta \
    -before recal_data_B2.table \
    -after post_recal_data_B2.table \
    -plots recalibration_plots_B2.pdf

    INFO 10:11:53,116 HelpFormatter - Date/Time: 2014/06/02 10:11:53
    INFO 10:11:53,116 HelpFormatter - --------------------------------------------------------------------------------
    INFO 10:11:53,117 HelpFormatter - --------------------------------------------------------------------------------
    INFO 10:11:54,218 GenomeAnalysisEngine - Strictness is SILENT
    INFO 10:11:54,490 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO 10:11:54,616 GenomeAnalysisEngine - Preparing for traversal
    INFO 10:11:54,654 GenomeAnalysisEngine - Done preparing for traversal
    INFO 10:11:54,654 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 10:11:54,655 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
    INFO 10:11:56,912 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
    INFO 10:11:57,468 ContextCovariate - Context sizes: base substitution model 2, indel substitution model 3
    INFO 10:11:57,597 AnalyzeCovariates - Generating csv file '/tmp/AnalyzeCovariates3167030638119662229.csv'
    INFO 10:11:58,438 AnalyzeCovariates - Generating plots file 'recalibration_plots_B2.pdf'
    INFO 10:12:01,067 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    org.broadinstitute.sting.utils.R.RScriptExecutorException: RScript exited with 1. Run with -l DEBUG for more info.
    at org.broadinstitute.sting.utils.R.RScriptExecutor.exec(RScriptExecutor.java:174)
    at org.broadinstitute.sting.utils.recalibration.RecalUtils.generatePlots(RecalUtils.java:548)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.generatePlots(AnalyzeCovariates.java:380)
    at org.broadinstitute.sting.gatk.walkers.bqsr.AnalyzeCovariates.initialize(AnalyzeCovariates.java:394)
    at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:83)
    at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:313)
    at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:121)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:107)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 3.1-1-g07a4bf8):
    ERROR
    ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
    ERROR If not, please post the error message, with stack trace, to the GATK forum.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: RScript exited with 1. Run with -l DEBUG for more info.
    ERROR ------------------------------------------------------------------------------------------

    What could be the problem? Thank you for your help!

  • XBon, Switzerland (Member)

    Hi Sheila,

    I tried what you suggested, and I think I have the very same problem discussed in the thread you linked. I will now try what is discussed there, but it seems I won't be able to make it work smoothly right now.

    Many thanks!

  • tedtoal (Member)

    I had this problem. Possible causes are that Rscript cannot be found (it is not on the PATH), that the R on the PATH does not have the packages listed above installed, or that LD_LIBRARY_PATH does not point to the right place for R to load the shared Java .so libraries it needs (this might not be required for AnalyzeCovariates; it depends on whether the required R packages use the rJava package).
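
    A quick way to look at the first and third causes is to print what the current shell would hand to GATK. This is only a sketch, assuming a POSIX shell; `report_r_env` is a made-up helper name.

    ```shell
    # report_r_env (hypothetical helper): show which Rscript is first on the PATH
    # and what LD_LIBRARY_PATH is currently set to.
    report_r_env() {
        if rscript_path="$(command -v Rscript)"; then
            echo "Rscript: $rscript_path"
        else
            echo "Rscript: not found on PATH"
            return 1
        fi
        echo "LD_LIBRARY_PATH: ${LD_LIBRARY_PATH:-<unset>}"
    }

    # Do not abort the calling script if Rscript is missing
    report_r_env || true
    ```

    Run it in the same shell (or job script) that launches GATK, since a login shell and a cluster job can see different PATH settings.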

    I set up a locally-installed version of R, with all the packages needed (I used a conda environment, which made it pretty easy to set up). Then, it was necessary to make sure GATK would access this R. I did not want to put that R on my path (and I did not want to force the user running this to activate the conda environment in which I had installed R), so I did it by invoking GATK with a prefixed "env" command and settings for LD_LIBRARY_PATH and PATH, as follows:

    env -i \
    LD_LIBRARY_PATH=(path to my R)/lib/R/library/rJava/libs:(path to my R)/jre/lib/amd64/server \
    PATH=/bin:(path to my R)/bin \
    (path to my java)/bin/java -Xmx8g -Djava.io.tmpdir=(path to my tmp dir) \
    -jar (path to my GATK)/GenomeAnalysisTK.jar \
    --analysis_type AnalyzeCovariates (etc.)
    

    The GATK -l DEBUG option (lowercase L) is useful: it shows the R error output, which can point directly to the problem. I also had a problem where a system library could not be loaded. It took a while to find that I needed to reinstall the "stringi" package in R. That package isn't listed above; maybe it should be?

    Issue · GitHub, by Sheila: issue number 2403, state closed, closed by vdauwera

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @tedtoal
    Hi,

    Thanks for the suggestion. We will look into this.

    -Sheila

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)

    @tedtoal That might be a dependency of one of the packages we use. We don't list all transitive dependencies because that way lies madness...

  • wbsimey, California Academy of Sciences (Member)
    edited August 6

    I am having an R issue running gatk AnalyzeCovariates.
    I am using the latest conda install of gatk4. I used conda to install R in the same gatk conda environment, but keep getting the error "Stderr: Error in library("ggplot2") : there is no package called ‘ggplot2’". I have tried very hard to find a recent solution in the forums, but keep finding solutions that refer to links that date as far back as 2015. Is there a solution for 2019 gatk4?
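
    When the log is long, the missing package names can be pulled out of it automatically. The sketch below assumes the R error text has the form quoted in this post ("there is no package called ..."); `missing_packages_from_log` is a made-up helper name, and it reads the log from standard input.

    ```shell
    # missing_packages_from_log (hypothetical helper): read GATK/R output on
    # standard input and print each package named in a
    # "there is no package called ..." message, one per line, without duplicates.
    missing_packages_from_log() {
        # Skip whatever quote character precedes the name, then capture the name
        sed -n 's/.*there is no package called [^A-Za-z0-9.]*\([A-Za-z0-9.]*\).*/\1/p' | sort -u
    }

    # Demo using the message quoted in this post; prints: ggplot2
    printf '%s\n' 'Stderr: Error in library("ggplot2") : there is no package called ‘ggplot2’' | missing_packages_from_log
    ```

    R stops at the first package that fails to load, so after installing one package the next run may report another.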

  • Tiffany_at_Broad, Cambridge, MA (Member, Administrator, Broadie, Moderator, admin)

    Hi @wbsimey - I will follow up on this. In the meantime, not sure if this tutorial helps, but I will link it anyway.

  • Tiffany_at_Broad, Cambridge, MA (Member, Administrator, Broadie, Moderator, admin)

    Hi @wbsimey ,

    Here is a script called install_R_packages.R that shows the required dependencies. It can be run to install the required R libraries once R is installed.

    FYI we don't actually maintain the bioconda packages you are trying to use; they were created by someone else. The gatk docker image, however, has the R dependencies all set up if you decide to use it. Hope this helps!

  • wbsimey, California Academy of Sciences (Member)

    Thank you Tiffany,
    I was not able to find a 'gatkcondaenv.yml' file in my conda env, but I did not use miniconda. I did, however, get permission to install a docker instance, which did generate the AnalyzeCovariates plots, yay. But, I am getting java errors in spite of generating successful plots. I am not going to worry about the error since I got my plots, but here is the error if you are interested:

    [August 12, 2019 6:02:13 PM UTC] org.broadinstitute.hellbender.tools.walkers.bqsr.AnalyzeCovariates done. Elapsed time: 0.08 minutes.
    Runtime.totalMemory()=2140143616
    Tool returned:
    Optional.empty
    Exception in thread "Thread-1" htsjdk.samtools.util.RuntimeIOException: java.nio.file.NoSuchFileException: /tmp/Rlib.6607455256217462185
    at htsjdk.samtools.util.IOUtil.recursiveDelete(IOUtil.java:1346)
    at org.broadinstitute.hellbender.utils.io.IOUtils.deleteRecursively(IOUtils.java:1061)
    at org.broadinstitute.hellbender.utils.io.DeleteRecursivelyOnExitPathHook.runHooks(DeleteRecursivelyOnExitPathHook.java:56)
    at java.lang.Thread.run(Thread.java:748)
    Caused by: java.nio.file.NoSuchFileException: /tmp/Rlib.6607455256217462185
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
    at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
    at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
    at java.nio.file.Files.readAttributes(Files.java:1737)
    at java.nio.file.FileTreeWalker.getAttributes(FileTreeWalker.java:219)
    at java.nio.file.FileTreeWalker.visit(FileTreeWalker.java:276)
    at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:322)
    at java.nio.file.Files.walkFileTree(Files.java:2662)
    at java.nio.file.Files.walkFileTree(Files.java:2742)
    at htsjdk.samtools.util.IOUtil.recursiveDelete(IOUtil.java:1344)
    ... 3 more

  • Tiffany_at_Broad, Cambridge, MA (Member, Administrator, Broadie, Moderator, admin)

    Thanks for sharing. I will pass this along.

  • wbsimey, California Academy of Sciences (Member)

    This was my command in a docker instance:
    gatk AnalyzeCovariates \
    -before TP29969_BQSRrnd2_data.table \
    -after TP29969_BQSRrnd3_data.table \
    -plots TP29969_BQSRrnd3_plots.pdf

    Here are the resulting plots:

  • Tiffany_at_Broad, Cambridge, MA (Member, Administrator, Broadie, Moderator, admin)

    Hi @wbsimey - I think this is related to a bug that was found in AnalyzeCovariates. What version of GATK is in the Docker image you are using? It looks like the bug was present in GATK 4.1.1.0 and 4.1.2.0, but it should be fixed in 4.1.3.0.

  • wbsimey, California Academy of Sciences (Member)

    OK, my docker instance has:
    The Genome Analysis Toolkit (GATK) v4.1.1.0
    HTSJDK Version: 2.19.0
    Picard Version: 2.19.0
    I will update gatk4 to 4.1.3.0
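
    The version comparison can be scripted, which helps when several images are in use. This is a sketch under a few assumptions: the banner line has the form shown above, versions have at most four dot-separated numeric fields, and both helper names are made up for this example.

    ```shell
    # gatk_version_from_banner (hypothetical helper): read text on standard input
    # and print the version from a line like
    # "The Genome Analysis Toolkit (GATK) v4.1.1.0".
    gatk_version_from_banner() {
        sed -n 's/.*(GATK) v\([0-9][0-9.]*\).*/\1/p' | head -n 1
    }

    # version_at_least (hypothetical helper): succeed when dotted version $1 is
    # greater than or equal to $2, comparing each field numerically.
    version_at_least() {
        lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n -k3,3n -k4,4n | head -n 1)"
        [ "$lowest" = "$2" ]
    }

    # Demo with the banner line from this post
    have="$(printf '%s\n' 'The Genome Analysis Toolkit (GATK) v4.1.1.0' | gatk_version_from_banner)"
    if version_at_least "$have" "4.1.3.0"; then
        echo "GATK $have should include the fix"
    else
        echo "GATK $have predates 4.1.3.0; consider upgrading"
    fi
    ```

    Fields are compared numerically, so a version such as 4.1.10.0 is treated as newer than 4.1.3.0.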
