CNNScoreVariants runs to the end but fails to create a filtered file

Hi,

I am trying to run CNNScoreVariants on an exome using:

java -jar $GATK CNNScoreVariants -V ERR000589.raw.vcf -R $REFERENCE -O ERR000589.CNN.vcf

I have a BAM file aligned with BWA, with read headers, and I am using a local copy of GATK4 (4.1.2.0).

The tool seems to run to completion but fails to produce a vcf file after scoring.

12:50:54.774 INFO CNNScoreVariants - ------------------------------------------------------------
12:50:54.774 INFO CNNScoreVariants - The Genome Analysis Toolkit (GATK) v4.1.2.0
12:50:54.774 INFO CNNScoreVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
12:50:54.774 INFO CNNScoreVariants - Executing as [email protected] on Linux v4.15.0-50-generic amd64
12:50:54.774 INFO CNNScoreVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_212-8u212-b03-0ubuntu1.18.04.1-b03
12:50:54.775 INFO CNNScoreVariants - Start Date/Time: May 24, 2019 12:50:52 EDT PM
12:50:54.775 INFO CNNScoreVariants - ------------------------------------------------------------
12:50:54.775 INFO CNNScoreVariants - ------------------------------------------------------------
12:50:54.775 INFO CNNScoreVariants - HTSJDK Version: 2.19.0
12:50:54.775 INFO CNNScoreVariants - Picard Version: 2.19.0
12:50:54.775 INFO CNNScoreVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:50:54.775 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:50:54.775 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:50:54.775 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:50:54.775 INFO CNNScoreVariants - Deflater: IntelDeflater
12:50:54.776 INFO CNNScoreVariants - Inflater: IntelInflater
12:50:54.776 INFO CNNScoreVariants - GCS max retries/reopens: 20
12:50:54.776 INFO CNNScoreVariants - Requester pays: disabled
12:50:54.776 INFO CNNScoreVariants - Initializing engine
12:50:55.024 INFO FeatureManager - Using codec VCFCodec to read file file:///home/ricardo/Downloads/gatk-4.1.1.0/wdl/ERR000589.raw.vcf
12:50:55.081 INFO CNNScoreVariants - Done initializing engine
12:50:55.082 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/ricardo/Downloads/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
12:50:55.147 INFO CNNScoreVariants - Done scoring variants with CNN.
12:50:55.147 INFO CNNScoreVariants - Shutting down engine
[May 24, 2019 12:50:55 EDT PM] org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=632815616
java.lang.NullPointerException
at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.hasMessage(ProcessControllerAckResult.java:49)
at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.getDisplayMessage(ProcessControllerAckResult.java:69)
at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.waitForAck(StreamingProcessController.java:229)
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.waitForAck(StreamingPythonScriptExecutor.java:216)
at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.sendSynchronousCommand(StreamingPythonScriptExecutor.java:183)
at org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants.onTraversalStart(CNNScoreVariants.java:307)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1037)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)

Thanks for your help,

Ricardo

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @RicardoHarripaul

    From the stdout log, it does not look like it ran to completion. Would you please share the entire stdout log?

  • RicardoHarripaul (Toronto; Member)

    Hi Bhanu,

    This is the full stdout log, including the error.

    [email protected][wdl] java -jar $GATK CNNScoreVariants -V ERR000589.raw.vcf -R $REFERENCE -O ERR000589.CNN.vcf [ 3:33PM]
    15:33:27.290 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/ricardo/Downloads/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    May 25, 2019 3:33:29 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
    INFO: Failed to detect whether we are running on Google Compute Engine.
    15:33:29.583 INFO CNNScoreVariants - ------------------------------------------------------------
    15:33:29.583 INFO CNNScoreVariants - The Genome Analysis Toolkit (GATK) v4.1.2.0
    15:33:29.583 INFO CNNScoreVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
    15:33:29.583 INFO CNNScoreVariants - Executing as [email protected] on Linux v4.15.0-50-generic amd64
    15:33:29.583 INFO CNNScoreVariants - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_212-8u212-b03-0ubuntu1.18.04.1-b03
    15:33:29.583 INFO CNNScoreVariants - Start Date/Time: May 25, 2019 3:33:27 EDT PM
    15:33:29.584 INFO CNNScoreVariants - ------------------------------------------------------------
    15:33:29.584 INFO CNNScoreVariants - ------------------------------------------------------------
    15:33:29.584 INFO CNNScoreVariants - HTSJDK Version: 2.19.0
    15:33:29.584 INFO CNNScoreVariants - Picard Version: 2.19.0
    15:33:29.584 INFO CNNScoreVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    15:33:29.584 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    15:33:29.584 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    15:33:29.584 INFO CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    15:33:29.584 INFO CNNScoreVariants - Deflater: IntelDeflater
    15:33:29.584 INFO CNNScoreVariants - Inflater: IntelInflater
    15:33:29.584 INFO CNNScoreVariants - GCS max retries/reopens: 20
    15:33:29.585 INFO CNNScoreVariants - Requester pays: disabled
    15:33:29.585 INFO CNNScoreVariants - Initializing engine
    15:33:30.104 INFO FeatureManager - Using codec VCFCodec to read file file:///home/ricardo/Downloads/gatk-4.1.1.0/wdl/ERR000589.raw.vcf
    15:33:30.163 INFO CNNScoreVariants - Done initializing engine
    15:33:30.164 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/home/ricardo/Downloads/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
    15:33:30.409 INFO CNNScoreVariants - Done scoring variants with CNN.
    15:33:30.409 INFO CNNScoreVariants - Shutting down engine
    [May 25, 2019 3:33:30 EDT PM] org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants done. Elapsed time: 0.05 minutes.
    Runtime.totalMemory()=644349952
    java.lang.NullPointerException
    at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.hasMessage(ProcessControllerAckResult.java:49)
    at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.getDisplayMessage(ProcessControllerAckResult.java:69)
    at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.waitForAck(StreamingProcessController.java:229)
    at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.waitForAck(StreamingPythonScriptExecutor.java:216)
    at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.sendSynchronousCommand(StreamingPythonScriptExecutor.java:183)
    at org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants.onTraversalStart(CNNScoreVariants.java:307)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1037)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
    at org.broadinstitute.hellbender.Main.main(Main.java:291)

  • cnorman (United States; Member, Broadie, Dev)

    @RicardoHarripaul It looks like the tool is not successfully getting through initialization. Are you running from within the GATK conda environment? I suspect that you are, since the code appears to be getting past the check for that. Have you upgraded GATK since you created the conda environment? Your conda environment may be out of date with respect to the GATK version you're running.

  • RicardoHarripaul (Toronto; Member)

    @cnorman Thanks for your answer. I am actually running this natively on my Linux computer. I did get the code to run in Docker, but I would prefer to run it natively and want to understand why it is not working on my own system.

  • cnorman (United States; Member, Broadie, Dev)

    @RicardoHarripaul The CNN tools require Python and a number of underlying Python dependencies, and will not work unless the correct versions of the dependencies are installed. The GATK Docker image has the dependencies already set up, but otherwise you'll need to set up the environment yourself using Conda. See https://github.com/broadinstitute/gatk#python for details.

  • xoaib (Japan; Member)
    I am also facing the same problem. Please help.
  • cnorman (United States; Member, Broadie, Dev)

    @xoaib Are you running from within the gatk conda environment as described here? The environment must have been created using the version of GATK that you're running (I suspect this problem can occur if you have a conda environment from a previous GATK release). I would suggest recreating the conda environment using the GATK release you're running.
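
The advice in this thread (run inside the GATK conda environment, and rebuild that environment from the release you are actually running) could be sketched roughly as below. This is an illustration rather than an official procedure. It assumes that the unpacked release directory contains gatkcondaenv.yml, that the environment it defines is named gatk, and that conda is on the PATH. The function names gatk_env_active and recreate_gatk_env are hypothetical helpers, not part of GATK. See https://github.com/broadinstitute/gatk#python for the authoritative steps.

```shell
# Succeeds only when the active conda environment is named "gatk"
# (assumed here to be the name defined in gatkcondaenv.yml).
gatk_env_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "gatk" ]
}

# Hypothetical helper: rebuild the "gatk" environment from the release in use.
# $1 = unpacked GATK release directory, assumed to contain gatkcondaenv.yml.
recreate_gatk_env() {
    release_dir="$1"
    if [ ! -f "$release_dir/gatkcondaenv.yml" ]; then
        echo "gatkcondaenv.yml not found in: $release_dir" >&2
        return 1
    fi
    # Drop any environment left over from an earlier release. Errors are ignored
    # because the environment may not exist yet. Run this from outside "gatk".
    conda env remove -n gatk -y >/dev/null 2>&1 || true
    # Create from inside the release directory so relative paths in the yml resolve.
    (cd "$release_dir" && conda env create -n gatk -f gatkcondaenv.yml)
}

# Example session (the release path is a placeholder):
#   recreate_gatk_env /path/to/gatk-4.1.2.0
#   source activate gatk
#   gatk_env_active && java -jar $GATK CNNScoreVariants -V ERR000589.raw.vcf -R $REFERENCE -O ERR000589.CNN.vcf
```

Checking CONDA_DEFAULT_ENV only confirms that an environment named gatk is active. It does not prove the environment matches the GATK jar being run, which is why rebuilding it after every GATK upgrade is the safer habit.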
