CNVDiscoveryPipeline failed in stage 1

Hi,

I hit an error in the first stage of CNVDiscoveryPipeline. It looks like the job was killed for reaching the LSF memory usage limit.

Here is the log file:
"
ERROR 15:53:48,754 FunctionEdge - Error: 'java' '-Xmx65536m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/cnv/wgs/process/.queue/tmp' '-cp' '/cnv/wgs/tools/svtoolkit/lib/SVToolkit.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/Queue.jar' '-cp' '/cnv/wgs/tools/svtoolkit/lib/SVToolkit.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.gatk.queue.QCommandLine' '-cp' '/cnv/wgs/tools/svtoolkit/lib/SVToolkit.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/Queue.jar' '-S' '/cnv/wgs/tools/svtoolkit/qscript/discovery/cnv/CNVDiscoveryStage1.q' '-S' '/cnv/wgs/tools/svtoolkit/qscript/discovery/cnv/CNVDiscoveryStageBase.q' '-S' '/cnv/wgs/tools/svtoolkit/qscript/discovery/cnv/CNVDiscoveryGenotyper.q' '-S' '/cnv/wgs/tools/svtoolkit/qscript/SVQScript.q' '-gatk' '/cnv/wgs/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar' '-jobLogDir' 'cnv1010_v2/run/cnv_stage1/seq_1/logs' '-memLimit' '64.0' '-jobRunner' 'Lsf706' '-gatkJobRunner' 'Lsf706' '-jobQueue' 'long' -run '-sequenceName' '1' '-runDirectory' 'cnv1010_v2/run/cnv_stage1/seq_1' '-sentinelFile' 'cnv1010_v2/run/cnv_sentinel_files/stage_1_seq_1.sent' --disableJobReport '-configFile' '/cnv/wgs/tools/svtoolkit/conf/genstrip_parameters.txt' '-R' '/cnv/build37/build37.fasta' '-ploidyMapFile' '/cnv/wgs/tools/ploidymaps/humgen_g1k_v37_ploidy.map' '-genderMapFile' '/cnv/wgs/process/gender_map_file.txt' '-md' '/cnv/wgs/process/preprocess1010/metadata' -disableGATKTraversal '-I' 'cnv1010_v2/run/bam_headers/merged_headers.bam' '-intervalList' '1' '-scannedWindowsVcfFile' 'cnv1010_v2/run/cnv_stage1/seq_1/seq_1.sites.vcf.gz' '-tilingWindowSize' '1000' '-tilingWindowOverlap' '500' '-maximumReferenceGapLength' '1000'
ERROR 15:53:48,765 FunctionEdge - Contents of /cnv/wgs/process/cnv1010_v2/logs/CNVDiscoveryPipeline-188.out:
"

Here is logs/CNVDiscoveryPipeline-188.out:
"
INFO 15:47:55,688 QScriptManager - Compiling 4 QScripts
INFO 15:48:04,690 QScriptManager - Compilation complete
INFO 15:48:04,775 HelpFormatter - ----------------------------------------------------------------------
INFO 15:48:04,775 HelpFormatter - Queue v3.3.GS2-0-g7ad6c61, Compiled 2015/05/15 09:12:56
INFO 15:48:04,776 HelpFormatter - Copyright (c) 2012 The Broad Institute
INFO 15:48:04,776 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 15:48:04,776 HelpFormatter - Program Args: -cp /cnv/wgs/tools/svtoolkit/lib/SVToolkit.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/Queue.jar -S /cnv/wgs/tools/svtoolkit/qscript/discovery/cnv/CNVDiscoveryStage1.q -S /cnv/wgs/tools/svtoolkit/qscript/discovery/cnv/CNVDiscoveryStageBase.q -S /cnv/wgs/tools/svtoolkit/qscript/discovery/cnv/CNVDiscoveryGenotyper.q -S /cnv/wgs/tools/svtoolkit/qscript/SVQScript.q -gatk /cnv/wgs/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar -jobLogDir cnv1010_v2/run/cnv_stage1/seq_1/logs -memLimit 64.0 -jobRunner Lsf706 -gatkJobRunner Lsf706 -jobQueue long -run -sequenceName 1 -runDirectory cnv1010_v2/run/cnv_stage1/seq_1 -sentinelFile cnv1010_v2/run/cnv_sentinel_files/stage_1_seq_1.sent --disableJobReport -configFile /cnv/wgs/tools/svtoolkit/conf/genstrip_parameters.txt -R /cnv/build37/build37.fasta -ploidyMapFile /cnv/wgs/tools/ploidymaps/humgen_g1k_v37_ploidy.map -genderMapFile /cnv/wgs/process/gender_map_file.txt -md /cnv/wgs/process/preprocess1010/metadata -disableGATKTraversal -I cnv1010_v2/run/bam_headers/merged_headers.bam -intervalList 1 -scannedWindowsVcfFile cnv1010_v2/run/cnv_stage1/seq_1/seq_1.sites.vcf.gz -tilingWindowSize 1000 -tilingWindowOverlap 500 -maximumReferenceGapLength 1000
INFO 15:48:04,777 HelpFormatter - Executing on Linux 3.0.0-16-server amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_60-b27.
INFO 15:48:04,777 HelpFormatter - Date/Time: 2015/10/25 15:48:04
INFO 15:48:04,778 HelpFormatter - ----------------------------------------------------------------------
INFO 15:48:04,778 HelpFormatter - ----------------------------------------------------------------------
INFO 15:48:04,785 QCommandLine - Scripting CNVDiscoveryStage1
INFO 15:48:04,850 QCommandLine - Added 2 functions
INFO 15:48:04,850 QGraph - Generating graph.
INFO 15:48:04,860 QGraph - Running jobs.
INFO 15:48:05,018 FunctionEdge - Starting: 'java' '-Xmx65536m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/cnv/wgs/process/.queue/tmp' '-cp' '/cnv/wgs/tools/svtoolkit/lib/SVToolkit.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/Queue.jar' '-cp' '/cnv/wgs/tools/svtoolkit/lib/SVToolkit.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/cnv/wgs/tools/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.discovery.SVDepthScanner' '-O' '/cnv/wgs/process/cnv1010_v2/run/cnv_stage1/seq_1/seq_1.sites.vcf.gz' '-R' '/cnv/build37/build37.fasta' '-genderMapFile' '/cnv/wgs/process/gender_map_file.txt' '-md' '/cnv/wgs/process/preprocess1010/metadata' '-configFile' '/cnv/wgs/tools/svtoolkit/conf/genstrip_parameters.txt' '-L' '1' '-tilingWindowSize' '1000' '-tilingWindowOverlap' '500' '-maximumReferenceGapLength' '1000'
INFO 15:48:05,018 FunctionEdge - Output written to /cnv/wgs/process/cnv1010_v2/run/cnv_stage1/seq_1/logs/CNVDiscoveryStage1-1.out
INFO 15:48:05,320 Lsf706JobRunner - Submitted LSF job id: 2899255
INFO 15:48:05,322 QGraph - 1 Pend, 1 Run, 0 Fail, 0 Done
INFO 15:48:19,612 QCommandLine - Shutting down jobs. Please wait...
INFO 15:48:19,729 QGraph - 1 Pend, 1 Run, 0 Fail, 0 Done
INFO 15:48:19,731 QCommandLine - Writing final jobs report...
INFO 15:48:19,731 QCommandLine - Done with errors
INFO 15:48:19,734 QCommandLine - Script failed: 1 Pend, 1 Run, 0 Fail, 0 Done


LSBATCH: User input

sh /cnv/wgs/process/.queue/tmp/.exec3257920008945804962

TERM_MEMLIMIT: job killed after reaching LSF memory usage limit.
Exited with exit code 130.

Resource usage summary:

CPU time   :     77.14 sec.
Max Memory :      1383 MB
Max Swap   :     68588 MB

Max Processes  :         4
Max Threads    :        28

"

Could you help me figure out what's going on?

Thanks.

Ming

Answers

  • bhandsaker Member, Broadie, Moderator admin

    You are supplying -memLimit 64 (64G), which is more than Genome STRiP needs and apparently more than your LSF configuration allows. Most Genome STRiP (GS) processes are designed to run in a 4G Java heap (plus you need some memory for the Java runtime itself). Queue sometimes requires more memory than this for large data sets. I usually run with a memory limit of 12G, just in case, but 8G is usually fine.
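To make the sizing concrete, here is a minimal sketch of how the -memLimit value seems to translate into the JVM heap size in the log above, where '-memLimit' '64.0' appears alongside '-Xmx65536m' (64 × 1024 MB). The function name `heap_flag` is hypothetical, and the gigabytes-times-1024 rule is inferred from that one log line rather than taken from the Queue documentation, so treat it as an assumption.

```python
def heap_flag(mem_limit_gb: float) -> str:
    """Return the JVM -Xmx flag implied by a -memLimit value in gigabytes.

    Assumption: heap size in MB = memLimit * 1024, which matches the
    '-memLimit 64.0' / '-Xmx65536m' pair seen in the posted log.
    """
    return f"-Xmx{int(mem_limit_gb * 1024)}m"


# Compare the value from the failing run with the two suggested limits.
for gb in (64.0, 12.0, 8.0):
    print(f"-memLimit {gb:g} -> {heap_flag(gb)}")
```

Under that assumption, the suggested limits of 12 or 8 would request a 12288 MB or 8192 MB heap instead of 65536 MB, which is far more likely to fit within a per-job LSF memory limit.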
