
SVPreprocess sequential failure

Hello, I am running the pre-processing script given below:

#

java -cp ${classpath} ${mx} \
org.broadinstitute.gatk.queue.QCommandLine \
-S ${SV_DIR}/qscript/SVPreprocess.q \
-S ${SV_DIR}/qscript/SVQScript.q \
-gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
-cp ${classpath} \
-configFile conf/genstrip_installtest_parameters.txt \
-tempDir ${SV_TMPDIR} \
-R /local/workdir/SOFTWARE/svtoolkit/SV_METADATA_DIR/1000G_phase1_20101123_mdv1/reference/human_g1k_v37.fasta \
-ploidyMapFile /local/workdir/SOFTWARE/svtoolkit/SV_METADATA_DIR/1000G_phase1_20101123_mdv1/reference/human_g1k_v37.ploidy.map \
-genomeMaskFile /local/workdir/SOFTWARE/svtoolkit/SV_METADATA_DIR/1000G_phase1_20101123_mdv1/svmasks/human_g1k_v37.mask.36.fasta \
-copyNumberMaskFile /local/workdir/SOFTWARE/svtoolkit/SV_METADATA_DIR/1000G_phase1_20101123_mdv1/cn2/cn2_mask_g1k_v37.fasta \
-bamFilesAreDisjoint true \
-reduceInsertSizeDistributions true \
-genderMapFile m_gender_1000G.list \
--disableJobReport \
-runDirectory ${runDir} \
-md ${runDir}/metadata \
-disableGATKTraversal \
-useMultiStep \
-computeGCProfiles true \
-computeReadCounts true \
-jobLogDir ${runDir}/logs \
-I ${bam} \
-memLimit 12 \
-run \
|| exit 1

#

It does complete eventually, but only after I run it three times.
In the first run, two jobs fail:

1.

ERROR 15:00:58,411 FunctionEdge - Error: 'java' '-Xmx12288m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/local/workdir/SOFTWARE/svtoolkit/installtest/tmpdir' '-cp' '/local/workdir/SOFTWARE/svtoolkit/lib/SVToolkit.jar:/local/workdir/SOFTWARE/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/local/workdir/SOFTWARE/svtoolkit/lib/gatk/Queue.jar' '-cp' '/local/workdir/SOFTWARE/svtoolkit/lib/SVToolkit.jar:/local/workdir/SOFTWARE/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/local/workdir/SOFTWARE/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.ReduceInsertSizeHistograms' '-I' '/SSD/prerun_pk5_1000G/metadata/isd/HG00101.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.hist.bin' '-O' '/SSD/prerun_pk5_1000G/metadata/isd/HG00101.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.dist.bin'
ERROR 15:00:58,413 FunctionEdge - Contents of /SSD/prerun_pk5_1000G/logs/SVPreprocess-8.out:
INFO 14:53:59,420 HelpFormatter - -------------------------------------------------------------------
INFO 14:53:59,423 HelpFormatter - Program Name: org.broadinstitute.sv.apps.ReduceInsertSizeHistograms
INFO 14:53:59,427 HelpFormatter - Program Args: -I /SSD/prerun_pk5_1000G/metadata/isd/HG00101.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.hist.bin -O /SSD/prerun_pk5_1000G/metadata/isd/HG00101.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.dist.bin
INFO 14:53:59,431 HelpFormatter - Executing as pk352@cbsukoren.tc.cornell.edu on Linux 3.10.0-229.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_51-b13.
INFO 14:53:59,431 HelpFormatter - Date/Time: 2016/03/10 14:53:59
INFO 14:53:59,431 HelpFormatter - -------------------------------------------------------------------
INFO 14:53:59,432 HelpFormatter - -------------------------------------------------------------------
Processing HG00101/HG00101_I_bc_pelib_1018/null ...
INFO 14:54:05,331 CommandLineProgram - Program completed.

2.

ERROR 16:23:20,986 FunctionEdge - Error: 'java' '-Xmx12288m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/local/workdir/SOFTWARE/svtoolkit/installtest/tmpdir' '-cp' '/local/workdir/SOFTWARE/svtoolkit/lib/SVToolkit.jar:/local/workdir/SOFTWARE/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/local/workdir/SOFTWARE/svtoolkit/lib/gatk/Queue.jar' '-cp' '/local/workdir/SOFTWARE/svtoolkit/lib/SVToolkit.jar:/local/workdir/SOFTWARE/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/local/workdir/SOFTWARE/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.MergeReadDepthCoverage' '-I' '/SSD/prerun_pk5_1000G/metadata/depth/HG00101.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.depth.txt' '-I' '/SSD/prerun_pk5_1000G/metadata/depth/HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.depth.txt' '-O' '/SSD/prerun_pk5_1000G/metadata/depth.dat'
ERROR 16:23:20,989 FunctionEdge - Contents of /SSD/prerun_pk5_1000G/logs/SVPreprocess-14.out:
INFO 16:13:06,231 HelpFormatter - ---------------------------------------------------------------
INFO 16:13:06,233 HelpFormatter - Program Name: org.broadinstitute.sv.apps.MergeReadDepthCoverage
INFO 16:13:06,238 HelpFormatter - Program Args: -I /SSD/prerun_pk5_1000G/metadata/depth/HG00101.mapped.ILLUMINA.bwa.GBR.low_coverage.20130415.depth.txt -I /SSD/prerun_pk5_1000G/metadata/depth/HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.depth.txt -O /SSD/prerun_pk5_1000G/metadata/depth.dat
INFO 16:13:06,241 HelpFormatter - Executing as pk352@cbsukoren.tc.cornell.edu on Linux 3.10.0-229.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_51-b13.
INFO 16:13:06,242 HelpFormatter - Date/Time: 2016/03/10 16:13:06
INFO 16:13:06,242 HelpFormatter - ---------------------------------------------------------------
INFO 16:13:06,242 HelpFormatter - ---------------------------------------------------------------
INFO 16:13:06,247 CommandLineProgram - Program completed.

#

When I run the same script a second time, those two jobs complete successfully, but two other jobs fail:
SVPreprocess-17: org.broadinstitute.sv.apps.MergeReadSpanCoverage and
SVPreprocess-20: org.broadinstitute.sv.apps.MergeGCProfiles
Only the third run finishes successfully:
INFO 10:02:15,521 QCommandLine - Script completed successfully with 962 total jobs
Rerunning got me through the test run, but it will not be practical when I scale the process up. Any suggestions for what I could try differently?
I got the same results with different source BAM files, including 1000 Genomes data. In both cases I had two BAM files in my BAM file list, but I saw the same issue when I tried preprocessing a single BAM file.
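To be concrete about the workaround, what I am doing by hand amounts to the retry loop sketched below. This is only an illustration: `retry` and `run_svpreprocess.sh` are hypothetical names (the latter standing for the java command above saved as a script), and the limit of three attempts is simply what happened to work in my test. It does not address why the jobs fail in the first place.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until it succeeds
# or the given number of attempts is used up.
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        echo "Attempt $attempt of $max_attempts: $*" >&2
        # Stop as soon as the command exits with status 0.
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "Command still failing after $max_attempts attempts" >&2
    return 1
}

# Usage (assumes the command above is saved as ./run_svpreprocess.sh):
#   retry 3 ./run_svpreprocess.sh || exit 1
```

In my runs, the jobs that had already finished were not reported as failing again on the next attempt, so each rerun only had to get past the jobs that were still failing.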
Best regards,
Pavel
