Genome STRiP: script failed with error

mavershang, Austin, TX, Member
edited December 2014 in GenomeSTRiP

Hello. I am using Genome STRiP for CNV analysis. Unfortunately, the script fails with errors.

The most common error message is:

ERROR 09:20:03,131 FunctionEdge - Error:  'java'  '-Xmx2048m'  '-XX:+UseParallelOldGC'  '-XX:ParallelGCThreads=4'  '-XX:GCTimeLimit=50'  '-XX:GCHeapFreeLimit=10'  '-Djava.io.tmpdir=/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/tools/svtoolkit/tmpdir'  '-cp' '/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/tools/svtoolkit/lib/SVToolkit.jar:/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/tools/svtoolkit/lib/gatk/Queue.jar'  '-cp' '/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/tools/svtoolkit/lib/SVToolkit.jar:/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/tools/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/tools/svtoolkit/lib/gatk/Queue.jar'  'org.broadinstitute.sv.apps.ReduceInsertSizeHistograms'  '-I' '/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/result/GenomeStrip.Ressult/WGS_Samples/metadata/isd/WGS_1411.hist.bin'  '-O' '/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/result/GenomeStrip.Ressult/WGS_Samples/metadata/isd/WGS_1411.dist.bin'  
 ERROR 09:20:03,131 FunctionEdge - Contents of /home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/result/GenomeStrip.Ressult/WGS_Samples/logs/SVPreprocess-14.out:
INFO  07:20:15,533 HelpFormatter - ------------------------------------------------------------------- 
INFO  07:20:15,535 HelpFormatter - Program Name: org.broadinstitute.sv.apps.ReduceInsertSizeHistograms 
INFO  07:20:15,539 HelpFormatter - Program Args: -I /home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/result/GenomeStrip.Ressult/WGS_Samples/metadata/isd/WGS_1411.hist.bin -O /home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/result/GenomeStrip.Ressult/WGS_Samples/metadata/isd/WGS_1411.dist.bin 
INFO  07:20:15,539 HelpFormatter - Date/Time: 2014/12/03 07:20:15 
INFO  07:20:15,539 HelpFormatter - ------------------------------------------------------------------- 
INFO  07:20:15,539 HelpFormatter - ------------------------------------------------------------------- 
Processing WGS_1411/1/null ...
INFO  07:20:22,337 CommandLineProgram - Program completed.

My script is based on installtest/discovery.sh, as shown below:

SV_DIR=/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/tools/svtoolkit
SV_TMPDIR=${SV_DIR}/tmpdir

outDir=/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/result/GenomeStrip.Ressult
resDir=/home/hgidnyc/Documents/Lei_WorkingDir/WorkWithCollegaue/Aziz/WES_WGS_Paper/resources

runDir=${outDir}/WGS_Samples
bam=${outDir}/WGS_Bam.list
gender=${outDir}/WGS_Gender.map
sites=${outDir}/WGS_Samples.discovery.vcf
genotypes=${outDir}/WGS_Samples.genotypes.vcf

# These executables must be on your path.
which java > /dev/null || exit 1
which Rscript > /dev/null || exit 1
which samtools > /dev/null || exit 1

# For SVAltAlign, you must use the version of bwa compatible with Genome STRiP.
export PATH=${SV_DIR}/bwa:${PATH}
export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH}

mx="-Xmx10g"
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"

mkdir -p ${runDir}/logs || exit 1
mkdir -p ${runDir}/metadata || exit 1

# Unzip the reference sequence and masks if necessary
#if [ ! -e data/human_b36_chr1.fasta -a -e data/human_b36_chr1.fasta.gz ]; then
#    gunzip data/human_b36_chr1.fasta.gz
#fi
#if [ ! -e data/human_b36_chr1.mask.fasta -a -e data/human_b36_chr1.mask.fasta.gz ]; then
#    gunzip data/human_b36_chr1.mask.fasta.gz
#fi
#if [ ! -e data/cn2_mask_g1k_b36_chr1.fasta -a -e data/cn2_mask_g1k_b36_chr1.fasta.gz ]; then
#    gunzip data/cn2_mask_g1k_b36_chr1.fasta.gz
#fi

# Display version information.
java -cp ${classpath} ${mx} -jar ${SV_DIR}/lib/SVToolkit.jar

# Run preprocessing.
# For large scale use, you should use -reduceInsertSizeDistributions, but this is too slow for the installation test.
# The method employed by -computeGCProfiles requires a CN2 copy number mask and is currently only supported for human genomes.
java -cp ${classpath} ${mx} \
    org.broadinstitute.sting.queue.QCommandLine \
    -S ${SV_DIR}/qscript/SVPreprocess.q \
    -S ${SV_DIR}/qscript/SVQScript.q \
    -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
    --disableJobReport \
    -cp ${classpath} \
    -configFile ${SV_DIR}/conf/genstrip_parameters.txt \
    -tempDir ${SV_TMPDIR} \
    -R ${resDir}/human_g1k_v37.fasta \
    -genomeMaskFile ${resDir}/human_g1k_v37.mask.100.fasta \
    -ploidyMapFile ${resDir}/humgen_g1k_v37_ploidy.map \
    -copyNumberMaskFile ${resDir}/cn2_mask_g1k_v37.fasta \
    -reduceInsertSizeDistributions \
    -genderMapFile ${gender} \
    -runDirectory ${runDir} \
    -md ${runDir}/metadata \
    -useMultiStep \
    -computeGCProfiles \
    -jobLogDir ${runDir}/logs \
    -I ${bam} \
    -run \
    || { echo "1st step failed. Now exit!!!"; exit 1; }

echo "1st step finished"

# Run discovery.
java -cp ${classpath} ${mx} \
    org.broadinstitute.sting.queue.QCommandLine \
    -S ${SV_DIR}/qscript/SVDiscovery.q \
    -S ${SV_DIR}/qscript/SVQScript.q \
    -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
    --disableJobReport \
    -cp ${classpath} \
    -configFile ${SV_DIR}/conf/genstrip_parameters.txt \
    -tempDir ${SV_TMPDIR} \
    -R ${resDir}/human_g1k_v37.fasta \
    -genomeMaskFile ${resDir}/human_g1k_v37.mask.100.fasta \
    -genderMapFile ${gender} \
    -runDirectory ${runDir} \
    -md ${runDir}/metadata \
    -jobLogDir ${runDir}/logs \
    -L 1 \
    -minimumSize 100 \
    -maximumSize 1000000 \
    -suppressVCFCommandLines \
    -I ${bam} \
    -O ${sites} \
    -run \
    || exit 1

(grep -v ^##fileDate= ${sites} | grep -v ^##source= | grep -v ^##reference= | diff -q - benchmark/${sites}) \
    || { echo "Error: test results do not match benchmark data"; exit 1; }

# Run genotyping on the discovered sites.
java -cp ${classpath} ${mx} \
    org.broadinstitute.sting.queue.QCommandLine \
    -S ${SV_DIR}/qscript/SVGenotyper.q \
    -S ${SV_DIR}/qscript/SVQScript.q \
    -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
    --disableJobReport \
    -cp ${classpath} \
    -configFile ${SV_DIR}/conf/genstrip_parameters.txt \
    -tempDir ${SV_TMPDIR} \
    -R ${resDir}/human_g1k_v37.fasta \
    -genomeMaskFile ${resDir}/human_g1k_v37.mask.100.fasta \
    -genderMapFile ${gender} \
    -runDirectory ${runDir} \
    -md ${runDir}/metadata \
    -jobLogDir ${runDir}/logs \
    -I ${bam} \
    -vcf ${sites} \
    -O ${genotypes} \
    -run \
    || exit 1

(grep -v ^##fileDate= ${genotypes} | grep -v ^##source= | grep -v ^##contig= | grep -v ^##reference= | diff -q - benchmark/${genotypes}) \
    || { echo "Error: test results do not match benchmark data"; exit 1; }

Thanks.
Lei

Answers

  • gtaz, Netherlands, Member

    I'm facing the same errors with ReduceInsertSizeHistograms. Has this been solved or addressed yet?

  • bhandsaker, Member, Broadie, Moderator, admin

    I'm not sure why there wasn't more to this thread.

    From the description, it appears that the program (java ... org.broadinstitute.sv.apps.ReduceInsertSizeHistograms ...) completed successfully, but Queue (the process that runs the workflow) saw an error reported by the workflow manager (e.g. LSF or SGE, depending on the environment). This kind of thing is generally a problem in your environment. You can try copying the java command and running it directly, or inside a shell script wrapper, to figure out why a non-zero exit status is being returned.
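
For anyone trying the wrapper suggestion above, here is a minimal sketch of what such a wrapper might look like. The helper name run_and_report is made up for illustration and is not part of Genome STRiP or Queue. The `sh -c 'exit 3'` line is only a stand-in so the sketch runs anywhere; it assumes you will replace it with the full java command copied from the FunctionEdge error message.

```shell
#!/usr/bin/env bash
# Run a command exactly as given and report the exit status it returns.
# run_and_report is a hypothetical helper name, not part of Genome STRiP or Queue.
run_and_report() {
    "$@"                      # run the command with its arguments unchanged
    local status=$?           # capture the exit status before anything else runs
    echo "exit status: ${status}" >&2
    return "${status}"        # pass the same status back to the caller
}

# Stand-in command so the sketch runs anywhere. Replace it with the full
# java ... org.broadinstitute.sv.apps.ReduceInsertSizeHistograms -I ... -O ...
# command copied from the Queue error message.
run_and_report sh -c 'exit 3'
echo "wrapper saw: $?"
```

If the java command reports status 0 when run this way, the non-zero status is probably coming from the job submission layer or the environment rather than from the program itself.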
