CNV pipeline on SGE

Will_Gilks (University of Sussex, UK)

Hi,

I've got my deletion pipeline working on D. melanogaster by running discovery on the major chromosomes separately (and ignoring the unmapped scaffolds, which I think were causing the problem).

I'm now trying to run the CNV pipeline with the following script on a Sun Grid Engine (SGE) cluster:

#$ -pe openmp 20
#$ -S /bin/sh
#$ -cwd
#$ -j y
#$ -q bioinf.q
. /etc/profile.d/modules.sh
set -ex
module load sge
module load genomestrip/2.0
module load jre/1.7.0_25

SV_TMPDIR=./tmpdir
SV_DIR=/cm/shared/apps/svtoolkit/2.0.1602/

# Set input variables
runDir=lhm_rg_gstrip_small
bams=lhm_RG_bams.list
ref_seq=local_ref/dm6.fa
my_config=adjuvants/genstrip_test3_parameters.txt
genome_mask=ref_metadata/dm6.svmask.fasta
depth_mask=ref_metadata/dm6.rdmask.bed
ploidy=adjuvants/ploidy_dm6.map
gender_map=adjuvants/gstrip_lhm_rg_gender.map
out_pf=lhm_rg_gstrip_CNV_small

which java > /dev/null || exit 1
which Rscript > /dev/null || exit 1
which samtools > /dev/null || exit 1

export PATH=${SV_DIR}/bwa:${PATH}
export LD_LIBRARY_PATH=${SV_DIR}/bwa:${LD_LIBRARY_PATH}
mx="-Xmx4g"
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"
java -cp ${classpath} ${mx} -jar ${SV_DIR}/lib/SVToolkit.jar


## DISCOVERING CNVs, chr2L
java -cp ${classpath} ${mx} org.broadinstitute.gatk.queue.QCommandLine \
    -S ${SV_DIR}/qscript/discovery/cnv/CNVDiscoveryPipeline.q \
    -S ${SV_DIR}/qscript/SVQScript.q \
    -jobRunner Drmaa \
    -gatkJobRunner Drmaa \
    -cp ${classpath} \
    -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
    -configFile ${my_config} \
    -R ${ref_seq} \
    -I ${bams} \
    -genderMapFile ${gender_map} \
    -md ${runDir}/metadata \
    -runDirectory ${runDir} \
    -jobLogDir ${runDir}/logs \
    -genomeMaskFile ${genome_mask} \
    -readDepthMaskFile ${depth_mask} \
    -ploidyMapFile ${ploidy} \
    -L chr2L \
    -tempDir ${SV_TMPDIR} \
    -disableGATKTraversal \
    -maximumSize 10000 \
    -minimumSize 200 \
    -tilingWindowSize 1000 \
    -tilingWindowOverlap 500 \
    -maximumReferenceGapLength 1000 \
    -boundaryPrecision 100 \
    -minimumRefinedLength 500 \
    -debug true \
    -run \
    || exit 1

However, the job errors out with:

ERROR 12:22:42,928 FunctionEdge - Error:  'java'  '-Xmx2048m'  '-XX:+UseParallelOldGC'  '-XX:ParallelGCThreads=4'  '-XX:GCTimeLimit=50'  '-XX:GCHeapFreeLimit=10'  '-Djava.io.tmpdir=/lustre/scratch/bioenv/wg39/LHm_analysis/genotyping/cnvs/tmpdir'  '-cp' '/cm/shared/apps/svtoolkit/2.0.1602//lib/SVToolkit.jar:/cm/shared/apps/svtoolkit/2.0.1602//lib/gatk/GenomeAnalysisTK.jar:/cm/shared/apps/svtoolkit/2.0.1602//lib/gatk/Queue.jar'  '-cp' '/cm/shared/apps/svtoolkit/2.0.1602/lib/SVToolkit.jar:/cm/shared/apps/svtoolkit/2.0.1602/lib/gatk/GenomeAnalysisTK.jar:/cm/shared/apps/svtoolkit/2.0.1602/lib/gatk/Queue.jar'  'org.broadinstitute.sv.apps.ExtractBAMSubset'  '-I' '/lustre/scratch/bioenv/wg39/LHm_analysis/genotyping/cnvs/lhm_RG_bams.list'  '-O' '/lustre/scratch/bioenv/wg39/LHm_analysis/genotyping/cnvs/lhm_rg_gstrip_small/bam_headers/merged_headers.bam'  '-L' 'NONE'  
ERROR 12:22:43,219 FunctionEdge - Contents of /lustre/scratch/bioenv/wg39/LHm_analysis/genotyping/cnvs/lhm_rg_gstrip_small/logs/CNVDiscoveryPipeline-1.out:
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/broadinstitute/sv/apps/ExtractBAMSubset : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)

I've attached the full log. It looks as though ExtractBAMSubset isn't being loaded properly. I'm not sure what the cause of this is, or how to correct it. Any advice would be much appreciated.

Sincerely,

William Gilks
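
A note on the error above: a class-file version of 51.0 corresponds to Java 7, so this UnsupportedClassVersionError is what an older (pre-Java 7) JVM reports when asked to load a class compiled for Java 7. The failing command in the log calls plain 'java', so one thing worth checking is whether the jobs dispatched through Drmaa see the same java as the submitting script, which loads jre/1.7.0_25. That is an assumption, not something the log confirms. A minimal sketch of the check follows; the helper names class_major_to_java and report_java are made up for illustration and are not part of Genome STRiP or GATK.

```sh
#!/bin/sh
# Hypothetical helpers for diagnosing "Unsupported major.minor version" errors.

# Map a class-file major version (e.g. "51.0" from the error message)
# to the oldest Java release able to load that class.
class_major_to_java() {
    major=${1%%.*}                      # "51.0" -> "51"
    case "$major" in
        45) echo "1.1" ;;
        46) echo "1.2" ;;
        47) echo "1.3" ;;
        48) echo "1.4" ;;
        *)
            # From Java 5 (major 49) onwards the release is major - 44.
            if [ "$major" -ge 49 ] 2>/dev/null; then
                echo $((major - 44))
            else
                echo "unknown"
                return 1
            fi
            ;;
    esac
}

# Report which java binary and version the current environment would use.
# Intended to be run both on the submit host and inside a job on a compute node.
report_java() {
    if command -v java >/dev/null 2>&1; then
        echo "java found at: $(command -v java)"
        java -version 2>&1 | head -n 1
    else
        echo "no java on PATH"
    fi
}

class_major_to_java 51.0    # prints 7
report_java
```

Running report_java inside a small test job submitted to the same queue, and comparing its output with what the submit host prints, would show whether the two environments differ.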
