"Could not find or load main class", but my classpath looks correct

jason.harris (Menlo Park, CA), Member
edited August 2015 in GenomeSTRiP

Trying to run SVPreprocess; I am getting "Could not find or load main class" on the SVCommandLine class. My setup looks correct to me, so I am hoping for some advice.

Sanity checks:
$ java -version
java version "1.7.0_02"
Java(TM) SE Runtime Environment (build 1.7.0_02-b13)
Java HotSpot(TM) 64-Bit Server VM (build 22.0-b10, mixed mode)

$ echo ${SV_DIR}
/nfs/projects/home/jharris/Code/svtoolkit

$ java -jar ${SV_DIR}/lib/SVToolkit.jar
SVToolkit version 2.00 (build 1602)
Build date: 2015/07/21 09:43:14
Web site: http://www.broadinstitute.org/software/genomestrip

$ unzip -v ${SV_DIR}/lib/SVToolkit.jar | grep SVCommandLine
    8299  Defl:N    3874  53%  07-03-2015 11:16  c59d5638  org/broadinstitute/sv/main/SVCommandLine.class

Here is my command line:
java -Xmx4g \
  -cp ${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar \
  org.broadinstitute.gatk.queue.QCommandLine \
  -S ${SV_DIR}/qscript/SVQScript.q \
  -S ${SV_DIR}/qscript/SVPreprocess.q \
  -cp ${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar \
  -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
  -configFile ${SV_DIR}/conf/genstrip_parameters.txt \
  -R ${SV_DIR}/1000G_phase3/human_g1k_hs37d5.fasta \
  -I URB875A1_bam.list \
  -md URB875A1_metadata \
  -bamFilesAreDisjoint true \
  -jobRunner Drmaa -gatkJobRunner Drmaa \
  -jobProject test_preprocess \
  -jobQueue [email protected] \
  -jobNative \"-V\" \
  --disableJobReport \
  -jobLogDir URB875A1_metadata/log \
  -run

26 jobs fail: 24 are SVCommandLine jobs, and the other two are ComputeGenomeSizes and ComputeGCProfiles. Here are the contents of a failed job's .out file:
Error: Could not find or load main class org.broadinstitute.sv.main.SVCommandLine


Best Answers


  • jason.harris (Menlo Park, CA), Member

    Thank you! It looks like using '-V' instead of \"-V\" for my jobNative argument did the trick.
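    For anyone hitting the same thing, the difference between the two quoting styles is what argument string actually survives shell expansion. A minimal sketch (printf just stands in for whatever eventually receives the argument):

    ```shell
    # Backslash-escaped double quotes are passed through as literal characters,
    # so the receiving program sees the quote marks as part of the argument.
    printf 'escaped form passes:       %s\n' \"-V\"   # prints: "-V"

    # Single quotes are stripped by the shell, so only the flag itself is passed.
    printf 'single-quoted form passes: %s\n' '-V'     # prints: -V
    ```

    A job runner handed the literal string "-V" (quotes included) will not recognize it as the -V flag, which is presumably why the escaped form misbehaved here.
    
    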

  • jason.harris (Menlo Park, CA), Member
    edited August 2015

    I spoke too soon. Most of the jobs now run to successful completion, but two jobs failed with the same "could not find or load main class" error. How is it possible that 94 jobs found their Java class successfully, but these two did not?

    INFO 17:01:09,886 FunctionEdge - Output written to /hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/URB875A1_metadata/log/SVQScript-65.out
    INFO 17:01:09,898 DrmaaJobRunner - Submitted job id: 32220
    INFO 17:01:09,898 QGraph - 958 Pend, 6 Run, 0 Fail, 94 Done
    ERROR 17:03:09,801 FunctionEdge - Error: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/.queue/tmp' '-cp' '/nfs/projects/home/jharris/Code/svtoolkit/lib/SVToolkit.jar:/nfs/projects/home/jharris/Code/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/nfs/projects/home/jharris/Code/svtoolkit/lib/gatk/Queue.jar' '-cp' '/nfs/projects/home/jharris/Code/svtoolkit/lib/SVToolkit.jar:/nfs/projects/home/jharris/Code/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/nfs/projects/home/jharris/Code/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.IndexReadCountFile' '-I' '/hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/URB875A1_metadata/rccache/URB875A1.chromosome_5.recal.rc.bin' '-O' '/hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/URB875A1_metadata/rccache/URB875A1.chromosome_5.recal.rc.bin.idx'
    ERROR 17:03:09,872 FunctionEdge - Contents of /hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/URB875A1_metadata/log/SVQScript-65.out: Error: Could not find or load main class org.broadinstitute.sv.apps.IndexReadCountFile
    ERROR 17:03:09,877 FunctionEdge - Error: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/.queue/tmp' '-cp' '/nfs/projects/home/jharris/Code/svtoolkit/lib/SVToolkit.jar:/nfs/projects/home/jharris/Code/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/nfs/projects/home/jharris/Code/svtoolkit/lib/gatk/Queue.jar' '-cp' '/nfs/projects/home/jharris/Code/svtoolkit/lib/SVToolkit.jar:/nfs/projects/home/jharris/Code/svtoolkit/lib/gatk/GenomeAnalysisTK.jar:/nfs/projects/home/jharris/Code/svtoolkit/lib/gatk/Queue.jar' 'org.broadinstitute.sv.apps.IndexReadCountFile' '-I' '/hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/URB875A1_metadata/rccache/URB875A1.chromosome_4.recal.rc.bin' '-O' '/hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/URB875A1_metadata/rccache/URB875A1.chromosome_4.recal.rc.bin.idx'
    ERROR 17:03:09,913 FunctionEdge - Contents of /hpc/research/users/jharris/SVModule/GeisingerGold/Genome_STRiP/URB875A1_metadata/log/SVQScript-63.out: Error: Could not find or load main class org.broadinstitute.sv.apps.IndexReadCountFile
    INFO 17:03:09,914 QGraph - Writing incremental jobs reports...
    INFO 17:03:09,949 QGraph - 958 Pend, 4 Run, 2 Fail, 94 Done

    By the way, job execution continued after those error messages were written. The final tally was:
    INFO 19:05:39,973 QCommandLine - Script failed: 952 Pend, 0 Run, 2 Fail, 104 Done

    Also, the class does seem to exist:
    $ unzip -v ${SV_DIR}/lib/SVToolkit.jar | grep IndexReadCountFile
        1900  Defl:N     934  51%  07-03-2015 11:16  d5807afb  org/broadinstitute/sv/apps/IndexReadCountFile.class

  • skashin, Member ✭✭

    Are you still dispatching all the jobs to the same host? It might be that the host executing the jobs has intermittent errors accessing the SV toolkit installation directory over NFS.
    If you rerun the top-level script, Queue will re-execute the 2 jobs that failed.
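    One way to test the NFS theory, as a rough sketch, is to probe the jar repeatedly from the suspect host and watch for intermittent failures (the fallback path below is a placeholder; set SV_DIR to your actual install):

    ```shell
    # probe_jar succeeds only if the jar's central directory can be listed,
    # which exercises an actual read over the NFS mount.
    probe_jar() {
        unzip -l "$1" > /dev/null 2>&1
    }

    JAR="${SV_DIR:-/path/to/svtoolkit}/lib/SVToolkit.jar"
    for i in 1 2 3; do
        if probe_jar "$JAR"; then
            echo "attempt $i: readable"
        else
            echo "attempt $i: FAILED"
        fi
        sleep 1
    done
    ```

    Running this in a loop from each compute host (e.g. via your scheduler) would show whether the mount drops out sporadically, which would explain why only a couple of otherwise-identical jobs hit the class-loading error.
    
    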

  • jason.harris (Menlo Park, CA), Member

    When you say "rerun the top-level script", do you mean simply repeat the same command line? I didn't think that would skip over jobs that already completed. I tried repeating my command line without the -run argument, and it labeled all 1000+ jobs as "Pending", suggesting that it was going to redo everything (which won't help me if I can continue to expect some random fraction of the jobs to fail).

    Is there documentation available on how to resume a run such that only failed or pending jobs get resubmitted to the queue?
