Error with no Description

Dear GenomeStrip team,

My name is Elbay Aliyev. I work as a Research Specialist at Sidra Medical and Research Center. We have a large project, the 3000 Qatari Genome Project, and we want to use Genome STRiP in our studies. However, we are facing an unexplained error during the Preprocess step, with no stack trace and no description.

Preprocess script:
java -cp ${classpath} ${mx} \
org.broadinstitute.gatk.queue.QCommandLine \
-S ${SV_DIR}/qscript/SVPreprocess.q \
-S ${SV_DIR}/qscript/SVQScript.q \
-gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
--disableJobReport \
-cp ${classpath} \
-configFile conf/genstrip_installtest_parameters.txt \
-tempDir ${SV_TMPDIR} \
-R data/Homo_sapiens_assembly19.fasta \
-genomeMaskFile data/Homo_sapiens_assembly19.svmask.fasta \
-copyNumberMaskFile data/Homo_sapiens_assembly19.gcmask.fasta \
-genderMapFile data/ \
-runDirectory ${runDir} \
-md ${runDir}/metadata \
-disableGATKTraversal \
-useMultiStep \
-reduceInsertSizeDistributions false \
-computeGCProfiles true \
-computeReadCounts true \
-jobLogDir ${runDir}/logs \
-I ${inputFile} \
-P chimerism.use.correction:false \
-run \
|| exit 1

On a couple of steps we get errors like this:
ERROR 16:09:20,504 FunctionEdge - Error: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '' '-cp' '/gpfs/projects/tmedicine/ealiyev/genomestrip/lib/SVToolkit.jar:/gpfs/projects/tmedicine/ealiyev/genomestrip/lib/gatk/GenomeAnalysisTK.jar:/gpfs/projects/tmedicine/ealiyev/genomestrip/lib/gatk/Queue.jar' '-cp' '/gpfs/projects/tmedicine/ealiyev/genomestrip/lib/SVToolkit.jar:/gpfs/projects/tmedicine/ealiyev/genomestrip/lib/gatk/GenomeAnalysisTK.jar:/gpfs/projects/tmedicine/ealiyev/genomestrip/lib/gatk/Queue.jar' '' '-O' '/gpfs/projects/tmedicine/ealiyev/genomestrip/PMC01/PMC01/metadata/profiles_100Kb/profile_seq_GL000232.1_100000.dat.gz' '-I' 'PMC01/metadata/headers.bam' '-configFile' 'conf/genstrip_installtest_parameters.txt' '-P' 'chimerism.use.correction:false' '-R' 'data/Homo_sapiens_assembly19.fasta' '-L' 'GL000232.1:0-0' '-genomeMaskFile' 'data/Homo_sapiens_assembly19.svmask.fasta' '-md' 'PMC01/metadata' '-profileBinSize' '100000' '-maximumReferenceGapLength' '10000'
ERROR 16:09:20,509 FunctionEdge - Contents of /gpfs/projects/tmedicine/ealiyev/genomestrip/PMC01/PMC01/logs/SVPreprocess-111.out:
INFO 16:04:10,442 HelpFormatter - -------------------------------------------------------------
INFO 16:04:10,445 HelpFormatter - Program Name:
INFO 16:04:10,449 HelpFormatter - Program Args: -O /gpfs/projects/tmedicine/ealiyev/genomestrip/PMC01/PMC01/metadata/profiles_100Kb/profile_seq_GL000232.1_100000.dat.gz -I PMC01/metadata/headers.bam -configFile conf/genstrip_installtest_parameters.txt -P chimerism.use.correction:false -R data/Homo_sapiens_assembly19.fasta -L GL000232.1:0-0 -genomeMaskFile data/Homo_sapiens_assembly19.svmask.fasta -md PMC01/metadata -profileBinSize 100000 -maximumReferenceGapLength 10000
INFO 16:04:10,453 HelpFormatter - Executing as ealiyev@hpcgenomicn26.research.sidra.local on Linux 3.10.0-229.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13.
INFO 16:04:10,454 HelpFormatter - Date/Time: 2017/09/24 16:04:10
INFO 16:04:10,454 HelpFormatter - -------------------------------------------------------------
INFO 16:04:10,454 HelpFormatter - -------------------------------------------------------------
INFO 16:04:10,461 ComputeDepthProfiles - Opening reference sequence ...
INFO 16:04:10,462 ComputeDepthProfiles - Opened reference sequence.
INFO 16:04:10,462 ComputeDepthProfiles - Opening genome mask ...
INFO 16:04:10,463 ComputeDepthProfiles - Opened genome mask.
INFO 16:04:10,465 MetaData - Opening metadata ...
INFO 16:04:10,466 MetaData - Adding metadata location PMC01/metadata ...
INFO 16:04:10,476 MetaData - Opened metadata.
INFO 16:04:10,476 ComputeDepthProfiles - Opened metadata.
INFO 16:04:10,476 ComputeDepthProfiles - Initializing input data set ...
INFO 16:04:10,513 ComputeDepthProfiles - Initialized data set: 1 file, 1 read group, 1 sample.
INFO 16:04:10,518 MetaData - Loading insert size histograms ...
INFO 16:04:11,940 ReadCountCache - Initializing read count cache with 1 file.

INFO 16:04:12,010 CommandLineProgram - Program completed.

Done. There were no warn messages.

It looks like the script did its job perfectly but still reports an error. We see the same issue with SVPreprocess 6 and 11 :(

Thanks in advance.

echo "./" | bsub -n 16 -e /gpfs/projects/tmedicine/ealiyev/genomestrip/PMC01/test.err -o /gpfs/projects/tmedicine/ealiyev/genomestrip/PMC01/test.out -P PMC01.test

We submit to our LSF system like this. Our LSF version is 10.1 and our Java version is 1.8.0_121. We are using the GATK bundled with Genome STRiP (${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar).


  • idraktt (Doha, Member)

    Interestingly, if you run these commands separately, they also generate successful output. And if you rerun the Queue script, it passes these commands successfully. Here is what I did. On the first run, it gave errors on SVPreprocess 6 and 11. I ran those commands separately. When I reran the Queue script, it again failed on one of the commands, so of the 194 Preprocess commands, 192 were successful. I then reran the whole script with the preprocessing step commented out, so that it moved directly to the discovery process. Again it failed on Preprocess-111, so I ran that command separately. I reran the whole discovery process once more, and this time it succeeded, generating normal outputs with genotyping information. The problem is that I spent a lot of time on one sample, and it would be very hard to do all of this for more than one sample. So, dear Genome STRiP team, I really need your help to solve this so that we can use Genome STRiP in our studies.

    Maybe there is some uncaught exception that stops the command from executing. If so, you could add an uncaught exception handler to your source code to catch it.

    Best Regards,
    Elbay Aliyev

  • I am receiving similar errors on our machines here at the CERN Data Centre. I believe some exceptions are caught in a way that hides the errors from us, which limits our debugging options.

    Also, one more thing: increasing the memory in the java call does not, in most cases, make the subprocesses run with more memory, which might also be a cause of the issue (based on the debug logs).

    Looking forward to a solution on this.

    All the best,
    Taghi Aliyev
    Doctoral Student
    CERN OpenLab Knowledge Sharing Projects

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @idraktt @TaghiAliyev

    I moved your question to the GenomeSTRiP section where @bhandsaker can help.


  • bhandsaker (Member, Broadie, Moderator)

    I'd like to respond to two things:

    First, it is not best practice to use the installtest scripts as a model for how to do a real analysis. The installtest scripts take some shortcuts to make the tests run faster.

    In particular, the following settings should be changed:

    -configFile should be set to ${SV_DIR}/conf/genstrip_parameters.txt

    And the following parameters should generally be removed and allowed to assume their default values (the masks will default based on the -R argument, which should point to a reference sequence from a reference metadata bundle supported for use with Genome STRiP):

    -P chimerism.use.correction:false
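As a rough sketch of the two changes listed above, the function below prints a trimmed argument list for the Preprocess call. It assumes the `SV_DIR`, `runDir` and `inputFile` variables from the original script; the fallback values are placeholders, the function name `preprocess_args` is made up for this example, and the remaining arguments of the original command are left out for brevity rather than because they should all be dropped.

```shell
# Print one Preprocess argument per line, with the production config file
# and without "-P chimerism.use.correction:false".
# The fallback values below are placeholders, not real paths.
preprocess_args() {
    sv_dir="${SV_DIR:-/path/to/svtoolkit}"
    run_dir="${runDir:-run1}"
    input_file="${inputFile:-input.bam}"

    printf '%s\n' \
        -S "${sv_dir}/qscript/SVPreprocess.q" \
        -S "${sv_dir}/qscript/SVQScript.q" \
        -gatk "${sv_dir}/lib/gatk/GenomeAnalysisTK.jar" \
        -configFile "${sv_dir}/conf/genstrip_parameters.txt" \
        -R data/Homo_sapiens_assembly19.fasta \
        -runDirectory "${run_dir}" \
        -md "${run_dir}/metadata" \
        -jobLogDir "${run_dir}/logs" \
        -I "${input_file}" \
        -run
}

# Show the argument list for the current environment.
preprocess_args
```

Printing the list before launching Queue makes it easy to confirm that the installtest config file and the chimerism override are really gone.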

    Second, the original problem being reported is a failure mode we sometimes see with Queue. Queue is software that runs pipelines by submitting individual jobs (in dependency order) to a supported jobRunner (e.g. LSF or SGE). In this case, you are trying to run these as shell subprocesses (jobRunner Shell). The symptom you are seeing is that the job probably completed successfully, but Queue does not recognize this. This can happen if something in your environment causes the job to have a non-zero exit status, or something goes awry in the handoff between Queue and the jobRunner (in this case the subshell).
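One way to check for a non-zero exit status is to rerun the failing command by hand and look at the status it returns. The helper below is a minimal sketch of that check; the name `run_and_report` is hypothetical, and the command to pass in would be the one printed in the FunctionEdge error line.

```shell
# Run the given command, print its exit status to stderr, and return
# the same status so callers can still react to it.
run_and_report() {
    "$@"
    status=$?
    echo "exit status: ${status}" >&2
    return "${status}"
}

# Usage (replace the arguments with the command from the error line):
#   run_and_report java -Xmx2048m -cp "${classpath}" ...
```

If the job prints "Program completed." but the reported status is not 0, the problem is more likely in the environment around the job than in the job itself.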

    These problems are usually specific to your compute environment and can be difficult to debug. One suggestion is to simply retry the Queue command. This will cause Queue to only rerun the previously failed jobs, and in many cases the problems are intermittent and the spurious failures will go away. Another suggestion is to try using the jobRunner ParallelShell instead of Shell. The Shell code is not heavily used and seems to have more of these kinds of cryptic failures. ParallelShell is similar, but runs substantially different code. We have used it more and it seems to be more robust. The ParallelShell jobRunner will try to run multiple subshells in parallel (up to the number of cores on the current machine), but you can also throttle it back with -availableProcesssorsMultiplier (use a floating point value between 0 and 1 if you want to throttle back).
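The retry suggestion could be scripted along these lines. This is only a sketch: `retry` is a hypothetical helper, and `run_preprocess.sh` in the usage comment stands for whatever wrapper script holds your Queue command (with `-jobRunner ParallelShell` added to it if you want to try that runner).

```shell
# Run a command up to max_attempts times and stop at the first success.
# Returns 0 on success, 1 if every attempt failed.
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "${attempt}" -le "${max_attempts}" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt ${attempt}/${max_attempts} failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage (run_preprocess.sh is a placeholder for your own wrapper script):
#   retry 3 ./run_preprocess.sh || exit 1
```

Because Queue only reruns the jobs that previously failed, each extra attempt should be much cheaper than the first one.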

  • bhandsaker (Member, Broadie, Moderator)

    Also, in reference to the question about java heap size: Generally the default settings should work well unless you are running really large analyses (thousands of genomes). If you need to change the java memory settings, I believe you need to modify the Queue scripts which set the defaults for each job based on the expected memory usage.
