ConcatenateLogsFunction Error

dbottomly (Oregon Health and Science University; Member)

Hi all:

When using IndelRealigner with Queue (v3.3), we get an error from ConcatenateLogsFunction because one of the log files is missing:

ERROR 15:07:41,484 FunctionEdge - Error: Concat: List(/share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/scatter/scatter.out, /share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/temp_1_of_5/realigned.bam.out, /share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/temp_2_of_5/realigned.bam.out, /share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/temp_3_of_5/realigned.bam.out, /share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/temp_4_of_5/realigned.bam.out, /share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/temp_5_of_5/realigned.bam.out, /share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/gather-out/gather-realigned.bam.out) > /share/data/resources/gatk_v3.3/tests/scala_test_out/realigned.bam.out
org.broadinstitute.gatk.queue.QException: Unable to find log: /share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/gather-out/gather-realigned.bam.out
at org.broadinstitute.gatk.queue.function.scattergather.ConcatenateLogsFunction.run(ConcatenateLogsFunction.scala:50)
at org.broadinstitute.gatk.queue.engine.InProcessRunner.start(InProcessRunner.scala:53)
at org.broadinstitute.gatk.queue.engine.FunctionEdge.start(FunctionEdge.scala:84)
at org.broadinstitute.gatk.queue.engine.QGraph.runJobs(QGraph.scala:434)
at org.broadinstitute.gatk.queue.engine.QGraph.run(QGraph.scala:156)
at org.broadinstitute.gatk.queue.QCommandLine.execute(QCommandLine.scala:171)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.queue.QCommandLine$.main(QCommandLine.scala:62)
at org.broadinstitute.gatk.queue.QCommandLine.main(QCommandLine.scala)

The contents of the folder /share/data/resources/gatk_v3.3/tests/.queue/scatterGather/gatk-2-sg/gather-out/ are: gather-realigned.bam.utt

So it appears that the name of this file is getting mangled at some point by Queue: the log on disk ends in .utt rather than the expected .out. The other parts of the pipeline we have tried so far (BaseRecalibrator, RealignerTargetCreator) seem to work, so we are not sure whether this is specific to BAM output.
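
A quick way to compare what Queue expects with what is actually on disk is sketched below. It is only an illustration, not part of Queue: the helper name `find_missing_logs` is hypothetical, and the paths passed to it would be copied by hand from the `Concat: List(...)` line of the error.

```python
from pathlib import Path


def find_missing_logs(expected_paths):
    """Map each expected log that is missing to the file names found in its directory.

    Hypothetical helper: takes the paths listed in the Queue error message and
    reports, for every path that is not a regular file, what is present instead.
    """
    report = {}
    for raw in expected_paths:
        path = Path(raw)
        if path.is_file():
            continue  # this log exists, nothing to report
        parent = path.parent
        # List the directory so a renamed or truncated log name stands out.
        present = sorted(p.name for p in parent.iterdir()) if parent.is_dir() else []
        report[str(path)] = present
    return report
```

Run against the paths above, this should list `gather-realigned.bam.out` as missing and show `gather-realigned.bam.utt` as the file present in `gather-out/`.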

We are using Queue/GATK (3.3-0-geee94ec), cloned from the gatk-protected repository, together with a custom jobRunner for HTCondor. We can provide additional info as needed.

Any thoughts would be appreciated.

Thanks,

Dan

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi Dan,

    This is a really weird error. Have you checked that it reproduces consistently?

  • dbottomly (Oregon Health and Science University; Member)

    Hi Geraldine:

    I agree and there is no obvious (to me) cause in the code I have gone through so far.

    Yes, I've run it several times on different days with different variations of the BAM file name, and it occurs consistently, in one form or another depending on the file names provided.

    I don't want to use up your valuable time if this is an issue that is specific to our situation. Do you have any additional Queue debugging tips that could help?

    Thanks again,

    Dan

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi Dan,

    Maybe try testing another version of GATK/Queue just in case it's a problem that was introduced recently, which would help narrow things down. Otherwise, I'm afraid I can't think of anything that would help. Our resident Queue expert @kshakir is super busy on a high priority project, so I don't think he can take any time to look at this (but I'm at-mentioning him just in case -- no pressure, Khalid).

  • dbottomly (Oregon Health and Science University; Member)

    Hi Geraldine:

    Thanks for the advice. Unfortunately, the issue was still there after downgrading to v3.2. After some more testing and looking through the code, this does not appear to be a problem with Queue itself, as far as I can tell, but perhaps with DRMAA. Sorry for the noise.

    For those interested, my current workaround is to supply the output and error redirection paths directly to HTCondor through the native specification rather than through the DRMAA interface. Since HTCondor does not seem to have a mechanism for combining stdout and stderr into a single file, specifying both the output and error files as part of the QFunctions seems to allow the files to be handled appropriately.
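
    As a rough illustration of that workaround, the sketch below builds a native specification string that carries the redirection paths. `output` and `error` are standard HTCondor submit description commands; everything else is an assumption made for illustration: the function name `condor_native_spec` is hypothetical, the newline-separated layout assumes the DRMAA layer forwards native specification lines into the submit description, and this is not necessarily how the actual driver does it.

```python
def condor_native_spec(stdout_path, stderr_path, extra_lines=()):
    """Build a native specification that redirects stdout and stderr separately.

    Hypothetical helper. Assumes the DRMAA layer passes each line of the
    native specification through as an HTCondor submit description command.
    """
    # The two streams are kept in separate files, so the paths must differ.
    if stdout_path == stderr_path:
        raise ValueError("stdout and stderr need distinct files")
    lines = [f"output = {stdout_path}", f"error = {stderr_path}"]
    lines.extend(extra_lines)  # any other submit commands the site requires
    return "\n".join(lines)
```

    With this approach the output and error paths would be left unset on the DRMAA job template itself, so that only HTCondor handles the redirection.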

    Thanks again for your help,

    Dan

  • bbimber (Home; Member)

    Hi Dan,

    I'm interested in Queue/HTCondor as well. Would you be willing to share more details on your HTCondor JobRunner?

    Thanks,
    Ben

  • dbottomly (Oregon Health and Science University; Member)

    Hi Ben:

    Sure. We put together a basic GitHub repo with the current version of the driver, along with basic instructions on how to get up and running with GATK/Queue v3.3: https://github.com/biodev/HTCondor_drivers

    It is fairly basic but hopefully it is useful.

    Dan
