job supposedly running for 9 hrs, but no output... and no evidence of execution

bhaas, Broad Institute (Member, Broadie)

Hi,

I've got a bunch of jobs that have been running for ~9 hours (and they're expensive jobs, using moderate-mem machines), and I can't find any evidence of them doing anything. The buckets contain only the script, with no stderr, stdout, etc.

An example workflow id, in case that's helpful: 310990d8-3f18-4ee3-b217-10572be61b45

Any ideas? Thx in advance.

Answers

  • bhaas, Broad Institute (Member, Broadie)

    It seems there was just a rather long delay: later on, the information was updated and the bucket was populated.

    Is there a way to see, for any given task, how long it was officially being charged for on the system (as in Google Compute core hours or something)? I'm curious about differences between what I'm seeing in the logs and the FC web reporting, and what I'm seeing in my own logs in the buckets.

  • bhaas, Broad Institute (Member, Broadie)

    For example, I've got a job (workflow id: fc-1f65e310-4bf0-4601-8d9a-1715de51a4cb) that was launched a couple of days ago and crashed a couple of days ago, but it's still in the 'running' status: the system (Cromwell?) is trying to copy a result file back to a bucket, and that output file simply doesn't exist because the job crashed. Usually these jobs get a failure status instead of being locked in run mode, and I'm curious whether this hanging for 2 days is actually accruing charges.
