Getting lots of "PAPI error code 10. Message: 14" errors

ebanksebanks Broad InstituteMember, Broadie, Dev ✭✭✭✭

Many of the jobs I launched last night have failed with "PAPI error code 10. Message: 14" errors. First, can we change this error message to something more helpful? And do we know what this error means and why it's happening?
(Note: it should have nothing to do with pre-emptibles, because I'm not using them)

Here's an example:
"message: Task BamToUnmappedBams.RevertSam:NA:1 failed. The job was stopped before the command finished. PAPI error code 10. Message: 14: VM ggp-11800192118132082400 stopped unexpectedly."
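
For anyone triaging many failed jobs at once, here is a minimal sketch of pulling the task name and the two numeric codes out of a failure message shaped like the example above. The `PapiFailure` type and `parse_papi_failure` function are hypothetical names made up for illustration, and the pattern is inferred from this one message only, so other PAPI messages may not match it.

```python
import re
from typing import NamedTuple, Optional


class PapiFailure(NamedTuple):
    """Fields extracted from one failure message (hypothetical helper type)."""
    task: str
    error_code: int
    message_code: int
    detail: str


# Pattern inferred from the single example message in this thread (an assumption).
_PATTERN = re.compile(
    r"Task (?P<task>\S+) failed\..*?"
    r"PAPI error code (?P<error_code>\d+)\. "
    r"Message: (?P<message_code>\d+)(?::\s*(?P<detail>.*))?"
)


def parse_papi_failure(message: str) -> Optional[PapiFailure]:
    """Return the parsed fields, or None if the message has a different shape."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return PapiFailure(
        task=match["task"],
        error_code=int(match["error_code"]),
        message_code=int(match["message_code"]),
        detail=(match["detail"] or "").strip(),
    )
```

Grouping the parsed results by `task` or `detail` is one way to check whether every failure is the same "stopped unexpectedly" case or whether several different causes are mixed together.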

Comments

  • KateNKateN Cambridge, MAMember, Broadie, Moderator admin

    I've moved this thread to our feature request section of the forum, so folks can upvote this if they would like to see it implemented as well. Thank you for the suggestion!

  • ebanksebanks Broad InstituteMember, Broadie, Dev ✭✭✭✭

    Hi @KateN,

    I'd like to gently push back on placing this within the feature request section. This wasn't a feature request -- my workflows all failed and I have no clue why. I've been waiting a week for someone to tell me what happened and I still have no information... but now I need people to vote on my ticket in order to get more information?

    The GATK methods team (among others) has been asking for better error messages in FireCloud/Cromwell for pretty much forever. If you need me to get them all to vote for that then I will.

  • KateNKateN Cambridge, MAMember, Broadie, Moderator admin

    You are absolutely correct @ebanks, I misunderstood your question here. I've created a separate feature request post to address your first question:

    can we change this error message to something more helpful?

    To address your second question: this error message typically does have to do with preemption. However, there are other reasons your VM could have been shut down that would still produce this error code. To look into this further, I am going to need the following:

    1. In FireCloud, please share your workspace with [email protected]
    2. In this forum thread, please specify the name of the workspace and the submission & workflow ID where you saw this error message.
  • RuchiRuchi Member, Broadie, Dev admin

    Hey @ebanks ,

    Agreed the error should change to be more clear, and Cromwell is going to work on fixing up multiple PAPI errors in an upcoming release.

    In addition, we've observed with Pipelines API v1 that non-preemptible machines sometimes fail with a preemption message. It's a bug that's been filed with the Pipelines API team. They are not going to make any changes to v1, but they will if this persists with their v2 API. FireCloud should be updating to Pipelines API v2 soon, and we've not seen this behavior with v2 in Cromwell testing. I recommend waiting until the v2 API is in use and then, if the problem recurs, reporting it to the Pipelines team and/or modifying Cromwell to handle the retries directly.
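
As a rough illustration of what "handling the retries directly" could look like, here is a minimal sketch in Python. It is not Cromwell code and does not use any Cromwell or Pipelines API call. The `submit` callable, the `run_with_retries` name, and the choice to match on the text "stopped unexpectedly" are all assumptions made for this example.

```python
from typing import Callable, Optional

# Assumption: a spurious VM stop can be recognized by this text, as in the
# example message earlier in the thread.
UNEXPECTED_STOP_MARKER = "stopped unexpectedly"


def run_with_retries(
    submit: Callable[[], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Run a job, retrying only when the VM stopped unexpectedly.

    `submit` is a hypothetical callable that runs the job once and returns
    None on success or the failure message on error.
    Returns None on success, otherwise the last failure message.
    """
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        last_error = submit()
        if last_error is None:
            return None  # job succeeded
        if UNEXPECTED_STOP_MARKER not in last_error:
            return last_error  # a different failure: do not retry
    return last_error  # retries exhausted
```

Retrying only on the unexpected-stop message keeps genuine task failures, such as a bad command or a missing input, from being rerun pointlessly.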

    Let me know if I missed anything!

  • KateNKateN Cambridge, MAMember, Broadie, Moderator admin

    Ah, thanks for that clarification, @Ruchi! @ebanks, if her information did not fully answer your question, feel free to still share your workspace. Otherwise, disregard my earlier message.

  • ebanksebanks Broad InstituteMember, Broadie, Dev ✭✭✭✭

    No, this is perfect, thanks. Will report if I see this again when we're using v2.
