
"Failed to properly flush metadata to database" when reading a tsv to a map

tmajarian Member, Broadie
edited November 2017 in Ask the WDL team

I've come across an error message that I'm struggling to understand. The WDL below reads a TSV into a map:

task readMap {
    File tsv

    command {}

    output {
        Map[String,String] map_out = read_map(tsv)
    }
}

workflow w {
    File this_tsv
    call readMap { input: tsv = this_tsv }
}
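For reference, `read_map` expects a two-column, tab-delimited file, where the first column becomes the map key and the second the value. A minimal sketch of the kind of input I'm feeding it, with a toy parser that mimics what `read_map` does (the paths here are illustrative, not my real data):

```python
import csv
import io

# Illustrative TSV content: one key/value pair per line, tab-separated.
tsv_content = "gs://my-bucket/inputs/sample1.gds\tgs://my-bucket/annotations/sample1.csv\n"

def read_map(tsv_text):
    """Toy stand-in for WDL's read_map: first column -> key, second -> value."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return {row[0]: row[1] for row in reader if row}

mapping = read_map(tsv_content)
# mapping == {"gs://my-bucket/inputs/sample1.gds": "gs://my-bucket/annotations/sample1.csv"}
```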

Now, the WDL works fine when reading simple keys/values, but when the TSV contains Google bucket paths (gs:// URIs) the workflow reports success and then fails to flush metadata, hanging indefinitely:

[2017-11-10 20:31:46,96] [info] Slf4jLogger started
[2017-11-10 20:31:47,02] [info] RUN sub-command
[2017-11-10 20:31:47,02] [info]   WDL file: /Users/tmajaria/Documents/projects/topmed/code/topmed-t2d-glycemia-public/methods/dataModel/readMap_test.wdl
[2017-11-10 20:31:47,02] [info]   Inputs: /Users/tmajaria/Documents/projects/topmed/code/topmed-t2d-glycemia-public/methods/dataModel/readMap_test.wdl.json
[2017-11-10 20:31:47,08] [info] SingleWorkflowRunnerActor: Submitting workflow
[2017-11-10 20:31:47,32] [info] Running with database db.url = jdbc:hsqldb:mem:84cd7995-2cec-4f71-8982-88c3d285385e;shutdown=false;hsqldb.tx=mvcc
[2017-11-10 20:31:53,29] [info] Running migration RenameWorkflowOptionsInMetadata with a read batch size of 100000 and a write batch size of 100000
[2017-11-10 20:31:53,30] [info] [RenameWorkflowOptionsInMetadata] 100%
[2017-11-10 20:31:53,41] [info] Metadata summary refreshing every 2 seconds.
[2017-11-10 20:31:53,44] [info] Workflow 82e6b2b0-1d1b-442d-87a8-19fba2294467 submitted.
[2017-11-10 20:31:53,44] [info] SingleWorkflowRunnerActor: Workflow submitted 82e6b2b0-1d1b-442d-87a8-19fba2294467
[2017-11-10 20:31:53,99] [info] 1 new workflows fetched
[2017-11-10 20:31:53,99] [info] WorkflowManagerActor Starting workflow 82e6b2b0-1d1b-442d-87a8-19fba2294467
[2017-11-10 20:31:54,00] [info] WorkflowManagerActor Successfully started WorkflowActor-82e6b2b0-1d1b-442d-87a8-19fba2294467
[2017-11-10 20:31:54,00] [info] Retrieved 1 workflows from the WorkflowStoreActor
[2017-11-10 20:31:54,22] [info] MaterializeWorkflowDescriptorActor [82e6b2b0]: Call-to-Backend assignments: w.readMap -> Local
[2017-11-10 20:31:56,53] [info] WorkflowExecutionActor-82e6b2b0-1d1b-442d-87a8-19fba2294467 [82e6b2b0]: Starting calls: w.readMap:NA:1
[2017-11-10 20:31:56,66] [info] BackgroundConfigAsyncJobExecutionActor [82e6b2b0w.readMap:NA:1]:
[2017-11-10 20:31:56,67] [info] BackgroundConfigAsyncJobExecutionActor [82e6b2b0w.readMap:NA:1]: executing: /bin/bash /Users/tmajaria/Documents/projects/topmed/code/topmed-t2d-glycemia-public/methods/dataModel/cromwell-executions/w/82e6b2b0-1d1b-442d-87a8-19fba2294467/call-readMap/execution/script
[2017-11-10 20:31:56,72] [info] BackgroundConfigAsyncJobExecutionActor [82e6b2b0w.readMap:NA:1]: job id: 77687
[2017-11-10 20:31:56,73] [info] BackgroundConfigAsyncJobExecutionActor [82e6b2b0w.readMap:NA:1]: Status change from - to WaitingForReturnCodeFile
[2017-11-10 20:31:58,19] [info] BackgroundConfigAsyncJobExecutionActor [82e6b2b0w.readMap:NA:1]: Status change from WaitingForReturnCodeFile to Done
[2017-11-10 20:31:58,58] [info] WorkflowExecutionActor-82e6b2b0-1d1b-442d-87a8-19fba2294467 [82e6b2b0]: Workflow w complete. Final Outputs:
{
  "w.readMap.map_out": {
    "gs://fc-fa093e72-dbcb-4028-ae82-609a79ced51a/4948058f-342e-4712-ab7c-ce88489a3698/w/9307fd63-67d3-4608-856e-c4655844d4b2/call-read/fc-d960a560-7e5c-4083-b61e-b2ea71ae5b14/passgt.minDP10-gds/freeze4.chr20.pass.gtonly.minDP10.genotypes.gds": "gs://fc-fa093e72-dbcb-4028-ae82-609a79ced51a/c98f10c6-089b-4673-8075-a53dc619fdd6/wf/ef457927-aac6-438f-8ea0-34debe957271/call-parse/shard-0/freezes_2a_3a_4.snp_indel.annotated.general20170422.subset.gz.chr20.csv"
  }
}
[2017-11-10 20:31:58,60] [info] WorkflowManagerActor WorkflowActor-82e6b2b0-1d1b-442d-87a8-19fba2294467 is in a terminal state: WorkflowSucceededState
[2017-11-10 20:32:03,46] [error] Failed to properly flush metadata to database
java.sql.BatchUpdateException: data exception: string data, right truncation;  table: METADATA_ENTRY column: METADATA_KEY
    at org.hsqldb.jdbc.JDBCPreparedStatement.executeBatch(Unknown Source)
    at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:128)
    at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeBatch(HikariProxyPreparedStatement.java)
    at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl$MultiInsertAction$$anonfun$run$10.apply(JdbcActionComponent.scala:533)
    at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl$MultiInsertAction$$anonfun$run$10.apply(JdbcActionComponent.scala:527)
    at slick.jdbc.JdbcBackend$SessionDef$class.withPreparedStatement(JdbcBackend.scala:372)
    at slick.jdbc.JdbcBackend$BaseSession.withPreparedStatement(JdbcBackend.scala:434)
    at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl.preparedInsert(JdbcActionComponent.scala:502)
    at slick.jdbc.JdbcActionComponent$InsertActionComposerImpl$MultiInsertAction.run(JdbcActionComponent.scala:527)
    at slick.jdbc.JdbcActionComponent$SimpleJdbcProfileAction.run(JdbcActionComponent.scala:31)
    at slick.jdbc.JdbcActionComponent$SimpleJdbcProfileAction.run(JdbcActionComponent.scala:28)
    at slick.dbio.DBIOAction$$anon$4$$anonfun$run$3.apply(DBIOAction.scala:240)
    at slick.dbio.DBIOAction$$anon$4$$anonfun$run$3.apply(DBIOAction.scala:240)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at slick.dbio.DBIOAction$$anon$4.run(DBIOAction.scala:240)
    at slick.dbio.DBIOAction$$anon$4.run(DBIOAction.scala:238)
    at slick.dbio.SynchronousDatabaseAction$FusedAndThenAction$$anonfun$run$4.apply(DBIOAction.scala:534)
    at slick.dbio.SynchronousDatabaseAction$FusedAndThenAction$$anonfun$run$4.apply(DBIOAction.scala:534)
    at scala.collection.Iterator$class.foreach(Iterator.scala:893)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at slick.dbio.SynchronousDatabaseAction$FusedAndThenAction.run(DBIOAction.scala:534)
    at slick.dbio.SynchronousDatabaseAction$$anon$11.run(DBIOAction.scala:571)
    at slick.basic.BasicBackend$DatabaseDef$$anon$2.liftedTree1$1(BasicBackend.scala:240)
    at slick.basic.BasicBackend$DatabaseDef$$anon$2.run(BasicBackend.scala:240)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
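My guess at what's going on (an assumption, not something I've confirmed in the Cromwell source): each map key seems to get folded into the metadata key Cromwell writes, something like `outputs:w.readMap.map_out:<tsv key>`, and with long gs:// paths as keys that string overflows the METADATA_KEY column in the in-memory HSQLDB. A rough sketch of the length arithmetic, assuming a hypothetical 255-character column limit (the real column size and key format may differ):

```python
# The map key from the workflow output above (a full gs:// path).
gs_key = (
    "gs://fc-fa093e72-dbcb-4028-ae82-609a79ced51a/"
    "4948058f-342e-4712-ab7c-ce88489a3698/w/"
    "9307fd63-67d3-4608-856e-c4655844d4b2/call-read/"
    "fc-d960a560-7e5c-4083-b61e-b2ea71ae5b14/passgt.minDP10-gds/"
    "freeze4.chr20.pass.gtonly.minDP10.genotypes.gds"
)

# Hypothetical metadata key: output name prefix plus the map key.
metadata_key = "outputs:w.readMap.map_out:" + gs_key

ASSUMED_COLUMN_LIMIT = 255  # illustrative VARCHAR limit, not verified
print(len(metadata_key), len(metadata_key) > ASSUMED_COLUMN_LIMIT)
```

If that's right, it would explain why plain short keys work: the combined metadata key stays under the column limit, while full bucket paths push it over and trigger the "string data, right truncation" batch failure.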

Any ideas would be great!

Thanks.
