StorageException when trying to use cloud_sql_proxy, CloudSQL (mysql), and cromwell server

mikexing Member
Hi,

I'm trying to run a Cromwell server (v38, Google PAPIv2) with a MySQL connection to a Cloud SQL database through cloud_sql_proxy.

*) This requires a non-SSL localhost connection between Cromwell and cloud_sql_proxy (the proxy itself handles SSL and certificates for the connection to the Cloud SQL server):

```
./cloud_sql_proxy -instances=__my-project__:australia-southeast1:_host_=tcp:3306
```
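Before starting Cromwell, it can help to confirm that the proxy is actually accepting connections on the local port. The following is a minimal sketch in Python, assuming the proxy listens on 127.0.0.1:3306 as in the command above. The helper name `proxy_is_listening` is hypothetical, and it only checks that a TCP connection can be opened, not that MySQL authentication succeeds.

```python
import socket


def proxy_is_listening(host: str = "127.0.0.1", port: int = 3306,
                       timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: it does not speak the MySQL protocol, so a True
    result only means the port is open (for example, by cloud_sql_proxy).
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `proxy_is_listening()` with no arguments probes the default MySQL port on localhost; pass a different `port` if the proxy was started with another `tcp:` value.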

*) Cromwell server starts up fine and is able to read/write the database:

```
$  java -Dconfig.file=google-38-sql.conf -jar ../../cromwell-38.jar server
2019-03-27 15:24:59,795 INFO - Running with database db.url = jdbc:mysql://localhost/bolt?rewriteBatchedStatements=true&useSSL=false
2019-03-27 15:25:04,175 INFO - Successfully acquired change log lock
2019-03-27 15:25:05,708 INFO - Reading from bolt.DATABASECHANGELOG
2019-03-27 15:25:05,972 INFO - Successfully released change log lock
2019-03-27 15:25:06,034 INFO - Running with database db.url = jdbc:mysql://localhost/bolt?rewriteBatchedStatements=true&useSSL=false
2019-03-27 15:25:06,513 INFO - Successfully acquired change log lock
2019-03-27 15:25:06,768 INFO - Reading from bolt.SQLMETADATADATABASECHANGELOG
2019-03-27 15:25:06,893 INFO - Successfully released change log lock
2019-03-27 15:25:07,343 INFO - Slf4jLogger started
2019-03-27 15:25:07,502 cromwell-system-akka.dispatchers.engine-dispatcher-5 INFO - Workflow heartbeat configuration:
{
"cromwellId" : "cromid-068169e",
"heartbeatInterval" : "2 minutes",
"ttl" : "10 minutes",
"writeBatchSize" : 10000,
"writeThreshold" : 10000
}
```

*) The config file database section is:

```
database {
profile = "slick.jdbc.MySQLProfile$"
db {
driver = "com.mysql.cj.jdbc.Driver"
url = "jdbc:mysql://localhost/bolt?rewriteBatchedStatements=true&useSSL=false"
user = "..."
password = "..."
connectionTimeout = 5000
}
}
```
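To double-check what the `url` setting above actually points at, the JDBC URL can be split into its parts. This is a rough sketch in Python using only the standard library; the function name `describe_jdbc_url` is hypothetical, and falling back to port 3306 assumes the MySQL default when the URL gives no port.

```python
from urllib.parse import urlsplit, parse_qs


def describe_jdbc_url(jdbc_url: str) -> dict:
    """Split a jdbc:mysql://... URL into host, port, database and useSSL.

    Hypothetical helper for inspection only; it does not open a connection.
    """
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Drop the 'jdbc:' prefix so the rest parses as an ordinary URL.
    parts = urlsplit(jdbc_url[len("jdbc:"):])
    # Keep the last value if a parameter is repeated in the query string.
    params = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or 3306,  # assumption: MySQL default port
        "database": parts.path.lstrip("/"),
        "useSSL": params.get("useSSL"),  # None if the parameter is absent
    }


info = describe_jdbc_url(
    "jdbc:mysql://localhost/bolt?rewriteBatchedStatements=true&useSSL=false"
)
print(info)
```

For the URL in this config, the result shows host `localhost`, database `bolt`, and `useSSL` set to `"false"`, i.e. a plain connection to the local proxy.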

*) I submit a job to this server and everything looks OK, with no errors:

```
java -Dconfig.file=google-38-sql.conf -jar ../../cromwell-38.jar submit hello.wdl -i hello.inputs
[2019-03-27 15:25:34,76] [info] Slf4jLogger started
[2019-03-27 15:25:35,65] [warn] No caching TTL defined. Using default value Ttl(30 seconds).
[2019-03-27 15:25:35,68] [warn] No caching TTL defined. Using default value Ttl(30 seconds).
[2019-03-27 15:25:35,68] [warn] No caching TTL defined. Using default value Ttl(30 seconds).
[2019-03-27 15:25:35,68] [warn] No caching TTL defined. Using default value Ttl(30 seconds).
[2019-03-27 15:25:36,37] [info] Workflow bcc6b4be-bc4f-4331-a374-4a320f8eb94c submitted to http://localhost:8000
```

*) But before any 'pipeline' shows as running in the Google Cloud console, I get the following failure message in the job 'metadata' from Cromwell:

```
......
"status": "Failed",
"failures": [
{
... ...
"causedBy": [],
"message": "unable to find valid certification path to requested target"
}
],
"message": "PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target"
}
],
"message": "sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target"
}
],
"message": "sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target"
}
],
"message": "[Attempted 5 time(s)] - StorageException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target"
}
],
"message": "Workflow failed"
}
],
"end": "2019-03-27T05:26:24.880Z",
"start": "2019-03-27T05:25:36.945Z"
}
```
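The nested `failures` / `causedBy` lists in the metadata above are hard to read by eye. As a minimal sketch, the Python function below walks such a list and returns only the innermost messages. The field names come from the output above; the function name `root_causes` and the abbreviated `SAMPLE` structure are hypothetical, hand-built to mirror the nesting rather than copied from a full metadata response.

```python
def root_causes(failures):
    """Return the messages of the deepest entries in a failures list.

    Each entry is expected to look like {"message": ..., "causedBy": [...]},
    as in the metadata excerpt above.
    """
    causes = []
    for failure in failures:
        nested = failure.get("causedBy") or []
        if nested:
            # Descend until an entry has no further causes.
            causes.extend(root_causes(nested))
        else:
            causes.append(failure.get("message", ""))
    return causes


# Abbreviated, hand-built sample that mirrors the nesting shown above.
SAMPLE = [
    {
        "message": "Workflow failed",
        "causedBy": [
            {
                "message": "[Attempted 5 time(s)] - StorageException: ...",
                "causedBy": [
                    {
                        "message": "unable to find valid certification path to requested target",
                        "causedBy": [],
                    }
                ],
            }
        ],
    }
]

print(root_causes(SAMPLE))
```

In practice the input would be the `failures` value parsed from the workflow's metadata JSON, for example with `json.loads`.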

*) The server output gives no indication of a problem; it only logs the submitted workflow:

```
......
2019-03-27 15:25:36,212 cromwell-system-akka.dispatchers.api-dispatcher-22 INFO - Unspecified type (Unspecified version) workflow bcc6b4be-bc4f-4331-a374-4a320f8eb94c submitted
```

*) If I submit a second identical job to the server, the submit command now bombs out with an error:

```
java -Dconfig.file=google-38-sql.conf -jar ../../cromwell-38.jar submit hello.wdl -i hello.inputs
[2019-03-27 15:53:21,22] [info] Slf4jLogger started
[2019-03-27 15:53:22,10] [warn] No caching TTL defined. Using default value Ttl(30 seconds).
[2019-03-27 15:53:22,10] [warn] No caching TTL defined. Using default value Ttl(30 seconds).
[2019-03-27 15:53:22,11] [warn] No caching TTL defined. Using default value Ttl(30 seconds).
[2019-03-27 15:53:22,11] [warn] No caching TTL defined. Using default value Ttl(30 seconds).
[2019-03-27 15:53:22,43] [info] Workflow 2991742a-801c-4bdd-8f2f-4d74c8ba58d4 submitted to http://localhost:8000
[ERROR] [03/27/2019 15:53:22.484] [SubmitSystem-akka.actor.default-dispatcher-14] [akka://SubmitSystem/system/pool-master] connection pool for PoolGateway(hcps = HostConnectionPoolSetup(localhost,8000,ConnectionPoolSetup(ConnectionPoolSettings(4,0,5,32,1,100 milliseconds,2 minutes,30 seconds,ClientConnectionSettings(Some(User-Agent: akka-http/10.1.7),40 seconds,1 minute,512,None,WebSocketSettings(,ping,Duration.Inf,akka.http.impl.settings.WebSocketSettingsImpl$$$Lambda$433/[email protected]),List(),ParserSettings(2048,16,64,64,8192,64,8388608,8388608,256,1048576,Strict,RFC6265,true,Set(),Full,Error,Map(If-Range -> 0, If-Modified-Since -> 0, If-Unmodified-Since -> 0, default -> 12, Content-MD5 -> 0, Date -> 0, If-Match -> 0, If-None-Match -> 0, User-Agent -> 32),false,true,akka.util.ConstantFun$$$Lambda$286/[email protected],akka.util.ConstantFun$$$Lambda$286/[email protected],akka.util.ConstantFun$$$Lambda$287/[email protected]),None,TCPTransport),New,1 second),[email protected],[email protected]))) has shut down unexpectedly
```

*) The server log just shows the new workflow submitted but no error:

```
2019-03-27 15:53:22,363 cromwell-system-akka.dispatchers.api-dispatcher-270 INFO - Unspecified type (Unspecified version) workflow 2991742a-801c-4bdd-8f2f-4d74c8ba58d4 submitted
```

*) If I submit the same job to a server that has the above database stanza commented out (e.g., running with HSQLDB), the job completes successfully and the server can continue running subsequent jobs. So the problem seems to be related to the MySQL setup.

*) Has anyone seen this problem? Is it related to the 'useSSL=false' setting, with some Cromwell DB connection expecting SSL certificates?

Answers

  • mikexing Member

    OK, solved by creating a new MySQL Cloud SQL instance with default networking.

    The problem was related to configuring SSL and certificates on the MySQL server when the Cloud SQL proxy already manages them...

    M
