Update: July 26, 2019
This section of the forum is now closed; we are working on a new support model for WDL that we will share here shortly. For Cromwell-specific issues, see the Cromwell docs and post questions on Github.

Cromwell terminates unexpectedly in Google cloud. UnknownHostException: genomics.googleapis.com

Hi,

I am trying to run a WGS pipeline, but it stops at the FASTQ-to-BAM step with the error "java.net.UnknownHostException: genomics.googleapis.com". A small test FASTQ file completes the whole pipeline in Google Cloud, but when I use a WGS FASTQ file, it fails as shown below. Any advice on why this happens? I don't understand why I am getting this error.

Below is part of the log from the failure. I am using Cromwell version 36.

[2019-01-28 21:42:15,24] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.prefastqc:NA:1]: Status change from - to Running
[2019-01-28 21:43:56,72] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.ScatterIntervalList:NA:1]: Status change from Running to Success
[2019-01-28 23:36:16,31] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.prefastqc:NA:1]: Status change from Running to Success
[2019-01-29 03:11:59,22] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.trim:NA:1]: Status change from Running to Success
[2019-01-29 03:12:00,73] [info] WorkflowExecutionActor-6f4b745c-0c1f-4b6e-84c6-0d768079e2d3 [6f4b745c]: Starting W1.postfastqc, W1.PairedFastQsToUnmappedBAM
[2019-01-29 03:12:03,99] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.PairedFastQsToUnmappedBAM:NA:1]: /gatk/gatk --java-options "-Xmx3000m" \
FastqToSam \
--FASTQ /cromwell_root/pca_binf_test/bill-w1-cromwell-execution/W1/6f4b745c-0c1f-4b6e-84c6-0d768079e2d3/call-trim/NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001.R1.fastq.gz_R1_trim.fq.gz \
--FASTQ2 /cromwell_root/pca_binf_test/bill-w1-cromwell-execution/W1/6f4b745c-0c1f-4b6e-84c6-0d768079e2d3/call-trim/NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001.R1.fastq.gz_R2_trim.fq.gz \
--OUTPUT NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001.unmapped.bam \
--READ_GROUP_NAME NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001 \
--SAMPLE_NAME NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001 \
--LIBRARY_NAME NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001 \
--PLATFORM_UNIT NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001 \
--RUN_DATE 2019-01-25T22:14:37 \
--PLATFORM illumina \
--SEQUENCING_CENTER BI
[2019-01-29 03:12:07,52] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.postfastqc:NA:1]: fastqc -t 4 --outdir $PWD /cromwell_root/pca_binf_test/bill-w1-cromwell-execution/W1/6f4b745c-0c1f-4b6e-84c6-0d768079e2d3/call-trim/NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001.R1.fastq.gz_R1_trim.fq.gz /cromwell_root/pca_binf_test/bill-w1-cromwell-execution/W1/6f4b745c-0c1f-4b6e-84c6-0d768079e2d3/call-trim/NEUCV649UJK_ATTCAGAA_HJY2KCCXX_L1_4_001.R1.fastq.gz_R2_trim.fq.gz
[2019-01-29 03:12:36,15] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.PairedFastQsToUnmappedBAM:NA:1]: job id: operations/EKup76mJLRiajZ6C85PAoG4g6OKq67wbKg9wcm9kdWN0aW9uUXVldWU
[2019-01-29 03:12:36,15] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.postfastqc:NA:1]: job id: operations/EJyp76mJLRjazoGLuIr6o6UBIOjiquu8GyoPcHJvZHVjdGlvblF1ZXVl
[2019-01-29 03:13:07,02] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.PairedFastQsToUnmappedBAM:NA:1]: Status change from - to Running
[2019-01-29 03:13:07,02] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.postfastqc:NA:1]: Status change from - to Running
[2019-01-29 04:30:18,60] [info] Message [cromwell.docker.DockerHashActor$DockerHashFailedResponse] from Actor[akka://cromwell-system/user/HealthMonitorDockerHashActor#-2047665085] to Actor[akka://cromwell-system/deadLetters] was not delivered. [1] dead letters encountered, no more dead letters will be logged. If this is not an expected behavior, then [Actor[akka://cromwell-system/deadLetters]] may have terminated unexpectedly, This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[2019-01-29 04:35:15,79] [info] PipelinesApiAsyncBackendJobExecutionActor [6f4b745cW1.postfastqc:NA:1]: Status change from Running to Success
[2019-01-29 05:43:27,62] [error] The JES API worker actor Actor[akka://cromwell-system/user/SingleWorkflowRunnerActor/JES-Singleton/PAPIQueryManager/PAPIQueryWorker-aba5cceb-6c41-421d-842c-becadaf4269a#-146857563] unexpectedly terminated while conducting 1 polls. Making a new one...
java.net.UnknownHostException: genomics.googleapis.com
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:673)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1334)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1309)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:259)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:77)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
at com.google.api.client.googleapis.batch.BatchRequest.execute(BatchRequest.java:241)
at cromwell.backend.google.pipelines.common.api.PipelinesApiRequestWorker.runBatch(PipelinesApiRequestWorker.scala:56)
at cromwell.backend.google.pipelines.common.api.PipelinesApiRequestWorker.cromwell$backend$google$pipelines$common$api$PipelinesApiRequestWorker$$handleBatch(PipelinesApiRequestWorker.scala:50)
at cromwell.backend.google.pipelines.common.api.PipelinesApiRequestWorker$$anonfun$receive$1.applyOrElse(PipelinesApiRequestWorker.scala:35)
at akka.actor.Actor.aroundReceive(Actor.scala:517)
at akka.actor.Actor.aroundReceive$(Actor.scala:515)
at cromwell.backend.google.pipelines.common.api.PipelinesApiRequestWorker.aroundReceive(PipelinesApiRequestWorker.scala:19)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:588)
at akka.actor.ActorCell.invoke(ActorCell.scala:557)

Answers

  • bshifaw Member, Broadie, Moderator admin

    @mwlee,
    I've moved this thread to the WDL/Cromwell forum. The team here is more familiar with Cromwell error messages and is best placed to help with this one.

  • mcovarr (Cambridge, MA) Member, Broadie, Dev

    It looks like Cromwell was trying to contact PAPI but was unable to resolve the PAPI hostname genomics.googleapis.com from the server on which Cromwell was running. Cromwell relies on DNS lookup working correctly on its server to be able to reach PAPI.
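One way to test the DNS explanation above is to check name resolution directly on the machine where Cromwell runs. The following is a minimal sketch, not part of Cromwell: the function name `resolve_host` and its `resolver` parameter are hypothetical, and it uses only Python's standard `socket.getaddrinfo`.

```python
import socket


def resolve_host(hostname, port=443, resolver=socket.getaddrinfo):
    """Return the sorted IP addresses that hostname resolves to, or [] on DNS failure.

    `resolver` is a hypothetical hook (defaulting to socket.getaddrinfo) so the
    lookup can be replaced in tests.
    """
    try:
        results = resolver(hostname, port)
    except socket.gaierror as exc:
        # socket.gaierror is Python's counterpart of Java's UnknownHostException.
        print(f"could not resolve {hostname}: {exc}")
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({entry[4][0] for entry in results})
```

Calling `resolve_host("genomics.googleapis.com")` on the Cromwell host and getting an empty list would suggest that the host's DNS configuration or network is the problem, rather than the workflow itself.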

  • mwlee Member

    But why does a small sample fastq file manage to complete while a WGS file is having this error?

  • mcovarr (Cambridge, MA) Member, Broadie, Dev

    Can you try again with the WGS file? Errors like this can occur due to temporary network problems.
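Temporary network problems are usually handled by retrying with a growing delay between attempts. The sketch below illustrates that general pattern only; it is not Cromwell's own retry logic, and the name `retry_with_backoff` and all of its parameters are hypothetical.

```python
import time


def retry_with_backoff(operation, max_attempts=5, initial_delay=1.0,
                       factor=2.0, retry_on=(OSError,), sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The delay between attempts is multiplied by `factor` after each failure.
    DNS failures (socket.gaierror) are a subclass of OSError, so the default
    `retry_on` covers them.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the last error.
                raise
            sleep(delay)
            delay *= factor
```

A long-running job has more chances to hit a brief outage than a short one, which is one possible reason a small test could pass while a WGS run fails.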

  • GrantDaly Member
    I have been having a similar issue: small workflows run, but larger workflows fail. I have tried Cromwell 36 and now 37, and I get the same type of error message when running a small WGS workflow.

    I just ran a small test workflow (the test with the WDL runner), which ran correctly, so I think the scale of this WGS workflow may be the issue.

    The relevant error message appears to be "cromwell_driver INFO: Failed to connect to Cromwell (attempt 1): ('Connection aborted.', error(99, 'Cannot assign requested address'))"

    A more extended log follows. The error appears to start fairly early in the workflow. In the interest of brevity, I have included only the lines I think are relevant. I've also removed URLs because the site apparently blacklists them.
    2019/02/12 19:31:58 Listening on [::]:22...
    2019-02-12 19:31:59,844 sys_util INFO: CROMWELL->/cromwell/cromwell.jar
    2019-02-12 19:31:59,844 sys_util INFO: CROMWELL_CONF->/cromwell/jes_template.conf
    2019-02-12 19:31:59,845 discovery INFO: URL being requested: GET
    ...
    2019-02-12 19:32:00,363 discovery INFO: URL being requested: GET ...
    fields=nextPageToken%2Citems%28name%29&prefix=mitochondrial-DNA%2Fmt-DNA-17-09-26%2Faligned-files%2FSL26687&alt=json&maxResults=2
    2019-02-12 19:32:00,463 cromwell_driver INFO: Started Cromwell
    2019-02-12 19:32:00,464 wdl_runner INFO: starting
    2019-02-12 19:32:01,945 INFO - Running with database db.url = jdbc:hsqldb:mem:${slick.uniqueSchema};shutdown=false;hsqldb.tx=mvcc
    2019-02-12 19:32:05,481 cromwell_driver INFO: Failed to connect to Cromwell (attempt 1): ('Connection aborted.', error(99, 'Cannot assign requested address'))
    2019-02-12 19:32:08,505 INFO - Successfully acquired change log lock
    2019-02-12 19:32:10,288 INFO - Creating database history table with name: PUBLIC.DATABASECHANGELOG
    2019-02-12 19:32:10,293 INFO - Reading from PUBLIC.DATABASECHANGELOG
    2019-02-12 19:32:10,489 cromwell_driver INFO: Failed to connect to Cromwell (attempt 2): ('Connection aborted.', error(99, 'Cannot assign requested address'))

    ...

    2019-02-12 19:32:11,229 INFO - changelog.xml: changesets/embiggen_metadata_value.xml::entry_or_journal_existence_xor::mcovarr: ChangeSet changesets/embiggen_metadata_value.xml::entry_or_journal_existence_xor::mcovarr ran successfully in 28ms
    2019-02-12 19:32:11,244 INFO - changelog.xml: changesets/embiggen_metadata_value.xml::embiggen_metadata_entry::mcovarr: Marking ChangeSet: changesets/embiggen_metadata_value.xml::embiggen_metadata_entry::mcovarr ran despite precondition failure due to onFail='MARK_RAN':
    changelog.xml : Table PUBLIC.METADATA_ENTRY does not exist

    ...

    2019-02-12 19:32:11,273 INFO - changelog.xml: changesets/job_store_tinyints.xml::job_store_fix_job_retryable_failure::mcovarr: JOB_STORE.RETRYABLE_FAILURE datatype was changed to BOOLEAN
    2019-02-12 19:32:11,273 INFO - changelog.xml: changesets/job_store_tinyints.xml::job_store_fix_job_retryable_failure::mcovarr: ChangeSet changesets/job_store_tinyints.xml::job_store_fix_job_retryable_failure::mcovarr ran successfully in 1ms
    2019-02-12 19:32:11,274 INFO - changelog.xml: changesets/job_store_tinyints.xml::job_store_not_nullable_job_successful::mcovarr: Null constraint has been added to JOB_STORE.JOB_SUCCESSFUL

    Does anyone have an idea what the issue could be? I have seen some other threads suggesting that the initial node may lack resources. I have modified my pipeline YAML file as follows, so I suspect this isn't the issue.

    resources:
      virtualMachine:
        machineType: n1-standard-4
        bootDiskSizeGb: 30