
haplotypecaller_gvcf_gatk4 failed to delocalize files


I ran haplotypecaller_gvcf_gatk4 on four samples in the same workspace. All but one finished the analysis. The message for the failed sample (all 50 shards) is as follows:

Task HaplotypeCallerGvcf_GATK4.HaplotypeCaller:5:1 failed. Job exit code 1. Check gs://fc-8da20bb3-0689-423f-b94b-8c196afd7a82/17ad2991-dab2-45c7-a68f-d5a31995f9c4/HaplotypeCallerGvcf_GATK4/5c9a1b93-7f39-424f-950e-70542e19e0b1/call-HaplotypeCaller/shard-5/HaplotypeCaller-5-stderr.log for more information. PAPI error code 5. Message: 10: Failed to delocalize files: failed to copy the following files: "/mnt/local-disk/HB3hg382.g.vcf.gz -> gs://fc-8da20bb3-0689-423f-b94b-8c196afd7a82/17ad2991-dab2-45c7-a68f-d5a31995f9c4/HaplotypeCallerGvcf_GATK4/5c9a1b93-7f39-424f-950e-70542e19e0b1/call-HaplotypeCaller/shard-5/HB3hg382.g.vcf.gz (cp failed: gsutil -q -m cp -L /var/log/google-genomics/out.log /mnt/local-disk/HB3hg382.g.vcf.gz gs://fc-8da20bb3-0689-423f-b94b-8c196afd7a82/17ad2991-dab2-45c7-a68f-d5a31995f9c4/HaplotypeCallerGvcf_GATK4/5c9a1b93-7f39-424f-950e-70542e19e0b1/call-HaplotypeCaller/shard-5/HB3hg382.g.vcf.gz, command failed: CommandException: No URLs matched: /mnt/local-disk/HB3hg382.g.vcf.gz\nCommandException: 1 file/object could not be transferred.\n)"

Please help.

Thanks a lot.


Best Answers


  • KateN, Cambridge, MA (Member, Broadie, Moderator, admin)

    That error message, while long and complex-looking, is actually telling you to open a specific file for more information:

    Check gs://fc-8da20bb3-0689-423f-b94b-8c196afd7a82/17ad2991-dab2-45c7-a68f-d5a31995f9c4/HaplotypeCallerGvcf_GATK4/5c9a1b93-7f39-424f-950e-70542e19e0b1/call-HaplotypeCaller/shard-5/HaplotypeCaller-5-stderr.log for more information.

    When you open that file, it should have a message giving you further information about why your workflow failed.
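For long failure messages like the one above, the log path can be pulled out programmatically. The following is a minimal sketch, assuming the message always contains the phrase "Check gs://... for more information" as it does in this thread; the function name `stderr_log_path` and the regular expression are illustrative, not part of Cromwell or FireCloud.

```python
import re

# Matches the "Check gs://... for more information" hint seen in the failure
# message quoted in this thread (assumed format, not an official contract).
STDERR_HINT = re.compile(r"Check (gs://\S+?stderr\.log) for more information")


def stderr_log_path(message):
    """Return the gs:// path of the stderr log named in the message, or None."""
    match = STDERR_HINT.search(message)
    return match.group(1) if match else None


# Shortened copy of the message from the original post.
message = (
    "Task HaplotypeCallerGvcf_GATK4.HaplotypeCaller:5:1 failed. Job exit code 1. "
    "Check gs://fc-8da20bb3-0689-423f-b94b-8c196afd7a82/"
    "17ad2991-dab2-45c7-a68f-d5a31995f9c4/HaplotypeCallerGvcf_GATK4/"
    "5c9a1b93-7f39-424f-950e-70542e19e0b1/call-HaplotypeCaller/shard-5/"
    "HaplotypeCaller-5-stderr.log for more information. PAPI error code 5."
)

print(stderr_log_path(message))
```

The printed path can then be read with `gsutil cat <path>` or opened from the workspace bucket in the Google Cloud console.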

  • Thank you for pointing out the stderr.log.

    I cannot tell what went wrong from the file (arguments at the end omitted):

    Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/cromwell_root/tmp.yGGSSl
    01:51:51.886 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/build/install/gatk/lib/gkl-0.8.3.jar!/com/intel/gkl/native/libgkl_compression.so
    01:51:52.308 INFO HaplotypeCaller - ------------------------------------------------------------
    01:51:52.308 INFO HaplotypeCaller - The Genome Analysis Toolkit (GATK) v4.0.1.2
    01:51:52.308 INFO HaplotypeCaller - For support and documentation go to https://software.broadinstitute.org/gatk/
    01:51:52.309 INFO HaplotypeCaller - Executing as [email protected] on Linux v4.9.0-0.bpo.5-amd64 amd64
    01:51:52.309 INFO HaplotypeCaller - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_131-8u131-b11-2ubuntu1.16.04.3-b11
    01:51:52.309 INFO HaplotypeCaller - Start Date/Time: February 22, 2018 1:51:51 AM UTC
    01:51:52.309 INFO HaplotypeCaller - ------------------------------------------------------------
    01:51:52.309 INFO HaplotypeCaller - ------------------------------------------------------------
    01:51:52.310 INFO HaplotypeCaller - HTSJDK Version: 2.14.1
    01:51:52.310 INFO HaplotypeCaller - Picard Version: 2.17.2
    01:51:52.310 INFO HaplotypeCaller - HTSJDK Defaults.COMPRESSION_LEVEL : 1
    01:51:52.310 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    01:51:52.311 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    01:51:52.311 INFO HaplotypeCaller - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    01:51:52.311 INFO HaplotypeCaller - Deflater: IntelDeflater
    01:51:52.311 INFO HaplotypeCaller - Inflater: IntelInflater
    01:51:52.311 INFO HaplotypeCaller - GCS max retries/reopens: 20
    01:51:52.311 INFO HaplotypeCaller - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
    01:51:52.311 INFO HaplotypeCaller - Initializing engine
    01:51:53.038 INFO IntervalArgumentCollection - Processing 58696488 bp from intervals
    01:51:53.051 INFO HaplotypeCaller - Done initializing engine
    01:51:53.142 INFO HaplotypeCallerEngine - Standard Emitting and Calling confidence set to 0.0 for reference-model confidence output
    01:51:53.142 INFO HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output
    01:51:53.143 INFO HaplotypeCaller - Shutting down engine

    [February 22, 2018 1:51:53 AM UTC] org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCaller done. Elapsed time: 0.02 minutes.
    USAGE: HaplotypeCaller [arguments]

    Call germline SNPs and indels via local re-assembly of haplotypes

    Required Arguments:.....

  • The stderr.log for the samples that completed the run had the following lines after "HaplotypeCallerEngine - All sites annotated with PLs forced to true for reference-model confidence output":

    07:36:30.908 INFO NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/gatk/build/install/gatk/lib/gkl-0.8.3.jar!/com/intel/gkl/native/libgkl_utils.so
    07:36:30.972 INFO NativeLibraryLoader - Loading libgkl_pairhmm_omp.so from jar:file:/gatk/build/install/gatk/lib/gkl-0.8.3.jar!/com/intel/gkl/native/libgkl_pairhmm_omp.so
    07:36:31.053 WARN IntelPairHmm - Flush-to-zero (FTZ) is enabled when running PairHMM
    07:36:31.054 INFO IntelPairHmm - Available threads: 2
    07:36:31.054 INFO IntelPairHmm - Requested threads: 4
    07:36:31.054 WARN IntelPairHmm - Using 2 available threads, but 4 were requested

    instead of "HaplotypeCaller - Shutting down engine".

    Does the warning explain why libgkl_utils.so was not loaded and the engine was shut down?

  • KateN, Cambridge, MA (Member, Broadie, Moderator, admin)

    Would you share the workspace with [email protected]? I'd like to take a look at your stderr log. These logs are often quite verbose, and the relevant line can be hard to spot if it isn't one you've seen before.

    I will also need the name of the workspace as well as the submission ID and workflow ID for the particular submission/workflow you are seeing this error on.

  • mychung3265 (Member)

    Hi Kate,
    I've shared the workspace, fccredits-curium-ecru-4604/Germline-SNPs-Indels-GATK4-hg38_copy.
    Submission ID: b2c0147e-008b-4492-a888-d9a5aa76f165
    workflow ID: f03eea47-d055-4693-8c84-4eff29d176a9

    Thank you!

  • KateN, Cambridge, MA (Member, Broadie, Moderator, admin)

    Thank you for your patience; I've had a chance to look through your workspace, and I found the following error:

    A USER ERROR has occurred: Argument --emitRefConfidence has a bad value: Can only be used in single sample mode currently. Use the sample_name argument to run on a single sample out of a multi-sample BAM file.

    This error was in the stderr file for each of your sharded HaplotypeCaller calls. It appears that your run failed because the workflow was run on a multi-sample BAM. As the error message describes, you can either alter the WDL (the method) to use the sample_name argument, or split your multi-sample BAM into single-sample BAMs and re-run the workflow on each of them.
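Whether a BAM counts as multi-sample depends on the SM (sample) tags in its @RG header lines: more than one distinct SM value means more than one sample. Below is a minimal sketch for checking this, assuming the header has already been dumped to text (for example with `samtools view -H`). The function name `samples_in_header` and the read group values in the example header are hypothetical and are not taken from the workspace discussed here.

```python
def samples_in_header(header_text):
    """Collect the distinct SM (sample) values from the @RG lines of a SAM/BAM header."""
    samples = set()
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue  # only read group lines carry the SM tag
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                samples.add(field[3:])
    return samples


# Hypothetical header: two read groups whose SM values differ, so the
# file would be treated as containing two samples.
header = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@RG\tID:run1.lane1\tSM:HB3\tPL:ILLUMINA",
    "@RG\tID:run2.lane1\tSM:HB3_run2\tPL:ILLUMINA",
])

found = samples_in_header(header)
print(sorted(found))
if len(found) > 1:
    print("More than one SM value: the BAM looks multi-sample.")
```

If every read group reports the same SM value, the set has a single element and the file should be treated as single-sample.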

  • mychung3265 (Member)

    Hi Kate,
    Thank you for replying. HB3 is a single sample with 19 paired FASTQ files from two runs, because the first run produced insufficient reads. Would this cause it to be recognized as a multi-sample BAM file?

    Should I merge the R1 reads and R2 reads prior to paired-end mapping?

    Thanks again!

  • mychung3265 (Member)

    Hi Geraldine and Kate,

    Glad to know the right and efficient way to analyze cases like this. I will split the runs for other similar ones.

    Thank you both very much for your help.
