joint discovery error in 3 samples of WGS

jin0008 Seoul, South Korea Member

When I ran this workflow in FireCloud, I got the following error message:
message: Task JointGenotyping.ImportGVCFs:23:1 failed. JES error code 5. Message: 10: Failed to delocalize files: failed to copy the following files: "/mnt/local-disk/genomicsdb.tar -> gs://fc-675a604c-d4aa-4520-81c1-48ede8300786/4c9747b9-a69f-4270-834d-9f6e3472ab8e/JointGenotyping/cf4bcaaf-d2c2-4436-b814-2d1c3d8f29d2/call-ImportGVCFs/shard-23/genomicsdb.tar (cp failed: gsutil -q -m cp -L /var/log/google-genomics/out.log /mnt/local-disk/genomicsdb.tar gs://fc-675a604c-d4aa-4520-81c1-48ede8300786/4c9747b9-a69f-4270-834d-9f6e3472ab8e/JointGenotyping/cf4bcaaf-d2c2-4436-b814-2d1c3d8f29d2/call-ImportGVCFs/shard-23/genomicsdb.tar, command failed: CommandException: No URLs matched: /mnt/local-disk/genomicsdb.tar\nCommandException: 1 file/object could not be transferred.\n)"
Can you help me?

Answers

  • KateN Cambridge, MA Member, Broadie, Moderator admin
    Accepted Answer

    Usually the "Failed to delocalize files" error doesn't mean that your files could not be copied, but that something prevented the system from finding the files. This can range from something as simple as your workflow not running properly (and thus the files not being generated) to a more complicated bug. The latter is not very common.

    Could you please share the workspace you encountered this error in with [email protected] so I can take a look? I will also need the name of the workspace itself (found in the top right when you are in your workspace), and the Submission ID (found in the Monitor tab when you go to click on the specific submission where you encountered this error).
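
    For readers who hit the same message: since the missing output usually means the task itself failed first, the task's stderr is the place to look. Below is a minimal Python sketch that scans stderr text for lines that often mark the real failure. The pattern list and the function name are assumptions made for illustration, not part of FireCloud or GATK.

```python
import re

# Markers that often point at the real failure inside a task's stderr.
# This list is an illustrative assumption, not an exhaustive catalogue.
ROOT_CAUSE_PATTERNS = [
    re.compile(r"terminate called after throwing"),
    re.compile(r"Exception\b"),
    re.compile(r"\bERROR\b"),
]


def find_root_cause_lines(stderr_text):
    """Return (line_number, stripped_line) pairs that match a failure marker."""
    hits = []
    for number, line in enumerate(stderr_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in ROOT_CAUSE_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

    Running it on the contents of a failed shard's stderr file gives a short list of candidate lines to read first; an empty list suggests the cause lies elsewhere.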

  • jin0008 Seoul, South Korea Member

    I shared my project. Unfortunately, I deleted my original workspace because of the errors above.
    I think my problems might be caused by the initial pre-processing of the data.
    I don't know what the flowcell_unmapped_bam_list file is, so I attached the unaligned BAM list of NA12878.
    How can I produce a flowcell_unmapped_bam_list file?
    I shared my workspace with [email protected]

  • KateN Cambridge, MA Member, Broadie, Moderator admin

    Ah, unfortunately I won't be able to diagnose the error without being able to look at the error message itself. If you encounter it again, please do let me know and I can take a look.

    For any input file that you are unsure about, the author of the method should have some documentation telling you what they expect to be there. You can usually find this in the Documentation section on the Summary page. If there is no documentation, or if the documentation doesn't describe the input you have a question about (in this case flowcell_unmapped_bam_list), you can try looking in the method configuration.

    When an author of a method publishes, they can also choose to include a method configuration. A method configuration is simply all of the inputs and outputs filled out how they expect them to be for a workspace they ran on. If there is a published method configuration for a method, you would find it under the Method Configurations tab when looking at the method you want to use in the Method Repository.

    If neither of these options has the information you need, and the method is one published by us, please let me know the name of the method and I will be sure to fix any lapses in our documentation and find the answer for you.

  • Hi Kate,
    I had a similar problem. Looking into the stderr.log in the shard, the error seems to be the following:

    "terminate called after throwing an instance of 'VCF2TileDBException'
    what(): VCF2TileDBException : Incorrect cell order found - cells must be in column major order. Previous cell: [ 0, 59000037 ] current cell: [ 0, 59000037 ]."

    The .g.vcf.gz and .g.vcf.gz.tbi were from successful HaplotypeCaller workflows in the same workspace.

    I've shared the workspace with [email protected]

    Please help. Thank you!

  • KateN Cambridge, MA Member, Broadie, Moderator admin

    Thank you for sharing the workspace. What is the name of the workspace, as well as the submission ID and workflow ID where you encountered this error message? I'd like to take a look.

  • Hi Kate,
    Submission ID: 9eccd9a2-acea-4a91-95a5-e23b5b977d79
    workflow ID: 9895e17d-ad7a-4cfd-b2e7-91a622cfc593

    call-ImportGVCFs/shard-0

    Thanks in advance.
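
    For anyone locating a shard's files from these identifiers, the path in the original error message suggests a bucket layout of bucket, submission ID, workflow name, workflow ID, call, shard. The Python sketch below builds that prefix. The function name is hypothetical, and reading the first identifier as the submission ID and the second as the workflow ID is an assumption based on that one path.

```python
def shard_directory(bucket, submission_id, workflow_name, workflow_id, call_name, shard):
    """Build the gs:// prefix for one shard of a scattered call.

    The layout is inferred from the path in the error message at the top
    of this thread; it is an assumption, not a documented contract.
    """
    return (
        f"gs://{bucket}/{submission_id}/{workflow_name}/{workflow_id}"
        f"/call-{call_name}/shard-{shard}/"
    )
```

    Listing the resulting prefix with gsutil ls should show the shard's log files and any outputs that were written.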

  • Hi Kate, @KateN

    I ran joint discovery again, this time both with and without NA12878.
    Identical errors appeared in both submissions, but in fewer shards in the one with NA12878.

    They are in the same workspace.
    The one without NA12878 is
    submission ID: 1608440a-ae3f-4178-8189-acefde749273
    workflow ID: fbf3118d-fcc3-439c-bd02-83f665942fd1

    and the one without NA12878 is
    submission ID: a769f972-59f1-4249-a8cf-0dc71f3d910e
    workflow ID: 352c0855-4539-4cf0-a987-6bfc6bc58c23

    The submission mentioned in the March 22 comments has been deleted.

    Thanks in advance.

  • KateN Cambridge, MA Member, Broadie, Moderator admin

    I'm so sorry I never got back to you on this earlier; it seems to have slipped through the cracks, and I thank you for your patience.

    What is the name of the workspace these are in? Once I have that, I can take a look at the error messages and loop in a developer to help diagnose the issue.

  • workspace bucket ID: fc-e01bec48-cc89-4f63-bfb0-ae360cde563c

    Thanks a lot!

  • workspace: fccredits-curium-ecru-4604/F25-26-31-32_20180321_Germline-SNPs-Indels-GATK4-hg38_copy

    Thanks a lot!

  • Hi, Kate, @KateN
    I had the problem solved.

    Thanks a lot!

  • KateN Cambridge, MA Member, Broadie, Moderator admin

    Fantastic, I'm glad you were able to solve your problem. What ended up being the solution, if you don't mind reporting back?

  • Hi Kate, @KateN
    It was solved by moving one WGS sample from the same cohort to the WES workspace for joint discovery based on the requirements/expectations in the method document.

    Thank you very much for your excellent teamwork and support :smile:

  • Hi Kate,
    The problem appeared again. The samples were all the same, but this time the workflow was run for hg38 (the successful one was for b37).

    The message for the failed workflow is "failed to copy the following files: /mnt/local-disk/genomicsdb.tar" for shards in call-ImportGVCFs.

    The errors in the stderr for shards without genomicsdb.tar are similar: VCF2TileDBException : Incorrect cell order found - cells must be in column major order. Previous cell: [ 2, 1739753574 ] current cell: [ 2, 1739753574 ].

    The workspace, fccredits-curium-ecru-4604/F25-26-31-32_20180321_Germline-SNPs-Indels-GATK4-hg38_copy, is shared with [email protected]

    submission ID: 2b1843e4-69bc-4f4f-9e0e-38d66b1028dc
    workflow ID: 525c93fc-268f-485e-a844-bc4a4540fb96

    Please take a look. Thank you very much!

  • KateN Cambridge, MA Member, Broadie, Moderator admin

    Ah, I see. This is a different error than before.

    23:20:37.143 INFO  GenomicsDBImport - Importing batch 1 with 6 samples
    terminate called after throwing an instance of 'VCF2TileDBException'
      what():  VCF2TileDBException : Incorrect cell order found - cells must be in column major order. Previous cell: [ 3, 58999899 ] current cell: [ 3, 58999899 ].
    The most likely cause is unexpected data in the input file:
    (a) A VCF file has two lines with the same genomic position
    (b) An unsorted CSV file
    (c) Malformed VCF file (or malformed index)
    See point 2 at: https://github.com/Intel-HLS/GenomicsDB/wiki/Importing-VCF-data-into-GenomicsDB#organizing-your-data
    

    According to this message, the VCF file passed in for that particular shard (I'm looking at shard 0 for this one) has an issue. The message lists three possible causes, and you might want to check the URL it links to for additional information on troubleshooting your file. Unfortunately, as this isn't a GenomicsDBImport error, it is out of my area of expertise, so those would be my suggestions.
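
    As a starting point for checking cause (a) and the sort order, here is a minimal Python sketch that walks a VCF's records and reports repeated positions, positions that go backwards within a contig, and contigs that reappear after another contig. The function names are hypothetical. It does not validate the index file and it is not a GATK or GenomicsDB tool, so treat a clean result as a hint rather than proof.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_order_problems(lines):
    """Yield (kind, contig, position) for records that break sorted order.

    kind is "duplicate" when two consecutive records share a position,
    "unsorted" when a position decreases within a contig, and
    "contig_revisited" when a contig reappears after a different one.
    """
    seen_contigs = set()
    prev_contig = None
    prev_pos = None
    for line in lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        contig, pos = fields[0], int(fields[1])
        if contig != prev_contig:
            if contig in seen_contigs:
                yield ("contig_revisited", contig, pos)
            seen_contigs.add(contig)
        elif pos == prev_pos:
            yield ("duplicate", contig, pos)
        elif pos < prev_pos:
            yield ("unsorted", contig, pos)
        prev_contig, prev_pos = contig, pos
```

    To use it, open each input gVCF with open_vcf and pass the file handle to find_order_problems, printing whatever it yields. Any "duplicate" entry near the position named in the exception would be the first record to inspect.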
