
Problem in running five-dollar-genome-analysis-pipeline

The run failed:

X6031_50_2_8
Workflow ID: 2afe3bf7-9d41-4ae9-b934-11b064bebeef
Status: Failed
Total Run Cost: n/a
Submitted: July 31, 2018, 2:20 PM (1 hour ago)
Started: July 31, 2018, 3:39 PM (36 minutes ago)
Ended: July 31, 2018, 3:44 PM (32 minutes ago)
Outputs: None
Workflow Log: workflow.2afe3bf7-9d41-4ae9-b934-11b064bebeef.log
Calls:
Failed: germline_single_sample_workflow.CreateSequenceGroupingTSV
Failed: germline_single_sample_workflow.flowcell_unmapped_bams
Failed: germline_single_sample_workflow.ScatterIntervalList

Here is my metadata file:

entity:participant_id disease gender age
MORL-X6031_50 Otoscope M 53

entity:sample_id participant_id sample_type ubam
X6031_50_2_1 MORL-X6031_50 Hearing_Loss gs://fc-ec4fde6c-2875-418b-8eb5-942bc7a8f438/Finished/X6031_50_2_1.unmapped.bam
X6031_50_2_2 MORL-X6031_50 Hearing_Loss gs://fc-ec4fde6c-2875-418b-8eb5-942bc7a8f438/Finished/X6031_50_2_2.unmapped.bam
X6031_50_2_3 MORL-X6031_50 Hearing_Loss gs://fc-ec4fde6c-2875-418b-8eb5-942bc7a8f438/Finished/X6031_50_2_3.unmapped.bam
X6031_50_2_4 MORL-X6031_50 Hearing_Loss gs://fc-ec4fde6c-2875-418b-8eb5-942bc7a8f438/Finished/X6031_50_2_4.unmapped.bam
X6031_50_2_5 MORL-X6031_50 Hearing_Loss gs://fc-ec4fde6c-2875-418b-8eb5-942bc7a8f438/Finished/X6031_50_2_5.unmapped.bam
X6031_50_2_6 MORL-X6031_50 Hearing_Loss gs://fc-ec4fde6c-2875-418b-8eb5-942bc7a8f438/Finished/X6031_50_2_6.unmapped.bam
X6031_50_2_7 MORL-X6031_50 Hearing_Loss gs://fc-ec4fde6c-2875-418b-8eb5-942bc7a8f438/Finished/X6031_50_2_7.unmapped.bam
X6031_50_2_8 MORL-X6031_50 Hearing_Loss gs://fc-ec4fde6c-2875-418b-8eb5-942bc7a8f438/Finished/X6031_50_2_8.unmapped.bam

membership:sample_set_id sample_id
MORL-X6031_50_2 X6031_50_2_1
MORL-X6031_50_2 X6031_50_2_2
MORL-X6031_50_2 X6031_50_2_3
MORL-X6031_50_2 X6031_50_2_4
MORL-X6031_50_2 X6031_50_2_5
MORL-X6031_50_2 X6031_50_2_6
MORL-X6031_50_2 X6031_50_2_7
MORL-X6031_50_2 X6031_50_2_8


Answers

  • KateN, Cambridge, MA (Member, Broadie, Moderator, admin)

    Hello @Gongshing,

    Unfortunately, your workflows failed because of the Cromwell 34 upgrade in yesterday's release. Normally, any jobs running at the time of a release are restarted automatically, but a change in Cromwell prevented that this time. Simply re-launch your workflows manually, and call caching will reuse any work they completed before the release.

    We are very sorry we were unable to better mitigate the impact of this release by forewarning you. You should have also received an email from Tiffany which may have a bit more detail than I gave here. Let us know if you need any additional help getting things restarted.

  • I ran it again today with some modifications, but it still does not work.

    fccredits-uranium-cobalt-1573/MORLBGIFIVEDOLLAR
    submission id: fc-635cecae-6fc4-485d-988e-cba87499ecf8

    I shared with [email protected]

  • KateN, Cambridge, MA (Member, Broadie, Moderator, admin)

    I'm not seeing that submission ID in that workspace. Could you double check that's the correct information?

  • Sorry, that was not the submission ID.

    Here you are!
    dd48d658-922c-44af-adab-232066122ebd

  • thibault, Broad Institute (Member, Broadie, Moderator, Dev, admin)

    Hello Gongshing.

    If you look at one of the workflows in that submission ...

    https://portal.firecloud.org/#workspaces/fccredits-uranium-cobalt-1573/MORLBGIFIVEDOLLAR/monitor/dd48d658-922c-44af-adab-232066122ebd/56e1cd58-caff-477a-8a36-f7598b549949

    ... you can click on Failures -> Show to see what the problem is.

    It can't find the files. They are not in the Workspace's Google Storage Bucket fc-635cecae-6fc4-485d-988e-cba87499ecf8 but I do see them in a sub-folder "Finished".

    Try updating your FOFN inputs file to include the full Google Storage paths to your inputs.
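The suggestion above (listing full Google Storage paths in the FOFN) can be sketched in Python. This is only an illustration: the helper name `to_full_gcs_paths`, the bucket ID `fc-your-bucket-id`, and the file names are placeholders, and the `Finished` prefix is taken from the sub-folder mentioned in the thread. Substitute your own bucket and file names.

```python
def to_full_gcs_paths(names, bucket, prefix=""):
    """Return gs:// URLs for bare file names; paths that are already full are kept."""
    base = f"gs://{bucket.strip('/')}"
    prefix = prefix.strip("/")
    if prefix:
        base = f"{base}/{prefix}"
    full = []
    for name in names:
        name = name.strip()
        if not name:
            continue  # skip blank lines in the FOFN
        if name.startswith("gs://"):
            full.append(name)  # already a full Google Storage path
        else:
            full.append(f"{base}/{name.lstrip('/')}")
    return full


if __name__ == "__main__":
    # Placeholder inputs; replace with the contents of your own FOFN.
    names = ["X6031_50_2_1.unmapped.bam", "X6031_50_2_2.unmapped.bam"]
    for path in to_full_gcs_paths(names, "fc-your-bucket-id", "Finished"):
        print(path)
```

Writing the printed lines back to the FOFN, one path per line, should let the workflow locate files that live in a sub-folder rather than at the top of the bucket.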

  • Please advise. Thanks

    Workspace: fccredits-uranium-cobalt-1573/BAM2UBAM
    Submission ID: f7c1cb77-5c33-4cb7-8801-dc077319d2cc

    This submission got further, but it still failed.

  • KateN, Cambridge, MA (Member, Broadie, Moderator, admin)

    As that is a different workspace, you will need to share it with us again. Using the share button in FireCloud, please share with [email protected].

  • Tiffany_at_Broad, Cambridge, MA (Member, Administrator, Broadie, Moderator, admin)

    Hi @Gongshing - if you go to the Failures section you will see several error messages for the SamToFastqAndBwaMemAndMba workflow.

    The first one tells you to check the error log for shard 1 in your bucket:

    Check gs://fc-8a55a0f8-df82-4b4a-a123-b7f66638bbb6/f7c1cb77-5c33-4cb7-8801-dc077319d2cc/germline_single_sample_workflow/456d9466-199b-442a-8165-f4d023842421/call-SamToFastqAndBwaMemAndMba/shard-1/stderr for more information. PAPI error code 5. 10: Failed to delocalize files: failed to copy the following files: "/mnt/local-disk/X6031_50_2_2.aligned.unsorted.bam

    When I go to shard 1 and look at the error log, it says there was a problem parsing the SAM file for X6031_50_2_2.aligned.unsorted.bam:

    Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. Not enough fields; File /dev/stdin; Line 1
    Line: TCTTACGGGATTTGTGGGATAGCATCAAAACGGCAAATGTTTGAGATATAGGAGTTTAAGAAAAGCAAAGTATATACCACGAATCC    [email protected]?FFEGGFFFFF;=FFFFFFFAGFGFFFFFFFFFFGFDFFFGFGFFFFFFFGFFDEFFFFFFFFCFCGGFFFFGGFFFFFD    NM:i:0  MD:Z:76 AS:i:76 XS:i:0
    

    I am now running the validate bam method to see if it gives us more information about the files.

    @EricB have you seen this sort of error before?
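For context on the "Not enough fields" message above: the SAM format requires 11 mandatory tab-separated columns per alignment line (QNAME through QUAL), and the line quoted in the log appears to begin with the read sequence, which suggests the leading columns were missing. The following Python sketch flags such lines in SAM text. The function name `find_short_sam_lines` and the sample records are made up for illustration; this is not how htsjdk itself is implemented.

```python
# QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL
MANDATORY_SAM_FIELDS = 11


def find_short_sam_lines(lines):
    """Yield (line_number, field_count) for alignment lines with fewer than 11 columns."""
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines such as @HD or @SQ
        count = len(line.split("\t"))
        if count < MANDATORY_SAM_FIELDS:
            yield number, count


if __name__ == "__main__":
    # Made-up records: one complete unmapped read and one truncated line.
    good = "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF"
    bad = "ACGT\tFFFF\tNM:i:0"
    for number, count in find_short_sam_lines(["@HD\tVN:1.6", good, bad]):
        print(f"line {number}: only {count} fields")
```

To check a real file, the SAM text could be produced with a tool such as `samtools view -h` and passed to the function line by line.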

  • Tiffany_at_Broad, Cambridge, MA (Member, Administrator, Broadie, Moderator, admin)

    Hi @Gongshing
    For your original submission, I looked further into the directory for each shard and noticed that shards 0 and 1, which correspond to files X6031_50_2_1 and X6031_50_2_2, do not appear to produce the aligned.unsorted.bam output, while it does appear to be generated for the other files (X6031_50_2_3.aligned.unsorted.bam and so on).

    I will have to consult with my colleagues next week on what we'd recommend doing about this, but hopefully, this helps in your investigation.

  • bshifaw (Member, Broadie, Moderator, admin)

    @Gongshing
    Running ValidateSam on your input files reported the following errors:

    ## HISTOGRAM    java.lang.String
    Error Type  Count
    ERROR:RECORD_OUT_OF_ORDER
    

    To remove this error, you'll want to run your files through SortSam, as indicated in this GitHub issue: https://github.com/zkamvar/read-processing/issues/11. Once a file is sorted, run it through ValidateSam again to make sure there aren't any other errors; it should then be ready for processing.

    Also, since this issue is not related to FireCloud, it has been moved to the GATK forum, where you may find the responses more useful.

    You may also find the following discussion useful as it takes you through the steps to get sequence data ready to be processed using ValidateSam
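The sort-then-revalidate steps described above can be sketched as a small Python helper that builds the two Picard command lines. Several things here are assumptions rather than details from the thread: the `picard.jar` location, the file names, the helper name `picard_sort_and_validate_cmds`, and the `queryname` sort order (chosen on the assumption that the inputs are unmapped BAMs; use `coordinate` if your data calls for it). The commands use Picard's `I=`/`O=` argument style.

```python
import shlex


def picard_sort_and_validate_cmds(input_bam, sorted_bam,
                                  picard_jar="picard.jar",
                                  sort_order="queryname"):
    """Build (not run) Picard SortSam and ValidateSamFile command lines."""
    sort_cmd = [
        "java", "-jar", picard_jar, "SortSam",
        f"I={input_bam}",
        f"O={sorted_bam}",
        f"SORT_ORDER={sort_order}",  # assumption: queryname order for unmapped BAMs
    ]
    validate_cmd = [
        "java", "-jar", picard_jar, "ValidateSamFile",
        f"I={sorted_bam}",
        "MODE=SUMMARY",  # print a histogram of error types, as in the output above
    ]
    return sort_cmd, validate_cmd


if __name__ == "__main__":
    # Hypothetical file names for illustration.
    sort_cmd, validate_cmd = picard_sort_and_validate_cmds(
        "X6031_50_2_2.unmapped.bam", "X6031_50_2_2.sorted.unmapped.bam")
    print(shlex.join(sort_cmd))
    print(shlex.join(validate_cmd))
```

The printed commands can be pasted into a shell, or each list can be passed to `subprocess.run(cmd, check=True)` on a machine with Java and Picard installed.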
