
Error running "gatk/mutect2-gatk4"

@bshifaw @Tiffany_at_Broad @KateN,

Thanks a lot for your help. I have almost finished this analysis.

When I ran Mutect2 on the normal-tumor pair, I encountered this error for call #50 of Mutect2.M2:

"htsjdk.samtools.FileTruncatedException: Premature end of file: /f4653fc5-f4a9-4f7e-ab53-1725b639043f/PairedEndSingleSampleWorkflow/e25a51f7-139d-40b7-8a5a-e8b5a2417ef1/call-GatherBamFiles/example_tumor.bam"

However, example_tumor.bam was generated by gatk/pre-processing-b37-gatk4, which finished successfully. So I do not think this BAM file is truncated.

To make sure it was not a random GCP problem, I reran Mutect2. This time I got a similar but different error message at call #50 of Mutect2.M2:

"htsjdk.samtools.FileTruncatedException: Premature end of file: /27d5041a-60b4-4be2-b8e2-185547225d35/PairedEndSingleSampleWorkflow/08661c64-3b4b-44d1-a10d-8cd5e6dabf0a/call-GatherBamFiles/example_fibroblast.bam"

Again, I do not think "example_fibroblast.bam" is truncated.

Any suggestions?

Thanks,
Bo

Answers

  • bshifaw (moon), Member, Broadie, Moderator

    Perhaps gatk/pre-processing-b37-gatk4 was not successful (related post). Check the stdout and stderr to make sure there aren't any error messages in the pre-processing-b37-gatk4 method.
    The BAM being processed may also fail to meet certain criteria. You can check this by running a BAM validation method on it.

    If this doesn't help, share your workspace with [email protected] and tell us the name of the workspace.
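A quick first check for truncation, before running a full validation, is to look for the BGZF end-of-file marker. The SAM/BAM specification defines a fixed 28-byte empty block that a complete BAM ends with. The sketch below is not part of any FireCloud method; the function name `has_bgzf_eof` is made up for illustration, and it assumes the BAM is available as a local file (for example, the copy localized to the VM, which can differ from the copy in the bucket).

```python
import os

# 28-byte empty BGZF block that the SAM/BAM specification defines as the
# end-of-file marker of a complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    `has_bgzf_eof` is a hypothetical helper name, not an htsjdk or GATK API.
    A missing marker suggests the file was cut short while being written
    or copied; a present marker does not prove the rest of the file is valid.
    """
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to the last 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

If this returns False for the localized copy but True for the same file elsewhere, the copy was likely interrupted, for instance by the VM running out of disk.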

  • bigbadbo, Member, Broadie

    @bshifaw ,

    Thanks for your suggestion. I tried to run the validate BAM WDL, but it failed with the following message:

    The specified GCS path '‎gs://fc-7178e78d-efba-440a-9de1-0f0ffa4585df/27d5041a-60b4-4be2-b8e2-185547225d35/PairedEndSingleSampleWorkflow/08661c64-3b4b-44d1-a10d-8cd5e6dabf0a/call-GatherBamFiles/sample.bam‎' does not parse as a URI. Illegal character in scheme name at index 0: %E2%80%8Egs://fc-7178e78d-efba-440a-9de1-0f0ffa4585df/27d5041a-60b4-4be2-b8e2-185547225d35/PairedEndSingleSampleWorkflow/08661c64-3b4b-44d1-a10d-8cd5e6dabf0a/call-GatherBamFiles/sample.bam%E2%80%8E

    Any suggestions?

    By the way, we have already shared the workspace with FireCloud support. The workspace name is regev-ludwig/Ribo-seq.

    Thanks,
    Bo

  • Tiffany_at_Broad (Cambridge, MA), Member, Broadie, Moderator

    Thanks @bigbadbo, we will take a look this morning and get back to you.

  • Tiffany_at_Broad (Cambridge, MA), Member, Broadie, Moderator

    @bigbadbo can you double-check that it is shared with this group: [email protected]? Which submission_id did you see this error message with?

  • bshifaw (moon), Member, Broadie, Moderator

    Hey Bo,

    One of our developers stated:

    The original error - htsjdk.samtools.FileTruncatedException: Premature end of file would be completely unrelated. The followup comment about "Illegal character in scheme name" looks related on the surface - perhaps as a troubleshooting step the user copied-and-pasted from the UI table. BUT - the illegal characters "%E2%80%8E" don't match.

    This is a bug we are currently addressing. In the meantime, please avoid the illegal characters introduced by copying and pasting gs:// paths. The following post may help in identifying these characters: https://gatkforums.broadinstitute.org/firecloud/discussion/comment/50524#Comment_50524
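For reference, `%E2%80%8E` in the error message is the percent-encoded UTF-8 form of U+200E (LEFT-TO-RIGHT MARK), an invisible formatting character. A minimal Python sketch for spotting and stripping such characters from a pasted path is below. The function names and the example bucket are made up for illustration, and the filter (Unicode categories Cf and Cc) is an assumption about which characters are worth flagging, not a FireCloud rule.

```python
import unicodedata

# Unicode general categories treated as "invisible" here (an assumption):
# Cf = format characters such as U+200E, Cc = control characters.
INVISIBLE_CATEGORIES = ("Cf", "Cc")


def find_invisible_chars(text):
    """Return (index, codepoint, name) for each invisible character in text."""
    hits = []
    for index, char in enumerate(text):
        if unicodedata.category(char) in INVISIBLE_CATEGORIES:
            name = unicodedata.name(char, "UNKNOWN")
            hits.append((index, f"U+{ord(char):04X}", name))
    return hits


def clean_gcs_path(text):
    """Drop invisible characters and surrounding whitespace from a pasted path."""
    kept = [c for c in text if unicodedata.category(c) not in INVISIBLE_CATEGORIES]
    return "".join(kept).strip()


# Hypothetical pasted value with a left-to-right mark on each side.
pasted = "\u200egs://example-bucket/sample.bam\u200e"
for hit in find_invisible_chars(pasted):
    print(hit)
print(clean_gcs_path(pasted))
```

Retyping the `gs://` prefix by hand, or passing the value through something like `clean_gcs_path` before entering it in the data model, should avoid the URI parse error.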

  • bigbadbo, Member, Broadie

    @bshifaw

    I have finished running ValidateBam on the tumor sample. However, I got the following FireCloud error:

    Job ValidateBamsWf.ValidateBAM:0:1 exited with return code 4 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.

    This is the end of the stderr output:

    INFO 2018-07-24 20:07:58 SamFileValidator Validated Read 2,200,000,000 records. Elapsed time: 03:44:16s. Time for last 10,000,000: 45s. Last read position: /
    INFO 2018-07-24 20:08:56 SamFileValidator Validated Read 2,210,000,000 records. Elapsed time: 03:45:14s. Time for last 10,000,000: 57s. Last read position: /
    [Tue Jul 24 20:09:30 UTC 2018] picard.sam.ValidateSamFile done. Elapsed time: 225.81 minutes.
    Runtime.totalMemory()=16975396864
    To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
    Using GATK jar /gatk/gatk-package-4.0.6.0-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.0.6.0-local.jar ValidateSamFile --INPUT /cromwell_root/fc-7178e78d-efba-440a-9de1-0f0ffa4585df/f4653fc5-f4a9-4f7e-ab53-1725b639043f/PairedEndSingleSampleWorkflow/e25a51f7-139d-40b7-8a5a-e8b5a2417ef1/call-GatherBamFiles/sample_tumor.bam --OUTPUT sample_tumor.validation_.txt --MODE SUMMARY

    And this is the content of sample_tumor.validation_.txt:

    HISTOGRAM java.lang.String

    Error Type Count
    WARNING:MISSING_TAG_NM 2019110712

    So I guess my tumor BAM file should be OK. Am I right?

    Thanks,
    Bo

  • bshifaw (moon), Member, Broadie, Moderator
    edited July 26

    Hey @bigbadbo,
    It's always best to remove all errors and warnings. That being said:

    This is an alignment tag that is added by some but not all genome aligners, and is not used by the downstream tools that we care about, so you may decide to ignore this warning by adding IGNORE=MISSING_TAG_NM from now on when you run ValidateSamFile on this file.

    The above is quoted from a document describing how to validate SAM/BAM files.
    I'm having trouble viewing your workspace. Can you confirm it's shared with [email protected]?
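To make the "errors versus warnings" distinction concrete, here is a small Python sketch that reads a ValidateSamFile SUMMARY report like the one posted above and separates ERROR entries from WARNING entries. The function names are made up for illustration, and the parser assumes each entry is a `LEVEL:TYPE` token followed by a count, which matches the posted output but is not taken from the Picard documentation.

```python
def parse_validation_summary(text):
    """Map (level, type) to count for each entry in a SUMMARY-mode report.

    Assumes entries look like "WARNING:MISSING_TAG_NM 2019110712";
    header lines such as "Error Type Count" are skipped.
    """
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        level, _, kind = parts[0].partition(":")
        if level in ("ERROR", "WARNING") and kind:
            counts[(level, kind)] = int(parts[1])
    return counts


def has_errors(counts):
    """True if any entry is an ERROR rather than a WARNING."""
    return any(level == "ERROR" for level, _ in counts)


# The report text as posted in this thread.
report = """HISTOGRAM java.lang.String

Error Type Count
WARNING:MISSING_TAG_NM 2019110712
"""
summary = parse_validation_summary(report)
print(summary)
print("has errors:", has_errors(summary))
```

For the posted report, the only entry is the MISSING_TAG_NM warning, so nothing is flagged as an error.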

  • KateN (Cambridge, MA), Member, Broadie, Moderator

    @bigbadbo Hello, I just wanted to weigh in here on the sharing aspect. We are still unable to see your workspace, and I think it's because you shared it with our old support group ([email protected]) rather than our new one: [email protected]. Would you go into the workspace and share it with the new address, please? Unfortunately, we had to switch the name a few months ago, so I completely understand the confusion.

  • bigbadbo, Member, Broadie
    Accepted Answer

    @KateN @bshifaw , thanks a lot for your help!

    With respect to the Mutect2 error, I think I fixed it by increasing the allowed disk space.

    So I'm good now. Thanks!
