[GATK 4.0.1.2] unsorted bam in mutect pipeline running in tumor-only mode with make_bamout = true

Hello,

I need help with Mutect2. I tried to call somatic mutations using mutect2_multi_sample.wdl in tumor-only mode with make_bamout = true and scatter_count = 50.
The run failed for a small fraction of samples during the MergeBamOuts task. For those samples, the gathered BAM file appears to be unsorted, so indexing it failed.

Below is the stderr from a failed sample. Thank you!

[Mon Feb 19 22:57:09 CST 2018] picard.sam.GatherBamFiles done. Elapsed time: 0.29 minutes.
Runtime.totalMemory()=1533542400
Using GATK jar mutect2-gatk4.0.1.2/wgs/tumor-only/cromwell-executions/Mutect2_Multi/5a39fcf2-c68d-4a03-b868-648c0e307351/call-Mutect2/shard-68/Mutect2/ccb28ebf-07b7-40e6-b9ef-0ae57a399b36/call-MergeBamOuts/inputs/software/GATK/gatk-4.0.1.2/gatk-package-4.0.1.2-local.jar defined in environment variable GATK_LOCAL_JAR
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Xmx6000m -jar gatk-package-4.0.1.2-local.jar GatherBamFiles -I {shard-0.bamout} -I {shard-1.bamout} ... -I {shard-49.bamout} -O unfiltered.vcf.gz.out.bam
-R reference.fasta
[bam_index_core] the alignment is not sorted (K00000:00:H50000000:1:1105:27438:36024): 4613952 > 4613923 in 7-th chr
[bam_index_build2] fail to index the BAM file.
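
For readers unfamiliar with the indexer's complaint: a coordinate-sorted BAM is expected to have non-decreasing start positions within each reference sequence, and the message above reports a pair of positions that break that order. Below is a minimal Python sketch of such a check on toy records. The function name, record layout, and read names are hypothetical, and the sketch ignores unmapped reads and the ordering of contigs relative to each other; only the two positions are taken from the log.

```python
from typing import Iterable, Optional, Tuple

# Toy record: (read name, contig name, 1-based start position).
Record = Tuple[str, str, int]


def first_unsorted(records: Iterable[Record]) -> Optional[Tuple[Record, Record]]:
    """Return the first adjacent pair whose positions decrease on the same contig.

    Returns None when no such pair exists.
    """
    prev: Optional[Record] = None
    for rec in records:
        if prev is not None:
            _, prev_contig, prev_pos = prev
            _, contig, pos = rec
            # Only compare neighbours on the same contig; a contig change
            # is not checked in this simplified sketch.
            if contig == prev_contig and pos < prev_pos:
                return prev, rec
        prev = rec
    return None


# Positions 4613952 and 4613923 come from the log; everything else is made up.
reads = [
    ("readA", "7", 4613900),
    ("readB", "7", 4613952),
    ("readC", "7", 4613923),
]

bad = first_unsorted(reads)
if bad is not None:
    (_, _, prev_pos), (name, contig, pos) = bad
    print(f"not sorted ({name}): {prev_pos} > {pos} on contig {contig}")
```

On these toy records the check flags the readB/readC pair, which mirrors the shape of the `4613952 > 4613923` message.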

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @dayzcool
    Hi,

    Can you confirm this happens if you try running again? I need to check with the team if they are aware of this, but I have seen that sometimes re-running helps the issue go away :smile:

    -Sheila

  • dayzcool (Member)

    Hi @Sheila,
    Thank you for looking into it. I re-ran Mutect2 and got the same error for the same tumors.
    It occurred in 3 of ~50 tumors.

  • dayzcool (Member)

    Thank you, @Sheila! I'll report back. (though not a big fan of homework)

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    haha @dayzcool We are pretty sure you are savvy enough to handle it :smiley: Thanks for looking into it!

  • dayzcool (Member)

    Hi @Sheila, thank YOU for the kind help. Here's what I found.
    First, manual sorting and indexing works.
    Second, I did the homework. :wink: For context, I got this error from a different sample.

    [bam_index_core] the alignment is not sorted (HWI-D00008:700:C000000XX:1:1306:4790:41021): 66527505 > 66527490 in 7-th chr
    

    I found the read named in the error message and confirmed that the assembly regions overlap (the "7-th chr" is chr7).

    $ cat 0020-scattered.intervals
    7       4613934 66527481        +       .
    $ cat 0021-scattered.intervals
    7       66527482        128441029       +       .
    

    In addition, I wonder whether variants can also overlap when assembly regions overlap. (I guess not?)
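
To illustrate the boundary problem described above, here is a minimal Python sketch with toy data. It assumes (this is an inference, not something confirmed in the thread) that the bamout of shard 0020 contains a read starting past its interval end at 66527481, so that concatenating the shards in order puts it ahead of an earlier-starting read from shard 0021. The positions 66527505 and 66527490 come from the error message; the read names, the other positions, and the helper names `gather` and `is_sorted` are hypothetical.

```python
from typing import List, Tuple

# Toy read: (read name, start position), all on chr7.
Read = Tuple[str, int]


def gather(shards: List[List[Read]]) -> List[Read]:
    """Concatenate per-shard outputs in shard order, without re-sorting."""
    return [read for shard in shards for read in shard]


def is_sorted(reads: List[Read]) -> bool:
    """True when start positions never decrease from one read to the next."""
    return all(a[1] <= b[1] for a, b in zip(reads, reads[1:]))


# Shard 0020 covers 7:4613934-66527481; shard 0021 starts at 66527482.
shard_20 = [("r1", 66527400), ("r2", 66527505)]  # r2 starts past the interval end
shard_21 = [("r3", 66527490), ("r4", 66527600)]

merged = gather([shard_20, shard_21])
print(is_sorted(merged))  # False: 66527505 is followed by 66527490

# An explicit coordinate sort restores the order, consistent with the
# observation that manual sorting and indexing works.
fixed = sorted(merged, key=lambda read: read[1])
print(is_sorted(fixed))   # True
```

Each shard is sorted on its own here; the order only breaks where the two shards meet, which is why a plain concatenation is not enough when neighbouring shards can emit reads from overlapping regions.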

    Take care!

  • dayzcool (Member)

    @shlee, thanks for the suggestion! I don't think I fully understand the consequences of splitting a contiguous interval. I'd prefer not to take risks, given that most of my projects are not critically time-sensitive.

  • davidben (Boston; Member, Broadie, Dev)

    The fix has been merged and will go into the next release, which will be in a week or so. Thank you @dayzcool and @shlee!

  • dayzcool (Member)

    Thank YOU, @davidben! One anecdotal observation: in the experiment you suggested, the explicitly sorted BAM files were considerably smaller. Explicit sorting might be preferable even without this issue if the files are stored long term.
