BWA mem -M option

bwa mem has an -M flag that will:

Mark shorter split hits as secondary (for Picard compatibility).

However, my guess is Picard has since been updated and this is no longer required. Should bwa mem be run with or without the -M flag assuming we are using relatively up to date software?

Answers

  • danielecook Member
    edited December 2018

    Thank you very much

  • danielecook Member
    edited December 2018

    One other question @bhanuGandham. Would using the -M flag interfere with MarkDuplicates functionality in terms of marking duplicates?

    Thanks,
    Dan

  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin

    Hi @danielecook

    I do not believe that the -M option interferes with MarkDuplicates functionality. The -M option is in fact used to facilitate it: MarkDuplicates does not work with split alignments, hence the -M option to flag them as secondary.
    Hope this helps.

    Regards
    Bhanu
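
The secondary/split distinction discussed above is recorded in the SAM FLAG field. As a minimal sketch, assuming the standard SAM bit values (0x100 secondary, 0x400 duplicate, 0x800 supplementary): without -M, bwa mem reports the shorter split hit as supplementary, and with -M it reports it as secondary. The helper name `describe_flag` and the example FLAG values are hypothetical, chosen only for illustration.

```python
# SAM FLAG bits, as defined in the SAM specification.
REVERSE_STRAND = 0x10   # read aligned to the reverse strand
SECONDARY = 0x100       # not the primary alignment
DUPLICATE = 0x400       # PCR or optical duplicate
SUPPLEMENTARY = 0x800   # part of a chimeric (split) alignment


def describe_flag(flag):
    """Return the names of the bits of interest that are set in a SAM FLAG."""
    names = []
    if flag & SECONDARY:
        names.append("secondary")
    if flag & DUPLICATE:
        names.append("duplicate")
    if flag & SUPPLEMENTARY:
        names.append("supplementary")
    return names


# Hypothetical FLAGs for the shorter part of a split read on the reverse strand.
flag_without_M = SUPPLEMENTARY | REVERSE_STRAND  # 2064
flag_with_M = SECONDARY | REVERSE_STRAND         # 272

print(describe_flag(flag_without_M))
print(describe_flag(flag_with_M))
```

Checking these bits on a few records is a quick way to see which convention an existing BAM file follows.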

  • naokoff Member
    Hello! I am a little confused about this topic.
    I am trying to process my non-human (bird) paired-end WGS data with GATK4 on our local server.
    (I am completely new to GATK.)
    I have uBAM files generated with FastqToSam and MarkIlluminaAdapters.

    Pipelines like processing-for-variant-discovery-gatk4.wdl and PairedEndSingleSampleWf.wdl use BWA mem options like "-K 100000000 -p -v 3 -t 16 -Y", but don't use the -M option, even though they run MarkDuplicates downstream.
    Why don't they need it?

    Besides, I couldn't find the -K option in the BWA manual.
    Could you explain why these new options are appropriate?
    Or is this new set of options appropriate for my case?
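
For readers comparing the two styles of invocation, the option string quoted above can be written out as an argument list. This is only a sketch: the option descriptions in the comments paraphrase the usage text that `bwa mem` prints, while the function name `bwa_mem_args`, the reference path `ref.fasta`, and the use of `/dev/stdin` for the reads are assumptions made for illustration.

```python
import shlex


def bwa_mem_args(reference, threads=16, mark_secondary=False):
    """Build a bwa mem argument list from the options quoted in this thread."""
    args = [
        "bwa", "mem",
        "-K", "100000000",   # input bases processed per batch, independent of thread count
        "-p",                # smart pairing: treat the input as interleaved paired-end reads
        "-v", "3",           # verbosity level
        "-t", str(threads),  # number of threads
        "-Y",                # soft-clip supplementary alignments
    ]
    if mark_secondary:
        args.append("-M")    # mark shorter split hits as secondary
    # Assumed inputs: a hypothetical reference path, reads arriving on standard input.
    args += [reference, "/dev/stdin"]
    return args


print(shlex.join(bwa_mem_args("ref.fasta")))
print(shlex.join(bwa_mem_args("ref.fasta", mark_secondary=True)))
```

Keeping -M as an explicit switch makes it easy to align the same reads both ways and compare the resulting flags.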
  • AdelaideR Unconfirmed, Member, Broadie, Moderator admin

    Hi @naokoff

    The -K option is an added parameter because the SamToFastq tool in GATK pipes its output from one process to another. Piping is basically: program 1 does a thing | program 2 takes the output of program 1 as its input and does a thing. The "|" symbol is the pipe.

    The -K setting in this case lets the second process know how many lines to accept at one time, to reduce overloading the memory and accidentally skipping or overwriting part of the file.

    The -M option does not need to be set in this pipeline because AlignmentPipeline.wdl also marks the duplicates. You can see the code here

    So, the -K option is set internally to speed up the pipeline so it does not need to be set by the user.
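
The "program 1 | program 2" idea described above can be sketched with two child processes connected by a pipe. This is an illustration only: the two tiny Python one-liners are hypothetical stand-ins for SamToFastq and bwa mem, not the real tools, and `run_pipeline` is a made-up name.

```python
import subprocess
import sys

# Stand-in for program 1: writes five lines to standard output.
producer_code = "for i in range(5): print('read', i)"
# Stand-in for program 2: counts the lines arriving on standard input.
consumer_code = "import sys; print(sum(1 for _ in sys.stdin))"


def run_pipeline():
    """Connect producer stdout to consumer stdin, like `producer | consumer`."""
    producer = subprocess.Popen(
        [sys.executable, "-c", producer_code],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", consumer_code],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close our copy so the producer sees a broken pipe if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.strip()


print(run_pipeline())
```

The consumer starts reading as soon as data is available, so no intermediate file is written between the two steps.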
