MarkDuplicates policy on reads that are unaligned/unmapped?

Our BAMs are created according to the "lossless" alignment procedure described in this article. The procedure merges unaligned and aligned reads with Picard's MergeBamAlignment, so the resulting BAMs contain both mapped and unmapped reads. They are then sorted with SortSam, so the sort order in the header becomes:

@HD VN:1.6 SO:coordinate
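
In case it helps anyone reproduce the check, here is a minimal sketch of how the sort order can be read from the header text (for example, the output of `samtools view -H`). The function name `sort_order_from_header` is my own, not part of Picard or samtools, and it splits on any whitespace so it works whether the fields are separated by tabs or spaces.

```python
def sort_order_from_header(header_text):
    """Return the SO value from the @HD line, or None if it is absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Fields after the record type look like TAG:VALUE
            for field in line.split()[1:]:
                key, _, value = field.partition(":")
                if key == "SO":
                    return value
    return None


# Hypothetical usage with the header line shown above
print(sort_order_from_header("@HD\tVN:1.6\tSO:coordinate"))  # prints: coordinate
```

This only inspects what the header declares; it does not verify that the records are actually in that order.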

On such BAMs, is there a particular sort order that should be specified with MarkDuplicates to reduce memory usage or speed up processing? Can you recommend a value X for --ASSUME_SORT_ORDER X? It's not clear from the documentation how MarkDuplicates handles reads that don't have reliable position information in the BAM.

