Simple explanation of MarkDuplicates

I am having a hard time understanding how MarkDuplicates works. The MarkDuplicates documentation describes it this way: “The MarkDuplicates tool works by comparing sequences in the 5 prime positions of both reads and read-pairs in a SAM/BAM file.” I don’t understand what “5 prime positions” means in this statement. Also, what does “of both reads and read-pairs” mean in this context? If you could explain this with an example, I would really appreciate it.

