Picard MarkDuplicates with UMI

jonathan.barlev, Israel - INCPM, Member
  1. In MarkDuplicates, how is the newly added UMI option specified? The documentation (https://broadinstitute.github.io/picard/command-line-overview.html) refers to a BARCODE_TAG string option. Should this value be the name of the BAM tag used for storing the UMI (e.g., if the UMI tag is of the form bc:Z:ACTG, should this option be set to "bc")?

  2. Is there a standard BAM tag name for the UMI barcode (BX?)? It is my understanding that "BC" (and the corresponding "QT" tag) refers to the sample barcode. Is this correct?

Best Answer


  • jonathan.barlev, Israel - INCPM, Member

    Thanks, what's the standard quality tag corresponding to "BX"?

  • yfarjoun, Broad Institute, Dev

    Actually, let me fix that first answer: RX is the standard tag for the uncorrected UMI, and BX is the tag for the "corrected" UMI (if you have error correction).
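
To make the tag convention above concrete, here is a minimal Python sketch. It assumes BARCODE_TAG takes the two-letter tag name (as the question proposes), that the UMI is stored as a string-typed optional field such as RX:Z:ACTG, and that Picard is run from a local `picard.jar`. The helper names, file names, and jar path are hypothetical and only serve the illustration.

```python
def umi_from_sam_line(sam_line, tags=("BX", "RX")):
    """Return (tag, value) for the first UMI tag found on a SAM record, else None.

    Tags are tried in order, so a corrected UMI (BX) is preferred over the
    uncorrected one (RX) when both are present.
    """
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields start at column 12 and have the form TAG:TYPE:VALUE.
    optional = {}
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        optional[tag] = value
    for tag in tags:
        if tag in optional:
            return tag, optional[tag]
    return None


def markduplicates_args(in_bam, out_bam, metrics, barcode_tag="RX"):
    """Build a MarkDuplicates argument list with BARCODE_TAG set to the tag name.

    Assumption: picard.jar sits in the working directory.
    """
    return [
        "java", "-jar", "picard.jar", "MarkDuplicates",
        f"I={in_bam}",
        f"O={out_bam}",
        f"M={metrics}",
        f"BARCODE_TAG={barcode_tag}",
    ]


# Hypothetical single-end record carrying an uncorrected UMI in RX.
record = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tRX:Z:ACTG"
print(umi_from_sam_line(record))
print(" ".join(markduplicates_args("in.bam", "out.bam", "metrics.txt")))
```

The argument list can be handed to `subprocess.run` once the paths point at real files. If the UMIs have been error-corrected, passing `barcode_tag="BX"` would follow the convention described above.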

  • jonathan.barlev, Israel - INCPM, Member

    Am I correct that MarkDuplicates will not emit any read-level duplication rate statistics, e.g., a tag with the number of duplicates each read has?

    We use this number for determining when a sequencing error in the UMI barcode may have occurred. It would be great to have a "Number of Duplicates" read tag inserted (e.g., ND:i:7).

  • dekling, Broad Institute, Member, admin

    @jonathan.barlev. Thank you for your patience. We will answer your question shortly.

  • dekling, Broad Institute, Member, admin
    edited May 2016

    @jonathan.barlev. Thanks for your patience. Your suggestion is a good one. However, there are currently no plans to add this tag. If it is absolutely essential to your work, you are welcome to write a program to create this tag and submit a pull request. Thank you again and I hope this helps.
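
As a starting point for such a program, the following is a rough Python sketch of the requested "Number of Duplicates" tag, working on SAM text lines. It is not how MarkDuplicates identifies duplicates: it simply groups reads by reference name, mapped position, strand, and UMI, which ignores clipping and mate information. The ND tag name comes from the request above. Treating ND as the size of the group, including the read itself, is an assumption, as are the function names and the choice of RX as the UMI tag.

```python
from collections import Counter

REVERSE_STRAND = 0x10  # SAM FLAG bit: read aligned to the reverse strand


def duplicate_key(fields, umi_tag="RX"):
    """Simplified grouping key: (reference, position, is_reverse, UMI)."""
    umi = None
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        if tag == umi_tag:
            umi = value
    flag = int(fields[1])
    # Reads without the UMI tag get umi=None and are grouped by position only.
    return (fields[2], int(fields[3]), bool(flag & REVERSE_STRAND), umi)


def add_nd_tags(sam_lines, umi_tag="RX"):
    """Append ND:i:<n> to each alignment line; header lines pass through unchanged.

    Assumption: n is the number of reads sharing the key, the read itself included.
    """
    lines = [line.rstrip("\n") for line in sam_lines]
    keys = [
        None if line.startswith("@") else duplicate_key(line.split("\t"), umi_tag)
        for line in lines
    ]
    counts = Counter(key for key in keys if key is not None)
    tagged = []
    for line, key in zip(lines, keys):
        if key is None:
            tagged.append(line)  # header line
        else:
            tagged.append(f"{line}\tND:i:{counts[key]}")
    return tagged


# Hypothetical input: r1 and r2 share position and UMI, r3 has a different UMI.
reads = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tRX:Z:AAAA",
    "r2\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tRX:Z:AAAA",
    "r3\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tRX:Z:CCCC",
]
for line in add_nd_tags(reads):
    print(line)
```

A version meant for real data would read and write BAM through a library and would need a duplicate definition that matches the one MarkDuplicates uses, so the counts here should be read as an approximation.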

