99.76% of reads failing DuplicateReadFilter for targetome analysis

nitinCelmatix NYC Member
edited February 2016 in Ask the GATK team

I am using GATK HaplotypeCaller (HC) to identify variants in a target region of about 23 kb with very deep sequencing. During the HC run I get a message that 99.76% of reads are failing DuplicateReadFilter. This means that most of my reads are being thrown out, so I am not getting correct variant calls.

First, is GATK HC an appropriate tool to call variants in such a small region with deep sequencing (more than 100X)?
Second, how can I rectify this error of 99% reads failing DuplicateReadFilter?
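
For context, DuplicateReadFilter drops reads whose SAM FLAG has the duplicate bit (0x400) set, so the percentage in the message reflects how many reads were marked as duplicates upstream. A minimal sketch, assuming the records are available as SAM text lines; the function name and example records are hypothetical, for illustration only:

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: PCR or optical duplicate


def duplicate_fraction(sam_lines):
    """Return the fraction of alignment records with the duplicate bit set."""
    total = 0
    duplicates = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        total += 1
        if flag & DUPLICATE_FLAG:
            duplicates += 1
    return duplicates / total if total else 0.0


# Hypothetical records, made up for this example.
example = [
    "@HD\tVN:1.5\tSO:coordinate",
    "read1\t99\tchr1\t100\t60\t50M\t=\t200\t150\t*\t*",
    "read2\t1123\tchr1\t100\t60\t50M\t=\t200\t150\t*\t*",
    "read3\t1024\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read4\t0\tchr1\t300\t60\t50M\t*\t0\t0\t*\t*",
]

print(duplicate_fraction(example))
```

In very deep targeted data, many independent fragments can share the same start and end positions, so a high duplicate fraction here does not by itself mean the library is bad.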


Best Answer


  • Thanks Shlee, this helps a lot.
    Also, is HC the only step where I should use the option "-drf DuplicateRead", or should I use it in other steps of the pipeline as well?


  • shlee Cambridge Member, Broadie ✭✭✭✭✭

    I would recommend using this option in all steps downstream of duplicate marking.
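
One way to apply the option consistently is to add it to each downstream command from a single helper. The sketch below is only an illustration: the helper name, the file names, and the GATK 3-style arguments other than "-drf DuplicateRead" are assumptions, not taken from this thread.

```python
def with_duplicates_kept(gatk_args):
    """Return a copy of a GATK argument list that ends with '-drf DuplicateRead'.

    If the option pair is already present, the copy is returned unchanged.
    """
    args = list(gatk_args)
    for i in range(len(args) - 1):
        if args[i] == "-drf" and args[i + 1] == "DuplicateRead":
            return args
    return args + ["-drf", "DuplicateRead"]


# Hypothetical HaplotypeCaller invocation; all paths are placeholders.
haplotype_caller = [
    "java", "-jar", "GenomeAnalysisTK.jar",
    "-T", "HaplotypeCaller",
    "-R", "reference.fasta",
    "-I", "sample.dedup.bam",
    "-L", "targets.interval_list",
    "-o", "sample.vcf",
]

# Print the command instead of running it, so it can be reviewed first.
print(" ".join(with_duplicates_kept(haplotype_caller)))
```

The same helper could wrap the argument list of any other step that runs after duplicate marking, which keeps the setting from being forgotten in one of them.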
