
Deduping AFTER BQSR

Jeff_Gaither · Math Biosciences Institute, Columbus, OH · Member

Is there any justification for putting off the MarkDuplicates step until after you've run BQSR? Based on my knowledge of bioinformatics, this seems like a dangerous idea, but it's the order recommended in this samtools workflow:
http://www.htslib.org/workflow/
Basically, they say to run
1. RealignerTargetCreator and IndelRealigner, then
2. BaseRecalibrator and PrintReads, then
3. MarkDuplicates
I apologize for asking questions about another institute's workflow, but I feel like you'd be the folks most likely to know whether doing things in this order has any advantages. Thanks for any help you can provide.
By the way, I'm working on single-cell DNA-seq cancer data.
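To make the ordering question concrete, here is a minimal Python sketch that writes the workflow described above as a plain list and derives the dedup-first alternative from it. The tool names come from the question; the list layout and the helper functions `move_dedup_first` and `runs_before` are hypothetical, for illustration only, and nothing here invokes the actual tools.

```python
# Step order as described in the question (tool names only, no command lines).
HTSLIB_ORDER = [
    "RealignerTargetCreator",
    "IndelRealigner",
    "BaseRecalibrator",
    "PrintReads",
    "MarkDuplicates",
]


def move_dedup_first(steps):
    """Return the same steps with MarkDuplicates moved to the front."""
    rest = [step for step in steps if step != "MarkDuplicates"]
    return ["MarkDuplicates"] + rest


def runs_before(steps, first, second):
    """True if `first` appears earlier than `second` in `steps`."""
    return steps.index(first) < steps.index(second)


# The alternative ordering the question has in mind: dedup before BQSR.
DEDUP_FIRST_ORDER = move_dedup_first(HTSLIB_ORDER)

if __name__ == "__main__":
    for label, order in (("as described", HTSLIB_ORDER),
                         ("dedup first", DEDUP_FIRST_ORDER)):
        print(f"{label}: " + " -> ".join(order))
```

The only difference between the two lists is where MarkDuplicates sits relative to BaseRecalibrator, which is exactly the point being asked about.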

Best Answer

Answers

  • Jeff_Gaither · Math Biosciences Institute, Columbus, OH · Member

    Thanks, Geraldine, for that extremely thorough, informative reply. My main reason for considering the samtools workflow for this task is that MuTect2 doesn't report a confidence for REF calls, so the only way to get one is to run HaplotypeCaller on the same sites, which often leads to inconsistencies between the two callers. But that's another question for another time. It's extremely helpful to know that the order of deduping and BQSR is probably not an important factor.

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie admin

    Ah, I see, fair enough. This is something we've been getting requests for (the ability to get confidence for REF calls in our somatic workflows), so I'll make sure we raise it with the developers. No guarantees, but they have been making a big push on improving MuTect2 for the GATK4 release.

  • Jeff_Gaither · Math Biosciences Institute, Columbus, OH · Member

    Thanks, and I'm glad you're aware of the (possibly small) demand for this feature. Just FYI, I think the capability to call REFs will definitely be in the GATK4 version of MuTect2 (at least indirectly):
    http://gatkforums.broadinstitute.org/gatk/discussion/9554/outputting-ref-calls-in-mutect

  • Geraldine_VdAuwera · Cambridge, MA · Member, Administrator, Broadie admin

    Hi @Jeff_Gaither, I talked to the developer and he tells me you can already use that trick (from the post you link) in the current version of GATK4 Mutect2. We're planning to release GATK4 into beta very soon so I encourage you to try it out.

  • Jeff_Gaither · Math Biosciences Institute, Columbus, OH · Member

    Excellent, @Geraldine_VdAuwera! Thank you for checking on this; I most definitely will.
