Rounds of BQSR in GATK3 GenericPreProcessingWorkflow

I reviewed the GATK3 WDL preprocessing workflow on GitHub and found something strange.

I recall that in GATK3 we needed to run BQSR twice and then use PrintReads to apply the results. However, the workflow seems to run BQSR only once before running PrintReads. Can anyone confirm whether we actually need to run it twice? Thanks!

Answers

  • SkyWarrior (Turkey, Member ✭✭✭)

    As far as I remember, you only need multiple rounds of BQSR if you do not have a reliable set of known variants for the genome you are working with. For human samples we have those datasets, so we run BaseRecalibrator once and then apply the result with PrintReads.

  • AdelaideR (Member, Broadie, Moderator)

    @johnma Does @SkyWarrior's comment help clear up the issue?

  • AdelaideR (Member, Broadie, Moderator)

    @johnma

    This GitHub issue on GATK4 also gives a bit of information about running it twice.

    So, running BaseRecalibrator a second time is not necessary if you only want the BAM with recalibrated base qualities, but it is needed if you want to create before-vs-after plots to see how well the recalibration performed.
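
To make the difference concrete, here is a minimal sketch in Python that just builds the GATK3 command lines for both cases. It is not taken from the WDL workflow: the jar path and the file names (`recal.table`, `post_recal.table`, `recal.bam`, `recalibration_plots.pdf`) are placeholders assumed for illustration, and the helper `bqsr_commands` is hypothetical. The second BaseRecalibrator pass and AnalyzeCovariates are only added when plots are requested.

```python
from typing import List

# Assumed location of the GATK3 jar; adjust for your installation.
GATK3 = ["java", "-jar", "GenomeAnalysisTK.jar"]


def bqsr_commands(ref: str, bam: str, known_sites: List[str],
                  with_plots: bool = False) -> List[List[str]]:
    """Build GATK3 BQSR command lines (hypothetical helper; output names are placeholders)."""
    known: List[str] = []
    for vcf in known_sites:
        known += ["-knownSites", vcf]

    # Pass 1: model the systematic errors and write the recalibration table.
    first_pass = (GATK3 + ["-T", "BaseRecalibrator", "-R", ref, "-I", bam]
                  + known + ["-o", "recal.table"])
    commands = [first_pass]

    if with_plots:
        # Pass 2 (optional): re-run on the fly with the first table applied,
        # only to get an "after" table for the before/after comparison.
        second_pass = (GATK3 + ["-T", "BaseRecalibrator", "-R", ref, "-I", bam]
                       + known + ["-BQSR", "recal.table", "-o", "post_recal.table"])
        plots = GATK3 + ["-T", "AnalyzeCovariates", "-R", ref,
                         "-before", "recal.table", "-after", "post_recal.table",
                         "-plots", "recalibration_plots.pdf"]
        commands += [second_pass, plots]

    # Apply the first-pass table to write the recalibrated BAM.
    commands.append(GATK3 + ["-T", "PrintReads", "-R", ref, "-I", bam,
                             "-BQSR", "recal.table", "-o", "recal.bam"])
    return commands


if __name__ == "__main__":
    for cmd in bqsr_commands("ref.fasta", "input.bam", ["dbsnp.vcf"], with_plots=True):
        print(" ".join(cmd))
```

Note that in both cases PrintReads uses the table from the first pass; the second pass only feeds the plots, which is consistent with the workflow running BaseRecalibrator once.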
