Effect of using a ReduceReads bam or full bam on GATK 2.7 HaplotypeCaller+VQSR?

What is the expected effect of running GATK 2.7 HaplotypeCaller+VQSR on a WGS 30x bam versus the same bam processed through ReduceReads beforehand? Should one expect exactly the same variants to be called on both files, or a small difference between them?
Do HaplotypeCaller or VQSR treat the input differently if it comes from a full WGS bam or a WGS reduced bam?

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi there,

    The ReduceReads tool is designed to not have any effect on the variant calls that are made from reduced data. The callers themselves do not distinguish between reduced and unreduced data.

    Note that for single-sample calling it is generally not necessary to reduce the data, so if you're only processing one sample you can save time by skipping RR.

  • So does one expect exactly the same variants to be called on both the original WGS 30x file and the ReduceReads WGS 30x file, or could there be a small difference between them?

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)
    Accepted Answer

    The calls won't be identical; you may see some marginal differences in annotation values, and perhaps some presence/absence of borderline calls that would be filtered out anyway. These effects are due to downsampling and can safely be ignored.
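
If you want to see for yourself how small the differences are, one option is to compare the two final call sets site by site. The following is only a minimal sketch, not part of GATK: the function names `load_calls` and `compare_calls` are made up for illustration, the two inline VCF snippets are toy data rather than real results, and it assumes plain-text, tab-delimited VCF records keyed on CHROM, POS, REF and ALT.

```python
import io


def load_calls(handle):
    """Map (chrom, pos, ref, alt) -> FILTER for each record in a VCF text stream."""
    calls = {}
    for line in handle:
        # Skip blank lines and all header lines (## metadata and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, filt = fields[:7]
        # Treat each ALT allele of a multi-allelic record as its own call.
        for allele in alt.split(","):
            calls[(chrom, int(pos), ref, allele)] = filt
    return calls


def compare_calls(full, reduced):
    """Summarise presence/absence and FILTER differences between two call sets."""
    shared = full.keys() & reduced.keys()
    return {
        "shared": len(shared),
        "only_full": sorted(full.keys() - reduced.keys()),
        "only_reduced": sorted(reduced.keys() - full.keys()),
        "filter_changed": sorted(k for k in shared if full[k] != reduced[k]),
    }


# Toy data (hypothetical, for illustration only).
HEADER = "##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
full_vcf = HEADER + "1\t100\t.\tA\tG\t50\tPASS\t.\n1\t200\t.\tC\tT\t12\tLowQual\t.\n"
reduced_vcf = HEADER + "1\t100\t.\tA\tG\t48\tPASS\t.\n"

full = load_calls(io.StringIO(full_vcf))
reduced = load_calls(io.StringIO(reduced_vcf))
diff = compare_calls(full, reduced)
print(diff)
```

For real files you would pass `open("full.vcf")` and `open("reduced.vcf")` (hypothetical file names) instead of the `io.StringIO` objects. If the answer above holds for your data, the `only_full` and `only_reduced` lists should be short and consist mostly of calls that do not pass filtering.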
