Effect of using a ReduceReads bam or full bam on GATK 2.7 HaplotypeCaller+VQSR?

avilella Posts: 6 Member

What is the expected effect of running GATK 2.7 HaplotypeCaller+VQSR on a WGS 30x bam versus the same bam processed through ReduceReads beforehand? Should one expect exactly the same variants to be called on both files, or small differences between them?
Do HaplotypeCaller or VQSR treat the input differently depending on whether it comes from a full WGS bam or a reduced WGS bam?

Answers

  • Geraldine_VdAuwera Posts: 8,270 Administrator, GATK Dev admin

    Hi there,

    The ReduceReads tool is designed to not have any effect on the variant calls that are made from reduced data. The callers themselves do not distinguish between reduced and unreduced data.

    Note that for single-sample calling it is generally not necessary to reduce the data, so if you're only processing one sample you can save time by skipping ReduceReads (RR).

    Geraldine Van der Auwera, PhD

  • avilella Posts: 6 Member

    So should one expect exactly the same variants to be called on both the original WGS 30x file and the ReduceReads-processed WGS 30x file, or could there be small differences between them?

  • Geraldine_VdAuwera Posts: 8,270 Administrator, GATK Dev admin
    Answer ✓

    The calls won't be identical; you may see some marginal differences in annotation values, and perhaps some presence/absence of borderline calls that would be filtered out anyway. These effects are due to downsampling and can safely be ignored.

    Geraldine Van der Auwera, PhD
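
For readers who want to check this on their own data, here is a minimal sketch of one way to compare the call sets from the full and the reduced bam. It is not part of GATK and not from the thread: the helper names (`load_sites`, `passing`, `compare_calls`) and the toy records are made up for illustration. It assumes plain tab-separated VCF text and keys each record on CHROM, POS, REF and ALT exactly as written, with no allele normalization or splitting of multi-allelic records.

```python
from typing import Dict, Iterable, List, Tuple

Site = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def load_sites(lines: Iterable[str]) -> Dict[Site, str]:
    """Map each VCF data line to its FILTER value, keyed by site."""
    sites: Dict[Site, str] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        key = (fields[0], int(fields[1]), fields[3], fields[4])
        sites[key] = fields[6]  # column 7 of a VCF record is FILTER
    return sites


def passing(sites: Dict[Site, str]) -> Dict[Site, str]:
    """Keep only records whose FILTER column is PASS."""
    return {k: v for k, v in sites.items() if v == "PASS"}


def compare_calls(a: Dict[Site, str], b: Dict[Site, str]) -> Dict[str, List[Site]]:
    """Report sites shared by both call sets and sites unique to each."""
    return {
        "shared": sorted(a.keys() & b.keys()),
        "only_a": sorted(a.keys() - b.keys()),
        "only_b": sorted(b.keys() - a.keys()),
    }


if __name__ == "__main__":
    header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
    # Made-up records standing in for the two VCFs.
    full_vcf = [
        header,
        "1\t1000\t.\tA\tG\t812.3\tPASS\t.",
        "1\t2000\t.\tC\tT\t31.2\tLowQual\t.",
        "2\t500\t.\tG\tA\t455.0\tPASS\t.",
    ]
    reduced_vcf = [
        header,
        "1\t1000\t.\tA\tG\t809.9\tPASS\t.",
        "2\t500\t.\tG\tA\t451.7\tPASS\t.",
    ]
    full, reduced = load_sites(full_vcf), load_sites(reduced_vcf)
    print("all records:", compare_calls(full, reduced))
    print("PASS only:  ", compare_calls(passing(full), passing(reduced)))
```

In the toy data the only discordant site is one that fails filtering, so the PASS-only comparison shows no differences. That mirrors the kind of borderline presence/absence described above, but real data should be inspected rather than assumed to behave this way.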
