Bait bias

Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)
edited March 2016 in Dictionary

Bait bias (single bait bias or reference bias artifact) is a type of artifact that affects data generated through hybrid selection methods.

These artifacts occur during or after the target selection step. They show up as substitution rates that are skewed, typically higher, at sites with one base on the reference/positive strand compared with sites that have the complementary base on that strand. For example, a G>T artifact introduced during the target selection step might result in a higher (G>T)/(C>A) substitution rate at sites with a G on the positive strand (and C on the negative) than at sites with the reverse arrangement (C on the positive strand, G on the negative). This is known as the "G-Ref" artifact.
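
To make the comparison concrete, here is a minimal Python sketch of the idea: compute the substitution rate at sites with one reference base (e.g. G>T at G-ref sites), compute the rate of the complementary substitution at sites with the complementary reference base (C>A at C-ref sites), and look at the excess. The function name, argument names, the `floor` value, and the Phred-style score are all assumptions made for illustration; this is not the code of Picard or any other tool.

```python
import math


def bait_bias_summary(fwd_ref, fwd_alt, rev_ref, rev_alt, floor=1e-10):
    """Compare substitution rates between the two reference-strand contexts.

    Hypothetical helper for illustration only.

    fwd_ref, fwd_alt: reference-matching and substituted base counts at sites
        where the positive-strand reference base is the "from" base
        (e.g. G-ref sites, counting G and T observations).
    rev_ref, rev_alt: the same counts at sites with the complementary
        reference base (e.g. C-ref sites, counting C and A observations).
    floor: assumed lower bound on the excess rate so the log is defined.
    """
    fwd_total = fwd_ref + fwd_alt
    rev_total = rev_ref + rev_alt
    if fwd_total == 0 or rev_total == 0:
        raise ValueError("need at least one observation in each context")

    fwd_rate = fwd_alt / fwd_total  # e.g. G>T rate at G-ref sites
    rev_rate = rev_alt / rev_total  # e.g. C>A rate at C-ref sites

    # Without reference bias the two rates should be about equal, so the
    # excess of one over the other is what points to a bait-bias artifact.
    excess = max(floor, fwd_rate - rev_rate)

    # Phred-style score: a lower score means a stronger apparent bias.
    qscore = -10 * math.log10(excess)
    return fwd_rate, rev_rate, excess, qscore


if __name__ == "__main__":
    # Made-up counts: 0.1% G>T at G-ref sites vs 0.01% C>A at C-ref sites.
    print(bait_bias_summary(999_000, 1_000, 999_900, 100))
```

With the made-up counts in the example, the excess rate is 0.0009, which corresponds to a score of roughly 30. When the two rates are equal, the excess is clamped to the floor and the score is high, meaning no apparent bias.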

Post edited by dekling

Comments

  • joneskm4 (NCI, Member)

    I think we are seeing this type of artifact in some of our exome sequencing data, but I'm a little unclear on what the actual cause is. What causes the substitution during the target selection step?

  • dekling (Broad Institute, Member, admin)

    Don't hold my feet to the fire on this, but I believe the errors are introduced to the bait from sample handling. Essentially, guanines on the bait sequence are sensitive to oxidation from extraction agents, heat, etc. This can cause some guanine nucleotides to become 8-oxoguanine (8-OxoG, OxoG) nucleotides. These modified guanines can basepair with T instead of C as would normally be expected. Thus, during PCR, this error is propagated. Since the G is sensitive to oxidation, you will likely see a higher frequency of G>A than C>T. Is this helpful?

  • aerijman (Member)
    Are "bait-bias artifacts" substitutions that are thought to have occurred in the PROBES that were used to fish out or ENRICH the DNA sample?
  • joneskm4 (NCI, Member)

    Circling back around on this because we are seeing this happen again, and I don't feel like I ever got a clear answer on what causes it. I also can't find much in the literature about it. Is the "G-ref" artifact caused by damage to the capture probes/baits? The description says it can happen "during or after the target selection step", and that a "G>T artifact during the target selection step" can cause it. But to my mind that doesn't really explain when and how the artifact is being introduced.

  • joneskm4 (NCI, Member)

    And I should clarify: what we are seeing are definitely G>T artifacts, not G>A/C>T, which are OxoG artifacts. Picard is flagging these samples as having low qscores for baitbias/G>T changes, too. So I think we are seeing this artifact; I just don't understand its origin.

  • joneskm4 (NCI, Member)

    After reading the pre-adapter bias documentation, it looks like oxidative damage can show up as G>T or C>A changes as well.

    So, how do you know if elevated G>T rates are OxoG artifacts, or G-ref artifacts, and what's the difference?
