Explanation of strand bias detected in GATK

Hi,

I have seen the definition of strand bias on this site (below) but I need a little clarification. Does the FS filter (a) highlight instances where reads are only present on a single strand and contain a variant (as may occur toward the end of exome capture regions) or does it (b) specifically look for instances where there are reads on both strands but the variant allele is disproportionately represented on one strand (as might be indicative of a false positive), or does it (c) do both?

I had thought it did (b) but have encountered some disagreement.

"How much evidence is there for Strand Bias (the variation being seen on only the forward or only the reverse strand) in the reads? Higher SB values denote more bias (and therefore are more likely to indicate false positive calls)."

Best Answer

  • ebanks (Broad Institute; Posts: 687; Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Dev, GP Member)
    Answer ✓

    (b) is correct. Note that the Strand Bias (SB) annotation is not the same as Fisher Strand (FS), so you cannot carry assumptions from one over to the other.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT
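
As a rough illustration of option (b): Fisher Strand is, as its name suggests, based on Fisher's exact test applied to a 2x2 table of read counts (reference vs. alternate allele, forward vs. reverse strand), reported on a Phred scale. The sketch below is a minimal Python version under that assumption. The function names are hypothetical, and this is not GATK's actual implementation, which may normalize or clamp values differently.

```python
from math import comb, log10


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for a 2x2 strand table."""
    ref_total = ref_fwd + ref_rev
    alt_total = alt_fwd + alt_rev
    fwd_total = ref_fwd + alt_fwd
    n = ref_total + alt_total
    denom = comb(n, fwd_total)

    def prob(a):
        # Hypergeometric probability that `a` of the forward reads carry the ref allele.
        return comb(ref_total, a) * comb(alt_total, fwd_total - a) / denom

    p_obs = prob(ref_fwd)
    lo = max(0, fwd_total - alt_total)
    hi = min(ref_total, fwd_total)
    # Sum over all tables at least as unlikely as the observed one.
    p = sum(prob(a) for a in range(lo, hi + 1) if prob(a) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def phred_scaled_strand_bias(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Phred-scaled p-value: higher means stronger evidence of strand bias."""
    p = fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev)
    return max(0.0, -10 * log10(max(p, 1e-300)))


# Case (a): every read is on the forward strand, so nothing can be compared.
one_strand_only = phred_scaled_strand_bias(10, 0, 10, 0)
# Case (b): both strands are covered, but the alt allele is forward-only.
alt_on_one_strand = phred_scaled_strand_bias(10, 10, 10, 0)
print(one_strand_only, alt_on_one_strand)
```

With reads on one strand only (case a), one column of the table is empty, so the test has nothing to compare and the score is 0. The score rises only when both strands are covered and the alternate allele sits mostly on one of them (case b).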

Answers

  • travisbickle34 (Posts: 9; Member)

    So for the older SB, would the answer (a, b, or c) be different?

  • ebanks (Broad Institute; Posts: 687; Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Dev, GP Member)

    Theoretically, it is (b). But we've stopped using the SB annotation because we never get good results with it...

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • travisbickle34 (Posts: 9; Member)

    So (final point) FS is a filter that I can use with exome data? Again, I believed so but have encountered differing opinions.

  • ebanks (Broad Institute; Posts: 687; Member, Administrator, GATK Dev, Broadie, Moderator, DSDE Dev, GP Member)

    Yes, please see our best practices documentation.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • travisbickle34 (Posts: 9; Member)

    I had done so, but needed the added security of a human telling me :)

    Thanks a lot for all the help.

  • Geraldine_VdAuwera (Posts: 8,056; Administrator, GATK Dev)

    That's assuming @ebanks is human :)

    Geraldine Van der Auwera, PhD

  • travisbickle34 (Posts: 9; Member)

    Cyborgs are fine too!

  • Hideo (Posts: 10; Member)

    @ebanks said:
    Theoretically, it is (b). But we've stopped using the SB annotation because we never get good results with it...

    Could you describe the problems with SB in more detail? Are they related to false positives or false negatives? I think SB was able to eliminate a lot of strand-biased SNPs. I was really surprised to see so many seemingly false SNPs called by a newer GATK version. I created VCFs with both a newer version (without SB) and an older one (with SB). It seems the newer version is calling SNPs whose bases are supported by only one strand, and these SNPs are near indels.
    Thank you.

  • sir2013 (Posts: 17; Member)

    Hello,

    I too have been wondering whether FS is the same as SB. I would like to filter out the SNPs called by Unified Genotyper in my VCF file that show strand bias. Would this be done using FS? If so, what cutoff value would separate SNPs that are NOT strand biased from those that are? I have searched a lot for this myself but have had no luck.

    Thanks,
    Sinan

  • Geraldine_VdAuwera (Posts: 8,056; Administrator, GATK Dev)

    Hi @sir2013,

    For now the only documentation we have on this is here:
    http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_annotator_FisherStrand.html

    In the near future we're going to try to add more information to explain what annotations mean and how they are calculated, but right now we don't have any bandwidth to go into the details, sorry.

    Geraldine Van der Auwera, PhD
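
For the filtering question above, a minimal sketch of thresholding on the FS annotation might look like the following. It assumes tab-separated VCF data lines whose INFO column (the eighth) carries an `FS=` entry. The helper names are hypothetical, and the default cutoff of 60.0 is an assumption based on commonly cited hard-filtering suggestions for SNPs, not a value given in this thread, so check the current best practices documentation before relying on it.

```python
def fs_value(vcf_line):
    """Return the FS value from a VCF data line's INFO column, or None if absent."""
    info = vcf_line.rstrip("\n").split("\t")[7]
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "FS":
            return float(value)
    return None


def passes_fs(vcf_line, max_fs=60.0):
    """Keep a record unless its FS exceeds max_fs (assumed cutoff, see above)."""
    fs = fs_value(vcf_line)
    return fs is None or fs <= max_fs


# Two made-up records for illustration only.
records = [
    "1\t1000\t.\tA\tG\t50.0\t.\tDP=40;FS=3.2;MQ=60.0",
    "1\t2000\t.\tC\tT\t45.0\t.\tDP=35;FS=75.1;MQ=58.0",
]
kept = [r for r in records if passes_fs(r)]
print(len(kept))
```

Records without an FS entry are kept here rather than dropped; that is a design choice for the sketch, and a stricter filter could treat a missing value as a failure instead.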
