# Explanation of strand bias detected in GATK

Member Posts: 9

Hi,

I have seen the definition of strand bias on this site (below) but I need a little clarification. Does the FS filter (a) highlight instances where reads are only present on a single strand and contain a variant (as may occur toward the end of exome capture regions) or does it (b) specifically look for instances where there are reads on both strands but the variant allele is disproportionately represented on one strand (as might be indicative of a false positive), or does it (c) do both?

I had thought it did (b) but have encountered some disagreement.

"How much evidence is there for Strand Bias (the variation being seen on only the forward or only the reverse strand) in the reads? Higher SB values denote more bias (and therefore are more likely to indicate false positive calls)."

(b) is correct. Note that Strand Bias annotation is not the same as Fisher Strand, so you cannot make assumptions from one to the other.

Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT
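
To make case (b) concrete, here is a minimal sketch of how a Fisher Strand style score can be computed: a two-sided Fisher's exact test on the 2x2 table of ref/alt reads by forward/reverse strand, reported as a Phred-scaled p-value. The function names, the tie tolerance and the p-value floor are assumptions made for illustration; GATK's own implementation may pre-process the table differently, so do not expect these numbers to match the FS values in a VCF exactly.

```python
from math import comb, log10


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test p-value for a 2x2 strand table."""
    ref_total = ref_fwd + ref_rev
    alt_total = alt_fwd + alt_rev
    fwd_total = ref_fwd + alt_fwd
    n = ref_total + alt_total
    if n == 0:
        return 1.0

    denom = comb(n, fwd_total)

    def prob(x):
        # Hypergeometric probability of seeing x forward-strand ref reads
        # when the row and column totals are held fixed.
        return comb(ref_total, x) * comb(alt_total, fwd_total - x) / denom

    p_observed = prob(ref_fwd)
    lowest = max(0, fwd_total - alt_total)
    highest = min(ref_total, fwd_total)
    # Sum every table that is at most as likely as the observed one.
    # The small tolerance (an illustrative choice) guards against float ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def fisher_strand(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Phred-scaled p-value: higher means stronger evidence of strand bias."""
    p_value = fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev)
    # Floor the p-value (illustrative choice) so log10 never sees zero.
    return max(0.0, -10.0 * log10(max(p_value, 1e-320)))


# Case (a): every read, ref and alt, is on the forward strand.
print(fisher_strand(10, 0, 10, 0))
# Case (b): ref reads on both strands, alt reads only on the forward strand.
print(fisher_strand(10, 10, 10, 0))
```

In this sketch the case (a) table scores zero, because when no reads at all come from one strand the alt allele is not distributed any differently from the ref allele. Only the case (b) table, where the ref reads cover both strands and the alt reads do not, produces a clearly elevated score.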

• Member Posts: 9

So for the older SB annotation, would the answer (a, b or c) be different?

Theoretically, it is (b). But we've stopped using the SB annotation because we never got good results with it...

Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

• Member Posts: 9

So (final point) FS is a filter that I can use with exome data? Again, I believed so but have encountered differing opinions.

Yes, please see our best practices documentation.

Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

• Member Posts: 9

I had done so, but needed the added security of a human telling me.

Thanks a lot for all the help.

That's assuming @ebanks is human

Geraldine Van der Auwera, PhD

• Member Posts: 9

Cyborgs are fine too!

• Member Posts: 10

@ebanks said:
Theoretically, it is (b). But we've stopped using the SB annotation because we never got good results with it...

Could you describe the problems with SB in more detail? Are they related to false positives or false negatives? I think SB was able to eliminate a lot of strand-biased SNPs. I was really surprised to see so many seemingly false SNPs called by a newer GATK version. I created VCFs with both a newer version (without SB) and an older version (with SB). It seems the newer version is calling SNPs whose alternate bases are supported by only one strand, and these SNPs are near indels.
Thank you.

• Member Posts: 17

Hello,

I too have been wondering whether FS is the same as SB. I would like to filter out the SNPs called by UnifiedGenotyper in my VCF file that show strand bias. Would this be done using FS? If so, what cutoff value would separate SNPs that do NOT show strand bias from those that do? I have searched a lot for this myself but have had no luck.

Thanks,
Sinan
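
For anyone landing here with the same question, below is a minimal sketch of flagging records by their FS annotation. It assumes plain-text, tab-separated VCF lines with FS present in the INFO column. The default cutoff of 60.0 follows GATK's generic hard-filtering recommendation for SNPs; treat it as a starting point to tune on your own data, not a definitive threshold. The function names and the example records are hypothetical.

```python
def fs_from_info(info):
    """Return the FS value from a VCF INFO string, or None if it is absent."""
    for field in info.split(";"):
        if field.startswith("FS="):
            return float(field[len("FS="):])
    return None


def flag_strand_biased(vcf_lines, max_fs=60.0):
    """Yield (chrom, pos, fs, is_biased) for each record in vcf_lines.

    max_fs is an assumed cutoff; records without FS are never flagged.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        cols = line.rstrip("\n").split("\t")
        fs = fs_from_info(cols[7])  # INFO is the 8th VCF column
        yield cols[0], int(cols[1]), fs, fs is not None and fs > max_fs


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t250.0\t.\tDP=40;FS=2.3;MQ=60.0",
    "1\t2000\t.\tC\tT\t90.0\t.\tDP=35;FS=75.1;MQ=58.0",
]
for record in flag_strand_biased(example):
    print(record)
```

In a real pipeline the same kind of expression is normally applied with GATK's VariantFiltration tool instead of a custom script, which records the result in the FILTER column of the output VCF.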