Question for FisherStrand

Seungchul (Seoul, Member)
edited January 2016 in Ask the GATK team


I am trying to interpret the FisherStrand annotation provided by GATK.

Using the variant annotation tools in GATK, I produced VCF files annotated with FisherStrand.

For my predicted mutations (about 1,100 nonsynonymous mutations in WES), I want to check the strands and remove any mutations supported by reads with strand bias.

In the documentation, I found the following:

1) the higher the output value, the more likely there is to be bias.
2) FisherStrand is best suited for low coverage situations. For testing strand bias in higher coverage situations, see the StrandOddsRatio annotation.

Based on these points, my questions are:

1) Mutations with a low FisherStrand value presumably have no strand bias, so I plan to remove the mutations with higher FS scores. However, I could not decide on a threshold, because the values vary widely from mutation to mutation. Do you have a suggested FisherStrand threshold that would let me remove the mutations with real strand bias?

2) I used whole exome sequencing data with an average coverage depth of >90X. Is FisherStrand still suitable for my data?
Or should I use another tool to check the strand ratio more accurately? (I would prefer to use FisherStrand, because producing this output already took me a lot of time.)

I look forward to your helpful comments.



  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    If you followed GATK best practices to call variants, SOR would be annotated for you. It would be better to use that. But if you do not have it and are unwilling to redo variant calling, then you are stuck with Fisher Strand.

    We suggest a value of 60 for filtering FS, but that is a very conservative value that will only filter out cases of extreme bias. To develop a filter that is more appropriate for your data, you should plot the values of FS across all of your data (not just the mutations of interest) to see what the distribution looks like.
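
One way to look at the FS distribution is to pull the `FS` key out of the INFO column (the eighth tab-separated column of each VCF record) and bin the values. The following is a minimal sketch assuming an uncompressed, tab-delimited VCF; the function names, the bin width, and the example records are hypothetical and chosen only for illustration.

```python
from collections import Counter

def fs_values(vcf_lines):
    """Yield the FS value of every record whose INFO column carries one."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue                      # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue                      # not a complete VCF record
        for entry in fields[7].split(";"):
            key, _, value = entry.partition("=")
            if key == "FS" and value:
                yield float(value)

def binned_counts(values, bin_width=5.0):
    """Count values per bin; keys are the lower edge of each bin."""
    bins = Counter(int(v // bin_width) for v in values)
    return {b * bin_width: bins[b] for b in sorted(bins)}

# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\t.\tDP=90;FS=0.000;MQ=60",
    "1\t200\t.\tC\tT\t50\t.\tDP=85;FS=2.500;MQ=60",
    "1\t300\t.\tG\tA\t50\t.\tDP=70;FS=61.300;MQ=58",
]
for lower_edge, count in binned_counts(fs_values(example)).items():
    print(f"{lower_edge:6.1f}  {'#' * count}")
```

For a real file, the lines could come from `open("your.vcf")` (a placeholder path), and the binned counts could be passed to any plotting tool. A long tail far from the bulk of the values is the kind of feature a cutoff is meant to separate.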

  • mjtiv (Newark, DE; Member)

    I am following up on your FS plotting suggestion. Could you explain in more detail how this information is retrieved from the VCF file, how it should be plotted, and what I should be looking for? Currently, we are using a cutoff of 30.0 and are interested in identifying ASE SNPs. The 30.0 cutoff was previously used in the lab on this project. Am I correct in assuming it makes our filtering more stringent than the looser cutoff of 60.0? I am trying to find a better explanation of what the best FS value is. In later steps we will filter further and use a binomial test to test for ASE.
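
On the stringency question, a small sketch can make the comparison concrete. It assumes the filter removes records with FS strictly greater than the cutoff; the helper name and the FS values are hypothetical.

```python
def failing(fs_scores, cutoff):
    """Return the FS values that a 'FS > cutoff' filter would remove."""
    return [fs for fs in fs_scores if fs > cutoff]

# Made-up FS values for illustration only.
scores = [0.0, 12.4, 31.0, 45.2, 75.8]

removed_at_30 = failing(scores, 30.0)
removed_at_60 = failing(scores, 60.0)
print(len(removed_at_30), len(removed_at_60))

# Everything removed at the higher cutoff is also removed at the lower one.
assert set(removed_at_60) <= set(removed_at_30)
```

Under that assumption, a cutoff of 30.0 removes everything a cutoff of 60.0 removes plus the records scoring between the two, so it is the more stringent filter.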

    Note: the link at the bottom of this page is broken (understanding hard-filtering recommendations)

  • shlee (Cambridge; Member, Broadie)

    Hi @mjtiv,

    It appears the link was a place-holder. I've updated it and it now directs to []. Thanks for pointing that out.

    I think you would find it helpful either to attend a GATK workshop or to go through one of the workshop hands-on tutorials on your own. I recommend the Variant Filtering and Evaluation tutorial. It will give you an opportunity to get a better feel for various annotations, including FS. You can find the worksheet and datasets by rummaging through this Google folder. Also, instructions for installing software are here.

    Most of the team is away, so I hope you don't mind waiting for more detailed advice. If you still have questions after studying the two resources above, please ping us on this thread.

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    I think this article is what you are looking for. Have a look at the Methods section, which has some other articles that may interest you.

