
SNP strand bias

Hi,

I'm currently trying to filter out SNPs that are not supported by reads on both the forward and reverse strands. To do this, I checked the FS value provided in the VCF and used a cutoff of 60 to exclude biased SNPs.

However, the cutoff seems a bit too high, since I still got a SNP where the mutated allele is supported by 7 forward and 0 reverse reads. Its FS value is around 40.

So my question is: is 60 really a reasonable cutoff to use? And is there an option to write the read counts used to calculate FS into the VCF output?

Thanks!
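To make the relationship between strand counts and FS concrete, here is a minimal sketch. FS is a Phred-scaled p-value from Fisher's exact test on the 2x2 table of ref/alt counts by forward/reverse strand. The function names and the example ref counts below are hypothetical, and GATK's own implementation may differ in details (for example, how it handles very deep sites), so the numbers this produces are not guaranteed to match the FS values in a GATK VCF.

```python
from math import comb, log10


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact p-value for the table
    [[ref_fwd, ref_rev], [alt_fwd, alt_rev]]."""
    row_ref = ref_fwd + ref_rev
    row_alt = alt_fwd + alt_rev
    col_fwd = ref_fwd + alt_fwd
    total = row_ref + row_alt
    denom = comb(total, col_fwd)

    def prob(x):
        # Hypergeometric probability of seeing x forward ref reads
        # when all row and column totals are held fixed.
        return comb(row_ref, x) * comb(row_alt, col_fwd - x) / denom

    p_obs = prob(ref_fwd)
    lo = max(0, col_fwd - row_alt)
    hi = min(row_ref, col_fwd)
    # Sum every table that is no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


def strand_bias_phred(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Phred-scale the p-value: larger means stronger evidence of strand bias."""
    p = fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev)
    return -10 * log10(max(p, 1e-300))  # clamp to avoid log10(0)


if __name__ == "__main__":
    # Hypothetical site: ref reads balanced, alt reads 7 forward / 0 reverse.
    print(round(strand_bias_phred(10, 10, 7, 0), 2))
```

Because the score depends on the ref counts and the total depth as well as the alt counts, the same 7/0 alt split can give quite different values at different sites, which is one reason a single fixed cutoff may let some one-sided calls through.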

Answers

  • Max (Member)

    Hi Geraldine, thanks for your answer! I will try experimenting with the values.

    Yes, allele depth including the strand information. For example, the mutated allele is supported by 4 reads on the + strand and 18 reads on the - strand.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    I see, so basically you'd have a secondary, strandedness-based stratification of allele depth. I see the point, but that could quickly get cumbersome in the presence of multiple alleles. Hmm. Well, it's not on our current roadmap, but we'd certainly be happy to look at a patch if someone were to implement that as a new annotation.

  • Max (Member)

    That's true. However, one could also use something like the Samtools DP4 value, which gives the number of 1) forward ref alleles; 2) reverse ref alleles; 3) forward non-ref alleles; 4) reverse non-ref alleles.

    Then it would be limited to those 4 values and could help if there is only a single non-ref allele.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Fair point, that would definitely help prevent ballooning annotations. Like I said, we're happy to look at a patch if anyone wants to implement this.
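
For anyone who already has a DP4-style annotation in their VCF (for example from a Samtools/bcftools-based caller), the four-value layout discussed above can be read out and checked directly. This is only a sketch: the helper names and the example INFO string are hypothetical, and it assumes the order described in the thread (ref forward, ref reverse, non-ref forward, non-ref reverse).

```python
def parse_dp4(info):
    """Return (ref_fwd, ref_rev, alt_fwd, alt_rev) from a VCF INFO string,
    or None if the record carries no DP4 tag."""
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "DP4":
            parts = value.split(",")
            if len(parts) != 4:
                raise ValueError(f"DP4 should have 4 values, got: {value!r}")
            return tuple(int(p) for p in parts)
    return None


def alt_on_single_strand(dp4, min_alt_reads=1):
    """True if the non-ref allele has enough reads but all on one strand."""
    _ref_fwd, _ref_rev, alt_fwd, alt_rev = dp4
    enough_reads = (alt_fwd + alt_rev) >= min_alt_reads
    return enough_reads and min(alt_fwd, alt_rev) == 0


if __name__ == "__main__":
    # Made-up INFO field resembling the 7 forward / 0 reverse case above.
    counts = parse_dp4("DP=29;DP4=10,12,7,0;MQ=60")
    print(counts, alt_on_single_strand(counts))
```

A hard "zero reads on one strand" rule like this is stricter than an FS cutoff, so it may be more useful as a flag for manual review than as an automatic filter.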
