filter for strand bias in stranded RNAseq?

Hello,

I was wondering whether it makes sense to filter for strand bias, as stated in the Best Practices RNAseq variant calling guide, given that most of today's RNAseq data is strand-specific. I would actually expect high strand bias for variants, and would be suspicious of variants that do NOT show strand bias =)
...or did I get something wrong with the FisherStrand values?

Thank you

Answers

  • Sheila, Broad Institute (Member, Broadie admin)

    @michel
    Hi,

    You are right that the RNA molecules themselves are produced in a strand specific manner. However, you then generate cDNA for sequencing. The cDNA is sequenced from both strands; hence you expect no strand bias in the sequencing.

    -Sheila

  • AndresRibone (Member)

    @Sheila said:
    @michel
    Hi,

    You are right that the RNA molecules themselves are produced in a strand specific manner. However, you then generate cDNA for sequencing. The cDNA is sequenced from both strands; hence you expect no strand bias in the sequencing.

    -Sheila

    What about single end data?
    We have these RNAseq samples (Illumina stranded 2x100) in which ~70% of reads overlapped so we decided to fuse the mates creating single end reads (we used FLASH). The new reads had better mapping metrics. Should I disable the FisherStrand filter?

  • Sheila, Broad Institute (Member, Broadie admin)

    @AndresRibone
    Hi,

    ~70% of reads overlapped so we decided to fuse the mates creating single end reads

    Can you explain this a bit more? What do you mean by 70% of reads overlapped? Is that expected?

    Did you generate variants with HaplotypeCaller before fusing and after fusing? Can you post some example records that contain the FS annotation?

    Thanks,
    Sheila

  • AndresRibone (Member)
    edited June 2018

    @Sheila
    Hi,
    Apparently, 70% of the cDNA fragments (from which the paired reads were sequenced) were shorter than 200 bases. That's the only explanation for why the paired reads overlapped.

    Original fragment:
    5'-ATCGTGCATCTAGCTTAGCTAGCTCGTAGCTGTGCGATCGATCAGCTAGTAACCG-3'

    Reads (stranded):
    5'-ATCGTGCATCTAGCTTAGCTAGCTCGTAGCTGTGCGA-3'
    ...............3'-ATCGATCGAGCATCGACACGCTAGCTAGTCGATCATTGGC-5'

    "Synthetic" new stranded single end read:
    5'-ATCGTGCATCTAGCTTAGCTAGCTCGTAGCTGTGCGATCGATCAGCTAGTAACCG-3'

    By doing this merging, I got better mapping with STAR. So, all the GATK things I did so far were using the "mostly single end reads" alignment.

    Sorry, what is the FS annotation?

    Thanks in advance!
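
The merging illustrated in this post can be sketched in a few lines of Python. This is only a toy version that requires an exact overlap between the mates; it is not FLASH's algorithm, which tolerates mismatches and uses base qualities. The function names and the `min_overlap` default are hypothetical, and the two reads are copied from the diagram above.

```python
# Toy mate merging by exact suffix/prefix overlap.
# Assumption: both reads are given 5'->3' as sequenced; mismatches are not tolerated.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_mates(read1: str, read2: str, min_overlap: int = 10):
    """Merge two overlapping mates into one read, or return None.

    min_overlap is an arbitrary illustrative threshold.
    """
    # Bring mate 2 onto the same strand as mate 1.
    mate = reverse_complement(read2)
    # Try the longest candidate overlap first.
    for k in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        if read1[-k:] == mate[:k]:
            return read1 + mate[k:]
    return None


# Reads from the diagram in the post.
read1 = "ATCGTGCATCTAGCTTAGCTAGCTCGTAGCTGTGCGA"
# Mate 2 is drawn 3'->5' in the diagram, so reverse it to get 5'->3'.
read2_3to5 = "ATCGATCGAGCATCGACACGCTAGCTAGTCGATCATTGGC"
read2 = read2_3to5[::-1]

merged = merge_mates(read1, read2)
print(merged)
```

The sketch does not handle fragments shorter than a single read, where mate 2 would extend past the start of mate 1.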

  • Sheila, Broad Institute (Member, Broadie admin)

    @AndresRibone
    Hi,

    I see. Okay, well if the mapping is better with the fused reads, it may be best to stick with those.

    You can read more about FisherStrand here.

    -Sheila
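
For readers who, like the poster, are new to the FS annotation: it is a Phred-scaled p-value from Fisher's exact test on a 2x2 table of reference/alternate allele counts on the forward/reverse strands. The following is a rough sketch of that idea only. GATK's actual implementation applies its own adjustments to the table and the p-value, so the numbers here will not necessarily match a VCF; the function names and the `1e-300` floor are assumptions made for illustration.

```python
from math import comb, log10


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def fisher_strand(ref_fwd: int, ref_rev: int, alt_fwd: int, alt_rev: int) -> float:
    """Phred-scaled strand bias: higher means stronger evidence of bias."""
    p_value = fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev)
    # The floor avoids log10(0); its value is an arbitrary choice for this sketch.
    return max(0.0, -10 * log10(max(p_value, 1e-300)))


balanced = fisher_strand(10, 10, 10, 10)   # alleles spread evenly over strands
biased = fisher_strand(10, 10, 10, 0)      # alt allele seen on one strand only
one_strand = fisher_strand(20, 0, 10, 0)   # every read on the same strand
print(balanced, biased, one_strand)
```

Note that in this simplified version the test compares the strand distribution of the alternate allele with that of the reference allele. When every read at a site comes from the same strand, only one table is possible, so the p-value is 1 and the score is 0.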
