Separating reads in AD into different strands

gshieh3 (taiwan, Member)

Hello,
Is there a way to split the read counts in AD (Allele Depths by sample) by strand, 5' and 3'?
For example, AD is currently reported as REF:ALT1:ALT2, etc.
Can I get AD reported as REF_5':REF_3':ALT1_5':ALT1_3':ALT2_5':ALT2_3'?

Best Answer

Answers

  • gshieh3 (taiwan, Member)

    Hello Sheila,
    Thanks for your kind help!
    I saw the StrandBiasBySample annotation in the GATK source code,
    but how can I get its values (reference-forward, reference-reverse, alternate-forward, and alternate-reverse) in the VCF files when I run UnifiedGenotyper directly?
    Should I add certain arguments?

  • gshieh3 (taiwan, Member)
    edited August 2014

    Hello Sheila,
    I just saw the same problem reported by another user:
    http://gatkforums.broadinstitute.org/discussion/4456/strandbiasbysample-fisherstrand-annotation

    After reading your response there, I think I should use HaplotypeCaller instead of UnifiedGenotyper.
    However, judging from that user's result, I would get only one set of alternate-forward and alternate-reverse counts even though there are multiple alternate alleles.

    So, can I split alternate-forward and alternate-reverse into per-allele read counts, such as ALT1_5':ALT1_3':ALT2_5':ALT2_3'?

  • Sheila (Broad Institute; Member, Broadie admin)

    @gshieh3

    Hi,

    Unfortunately, what you are asking for is what I mentioned as "inconsistencies for multi-allelic sites". The StrandBiasBySample annotation is not working properly for sites where there is more than one alternate allele present.

    I will put in a request to have it fixed.

    -Sheila

  • Sheila (Broad Institute; Member, Broadie admin)

    @gshieh3

    If you can, please post the lines you get for multi-allelic sites when you run HaplotypeCaller with the StrandBiasBySample annotation. This will help us with debugging, and it will confirm whether the issue we know of is the same one you are seeing.

    Thanks,
    Sheila

  • gshieh3 (taiwan, Member)
    edited August 2014

    I used the command below:

    nohup /NA/jdk1.7.0_67/bin/java -jar /NA/GATK-3.2-2/GenomeAnalysisTK.jar -T HaplotypeCaller --max_alternate_alleles 8 -dcov 1000 -A FisherStrand -A AlleleBalance -A BaseCounts -A StrandBiasBySample -R /NA/gshieh3/files/human_g1k_v37.fasta -I /NA/gshieh3/GATK_Target/03-3812T/recal.bam -D /NA/gshieh3/files/dbsnp_138.b37.vcf -o /NA/gshieh3/GATK_Target/03-3812T/03_3812T.snp.indel.raw.HC.target.vcf

    and got the SB information in the output.

  • Sheila (Broad Institute; Member, Broadie admin)

    @gshieh3

    Hello,

    I have an update.

    For multi-allelic sites, only the most likely alternate allele is used for calculating SB. I do not know if that will change any time soon, because it may impact the way downstream analyses work.

    In the future, SB may include all alternate alleles, but they will be summed up, not represented as separate values.

    As for your request, we are working on a new annotation (different from SB) that will contain the counts of all alleles.

    -Sheila

  • gshieh3 (taiwan, Member)

    Thank you for your answer!

    I'm interested in the new annotation with counts for all alleles.
    Could you let me know once the update is released?

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Best thing to do is check the release notes when the next release comes out. We'll make an announcement on the forum, the blog and Twitter (@gatk_dev) when that happens.
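
For anyone who wants to pull out the multi-allelic lines requested earlier in this thread and look at their SB values, here is a minimal Python sketch. It is not part of GATK. It assumes the SB sample field holds four comma-separated counts in the order mentioned above (reference-forward, reference-reverse, alternate-forward, alternate-reverse), it reads only the first sample column, and the names `parse_sb` and `multiallelic_sb` as well as the demo records are made up for illustration.

```python
# Sketch only: helper names and demo records are hypothetical, not GATK output.
SB_LABELS = ("ref_fwd", "ref_rev", "alt_fwd", "alt_rev")


def parse_sb(format_col, sample_col):
    """Return the four SB counts for one sample as a dict, or None if absent."""
    keys = format_col.split(":")
    values = sample_col.split(":")
    if "SB" not in keys:
        return None
    idx = keys.index("SB")
    # Trailing FORMAT fields may be omitted from a sample column.
    if idx >= len(values):
        return None
    parts = values[idx].split(",")
    if "." in parts:
        return None  # missing value
    if len(parts) != len(SB_LABELS):
        raise ValueError("expected 4 SB values, got %d" % len(parts))
    return dict(zip(SB_LABELS, (int(p) for p in parts)))


def multiallelic_sb(vcf_lines):
    """Yield (chrom, pos, alt_alleles, sb) for records with more than one ALT allele.

    Only the first sample column is read.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns
        alts = fields[4].split(",")
        if len(alts) > 1:
            yield fields[0], int(fields[1]), alts, parse_sb(fields[8], fields[9])


if __name__ == "__main__":
    # Made-up records for illustration only.
    demo = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
        "1\t100\t.\tA\tG\t50\t.\t.\tGT:AD:SB\t0/1:10,8:6,4,5,3",
        "1\t200\t.\tC\tT,G\t60\t.\t.\tGT:AD:SB\t1/2:0,7,9:0,0,4,3",
    ]
    for record in multiallelic_sb(demo):
        print(record)
```

To run it on a real file, pass an open file handle instead of the demo list, e.g. `multiallelic_sb(open("calls.vcf"))`, where `calls.vcf` stands in for your own output path. Because SB as described above carries only one alternate-forward/alternate-reverse pair, this cannot recover per-allele strand counts; it only makes the affected sites easy to list.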
