How to keep unique sample IDs when combining GVCF files?

Hello,
I am working with RNA-seq data, and I need to get SNP calls for multiple samples (12). I first tried following the Best Practices method with HaplotypeCaller and then merging my VCF files. However, I realized that when I do this, any site that is not a variant in all of my samples is marked as missing data for the non-variant samples. This is a problem because I need to know which of these samples are actually missing data and which match the reference. I don't think the GVCF mode of HaplotypeCaller is completely supported for RNA-seq yet, but a paper doing similar work to mine used it, and it seemed to work well for them. Because of this, I gave it a try, but I keep running into the same problem: when I combine my .g.vcf files, all of my samples merge. I need to make a combined VCF file in which all of my sample IDs remain unique. Is there a way to do this? Thank you very much for your help, and I'm sorry if this has been asked before. I have done a lot of searching but can't seem to find this question.

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @lfall
    Hi,

    The GVCF workflow for RNA-seq data has not yet been validated (as far as I know), but I think in this case, it would be worth trying it out. Just make sure to validate your results at the end :smile:

    For the sample name issue, you can simply change the sample name in the GVCF manually.

    Let us know how things work out!

    -Sheila
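
For anyone landing here later, the manual rename suggested above could be scripted roughly as follows. This is only a sketch, not an official GATK method. It assumes that the samples merge because every GVCF carries the same sample name in its #CHROM header line, that each GVCF is uncompressed plain text, and that each file holds exactly one sample column. The function name rename_sample and the sample name "sample_01" are made up for illustration.

```python
def rename_sample(lines, new_name):
    """Yield VCF/GVCF lines with the single sample column renamed.

    Hypothetical helper: only the '#CHROM' header line is changed;
    meta lines ('##...') and records pass through untouched.
    """
    for line in lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            # Fixed columns are CHROM POS ID REF ALT QUAL FILTER INFO FORMAT,
            # so a single-sample file has exactly 10 tab-separated fields.
            if len(fields) != 10:
                raise ValueError("expected exactly one sample column")
            fields[9] = new_name
            line = "\t".join(fields) + "\n"
        yield line


# Small in-memory demonstration with a made-up GVCF fragment.
demo = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample\n",
    "chr1\t100\t.\tA\t<NON_REF>\t.\t.\tEND=200\tGT:DP\t0/0:30\n",
]
renamed = list(rename_sample(demo, "sample_01"))
print(renamed[1], end="")
```

To apply it to a real file, open the input and output with open() and pass the input handle as lines, writing each yielded line to the output. If the file has an index, it would need to be regenerated afterwards. Giving each sample its own SM value in the read groups before calling variants avoids the rename altogether.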
