Extracting consensus variants from a VCF with 27 RNA-seq samples from the same genotype
Is there a tool or recommended best practice for generating a consensus set of variants from multiple samples of the same genotype? In short, I have 27 RNA libraries from different individuals, different tissues, and different sequencing lanes, but all from the same genotype. I analyzed them following the RNA-seq Best Practices, using HaplotypeCaller with the gVCF workflow (I understand this is unsupported, but it seemed the most appropriate).

The end result is a VCF with 27 sample "columns" for each SNP, one per sample (for instance root_1, root_2, leaf_1, leaf_2, etc.). I would like to generate a VCF with a single column that combines the information from all the samples. Based on the website descriptions, CombineVariants does not seem appropriate, and I cannot see a way to do it with SelectVariants.

This may be complex because, for a given SNP, different samples may carry different alleles even though they share a genotype, since they come from different individuals. I would prefer to select the most common variant if possible. My downstream goal is to generate a new reference genome for the genotype that all 27 samples are derived from.
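To make the "select the most common variant" idea concrete, here is a minimal sketch in plain Python of a majority vote over the sample columns. It is not a GATK tool or a recommended practice, only an illustration. The function names (`normalize_gt`, `consensus_record`, `consensus_vcf`) and the output sample name `consensus` are hypothetical. It assumes a tab-delimited multi-sample VCF with a `GT` key in the FORMAT column, treats phased and unphased genotypes as equivalent, ignores missing calls, and breaks ties by preferring the lowest allele indices. It also discards all other per-sample fields (DP, GQ, AD), so it takes no account of coverage or allele-specific expression, which can matter for RNA-seq data.

```python
from collections import Counter


def normalize_gt(gt):
    """Return alleles as a sorted tuple, or None if any allele is missing."""
    alleles = gt.replace("|", "/").split("/")
    if any(a in (".", "") for a in alleles):
        return None
    return tuple(sorted(alleles, key=int))


def consensus_record(line):
    """Collapse one multi-sample VCF record to a single GT-only sample column."""
    fields = line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")

    # Count each called genotype across all sample columns.
    counts = Counter()
    for sample in fields[9:]:
        parts = sample.split(":")
        gt = normalize_gt(parts[gt_index]) if gt_index < len(parts) else None
        if gt is not None:
            counts[gt] += 1

    if counts:
        # Highest count wins; ties are broken deterministically by genotype.
        best = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        consensus = "/".join(best)
    else:
        consensus = "./."  # no sample had a call at this site

    return "\t".join(fields[:8] + ["GT", consensus])


def consensus_vcf(lines, sample_name="consensus"):
    """Yield VCF lines with all sample columns replaced by one consensus column."""
    for line in lines:
        if line.startswith("##"):
            yield line.rstrip("\n")
        elif line.startswith("#CHROM"):
            yield "\t".join(line.rstrip("\n").split("\t")[:9] + [sample_name])
        elif line.strip():
            yield consensus_record(line)
```

A possible way to use it is to open the 27-sample VCF as text, pass the file handle to `consensus_vcf`, and write the yielded lines to a new file. Whether a simple vote over genotypes is appropriate here (as opposed to, say, weighting by depth or genotype quality) is a judgment call that depends on the data.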