GenotypeGVCFs variant IDs
I am trying to use GenotypeGVCFs to perform joint genotyping on 16 samples. Each sample was sequenced twice, on two different machines, so I actually have 32 read sets. I called variants on each read set with HaplotypeCaller to produce GVCFs, and I am now trying to combine these into a single multi-sample VCF that contains information for every variant locus across the cohort.
However, because the two read sets for each sample share the same sample name, GenotypeGVCFs seems to collapse them, so only 16 samples are recorded in my output VCF. I tried specifying names for the inputs in the format --variant:name input1.g.vcf with both GenotypeGVCFs and CombineGVCFs, but got the same result: half the samples were missing from the output. I know this is possible with CombineVariants, but that tool will not take GVCF input. Is it possible to specify names for the variant inputs when using GenotypeGVCFs?
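To make the collision concrete: the sample name of a GVCF is the column header that follows FORMAT on the `#CHROM` line, so two files from the same sample carry identical names regardless of what the files themselves are called. The following is only a minimal sketch of one possible workaround, giving each read set a unique name before combining. It assumes uncompressed, single-sample GVCFs, and the helper names and the `sampleA_run1`/`sampleA_run2` labels are hypothetical, not part of GATK.

```python
def sample_names(vcf_lines):
    """Return the sample names found on the #CHROM header line."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            # Columns 1-9 are fixed (CHROM..FORMAT); the rest are sample names.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def rename_single_sample(vcf_lines, new_name):
    """Yield the lines of a single-sample VCF/GVCF with the sample renamed."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 10:
                raise ValueError("expected exactly one sample column")
            fields[9] = new_name
            line = "\t".join(fields) + "\n"
        yield line


# Hypothetical header shared by both read sets of one sample.
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\n"
run1 = ["##fileformat=VCFv4.2\n", HEADER]
run2 = ["##fileformat=VCFv4.2\n", HEADER]

# Before renaming, the two inputs are indistinguishable by sample name.
collide = sample_names(run1) == sample_names(run2)

# After renaming, each read set has its own name.
run1_renamed = list(rename_single_sample(run1, "sampleA_run1"))
run2_renamed = list(rename_single_sample(run2, "sampleA_run2"))
```

In practice the lines would be read from and written to files rather than lists, and any index would need to be regenerated afterwards. Established tools such as Picard's RenameSampleInVcf or bcftools reheader serve the same purpose and also handle compressed files.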
I appreciate your help. Many thanks in advance.