Use all samples or group samples for GenotypeGVCFs?

I have created g.vcfs for all 48 of my samples. Half of them (24) are control individuals and the other half (24) are treated individuals. Should I run GenotypeGVCFs once on all my samples to generate a single raw VCF? Or twice, once for the treated samples and once for the control samples, producing two raw VCFs? Ultimately I am looking for differences between my groups.

Answers

  • rmf (Member)

    OK. So the suggestion is that I run GenotypeGVCFs over all my samples to generate one VCF file. What would the workflow then be to compare variants between my groups of samples?

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)
    edited January 2018

    @rmf
    Hi,

    You can use SelectVariants to split the case samples and the control samples into separate VCFs. (The reason we recommend first running GenotypeGVCFs on all samples together is described in the article I pointed to above.) Then you can run SelectVariants again with --discordance or --concordance to compare the two VCFs.

    -Sheila
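
The two SelectVariants steps described above can be sketched as command lines. This is only a minimal sketch, assuming GATK4-style syntax (`gatk SelectVariants` with `-V`, `-sn`, `-O`); GATK3 invocations look different. The helper functions `select_samples_cmd` and `compare_cmd`, the file names, and the sample names are all hypothetical. The sketch also adds `--exclude-non-variants` when subsetting, on the assumption that sites where none of the kept samples carry a variant allele should be dropped; without something like that, both per-group VCFs would list the same sites and a discordance comparison would likely come back empty.

```python
import shlex


def select_samples_cmd(joint_vcf, samples, out_vcf):
    """Build a SelectVariants command that keeps only the given samples.

    Assumes GATK4-style arguments; all file and sample names are hypothetical.
    """
    cmd = ["gatk", "SelectVariants", "-V", joint_vcf]
    for sample in samples:
        cmd += ["-sn", sample]  # one -sn per sample to keep
    # Assumption: drop sites that are non-variant in the retained samples.
    cmd += ["--exclude-non-variants", "-O", out_vcf]
    return cmd


def compare_cmd(vcf, other_vcf, out_vcf, mode="discordance"):
    """Build a SelectVariants command comparing `vcf` against `other_vcf`.

    mode="discordance": keep variants in `vcf` that are not in `other_vcf`.
    mode="concordance": keep variants in `vcf` that are also in `other_vcf`.
    """
    if mode not in ("discordance", "concordance"):
        raise ValueError("mode must be 'discordance' or 'concordance'")
    return ["gatk", "SelectVariants", "-V", vcf, f"--{mode}", other_vcf, "-O", out_vcf]


if __name__ == "__main__":
    # Hypothetical sample names; use the names from your own VCF header.
    controls = [f"control_{i:02d}" for i in range(1, 25)]
    treated = [f"treated_{i:02d}" for i in range(1, 25)]

    steps = [
        select_samples_cmd("all_samples.vcf.gz", controls, "controls.vcf.gz"),
        select_samples_cmd("all_samples.vcf.gz", treated, "treated.vcf.gz"),
        # Variants seen in the treated group but not in the controls.
        compare_cmd("treated.vcf.gz", "controls.vcf.gz", "treated_only.vcf.gz"),
        # Variants shared by both groups.
        compare_cmd("treated.vcf.gz", "controls.vcf.gz", "shared.vcf.gz", mode="concordance"),
    ]
    for step in steps:
        print(shlex.join(step))  # print the commands for review rather than running them
```

The script only prints the commands so they can be checked against the documentation for your GATK version before anything is run. Swapping the two VCFs in the discordance step would give the variants seen only in the control group.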
