Use all samples or group samples for GenotypeGVCFs?

I have created GVCFs for all 48 of my samples. Half of them (24) are control individuals and the other half (24) are treated individuals. Should I run GenotypeGVCFs once on all my samples to generate a single raw VCF? Or twice, once for the treated samples and once for the control samples, producing two raw VCFs? Ultimately I am looking for differences between my groups.


Best Answer


  • rmf (Member)

    OK. So the suggestion is that I run GenotypeGVCFs on all my samples to generate one VCF file. What would the workflow then be for comparing variants between my groups of samples?

  • Sheila (Broad Institute; Member, Broadie)
    edited January 2018


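    You can use SelectVariants to split the case samples and control samples into separate VCFs. (The reason we recommend first running GenotypeGVCFs on all samples together is described in the article I pointed to above.) Then you can run SelectVariants again with --discordance or --concordance to compare the two: --discordance outputs the sites in the input VCF that are not in the comparison VCF, and --concordance outputs the sites present in both.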
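
    The workflow above can be sketched as a small Python helper that only builds the command lines, without running them. This is a rough illustration, not an official recipe: it assumes GATK4-style syntax (`gatk SelectVariants`, `--sample-name`, `--exclude-non-variants`, `-O`), and the file names and sample names (`ref.fasta`, `all_samples.vcf.gz`, `control_01`, ...) are hypothetical placeholders. GATK 3.x uses a different invocation, so check the flags against the SelectVariants documentation for your version.

    ```python
    import shlex
    from typing import List


    def select_samples_cmd(reference: str, joint_vcf: str,
                           samples: List[str], out_vcf: str) -> List[str]:
        """Build a SelectVariants command that keeps only `samples`.

        Assumes GATK4 flag names. --exclude-non-variants drops sites where
        none of the selected samples carries a variant allele, so each
        group VCF only lists sites that vary within that group.
        """
        cmd = ["gatk", "SelectVariants", "-R", reference, "-V", joint_vcf]
        for sample in samples:
            cmd += ["--sample-name", sample]
        cmd += ["--exclude-non-variants", "-O", out_vcf]
        return cmd


    def compare_cmd(reference: str, input_vcf: str, comparison_vcf: str,
                    out_vcf: str, mode: str = "discordance") -> List[str]:
        """Build a SelectVariants command with --discordance or --concordance."""
        if mode not in ("discordance", "concordance"):
            raise ValueError("mode must be 'discordance' or 'concordance'")
        return ["gatk", "SelectVariants", "-R", reference, "-V", input_vcf,
                f"--{mode}", comparison_vcf, "-O", out_vcf]


    # Hypothetical sample names; use the names from your VCF header instead.
    controls = [f"control_{i:02d}" for i in range(1, 25)]
    treated = [f"treated_{i:02d}" for i in range(1, 25)]

    commands = [
        # Step 1: split the joint-genotyped VCF by group.
        select_samples_cmd("ref.fasta", "all_samples.vcf.gz", controls, "control.vcf.gz"),
        select_samples_cmd("ref.fasta", "all_samples.vcf.gz", treated, "treated.vcf.gz"),
        # Step 2: sites seen in treated but not in control, then sites shared by both.
        compare_cmd("ref.fasta", "treated.vcf.gz", "control.vcf.gz",
                    "treated_only.vcf.gz", mode="discordance"),
        compare_cmd("ref.fasta", "treated.vcf.gz", "control.vcf.gz",
                    "shared.vcf.gz", mode="concordance"),
    ]

    for cmd in commands:
        print(shlex.join(cmd))
    ```

    Printing the commands rather than executing them makes it easy to review them before running anything. Swapping the input and comparison VCFs in the discordance step would give the control-only sites instead.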


