appropriate members for generating "known-sites" list

wbsimey

I have 46 complete genomes and a good reference genome. Two of the individuals are "outgroups" (two different species). The rest are the same species as the reference genome. One of the outgroups hybridizes with the ingroup (we are studying this admixture). I have gVCF files for all individuals generated by HaplotypeCaller. When selecting and filtering variants to generate a "known-sites" list, should I exclude the outgroups? That seems like the right thing to do, but I could not think of a reason why adding the two outgroups would be a problem. Perhaps they will have unique SNPs and compromise the "known-sites" list?
Also, when creating a database of gVCFs (GenomicsDBImport), should I include all individuals and then exclude the outgroups at the GenotypeGVCFs step? I could not find an obvious option for excluding individuals, except perhaps --annotations-to-exclude.


