Combine multi-sample GVCFs
Hi GATK experts,
I have 6144 single-sample GVCFs with different ploidies, so I can't use GenomicsDBImport to consolidate them into a single input for GenotypeGVCFs. I tried running all 6144 GVCFs through CombineGVCFs in one go, but hit ulimit constraints that I couldn't resolve even after raising the 'nproc' and 'nofile' limits to the required values. I think this is caused by a conflict with the SGE environment or some other aspect of our cluster setup. I have previously run 384 GVCFs through CombineGVCFs and on to the final steps successfully, so I have now divided the 6144 GVCFs into 16 batches of 384 each. I am running each of these 16 batches through CombineGVCFs separately for each chromosome (12 chromosomes in total), which will produce 16 x 12 = 192 multi-sample GVCFs. My question: can CombineGVCFs be used to merge multi-sample GVCFs as well as single-sample GVCFs and, if so, will all the annotation fields still be meaningful?
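In case it helps to see the batching plan concretely, here is a rough Python sketch of how the 192 CombineGVCFs jobs are laid out. The sample file names, chromosome names, output names, and `ref.fasta` are placeholders rather than my real paths, and the script only builds the command lines without running them. The flags used are the standard `-R`, `-V`, `-L` and `-O` options of CombineGVCFs.

```python
from typing import List


def make_batches(paths: List[str], batch_size: int) -> List[List[str]]:
    """Split the list of GVCF paths into consecutive batches of batch_size."""
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


def combine_command(reference: str, gvcfs: List[str],
                    interval: str, output: str) -> List[str]:
    """Build one CombineGVCFs command restricted to a single chromosome."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference, "-L", interval, "-O", output]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]  # one -V per input GVCF
    return cmd


def plan_jobs(paths: List[str], chromosomes: List[str],
              reference: str, batch_size: int = 384) -> List[List[str]]:
    """One job per (batch, chromosome) pair."""
    jobs = []
    for b, batch in enumerate(make_batches(paths, batch_size), start=1):
        for chrom in chromosomes:
            # Hypothetical output naming scheme, for illustration only.
            output = f"batch{b:02d}.{chrom}.g.vcf.gz"
            jobs.append(combine_command(reference, batch, chrom, output))
    return jobs


if __name__ == "__main__":
    # Placeholder inputs: 6144 samples, 12 chromosomes.
    samples = [f"sample{i:04d}.g.vcf.gz" for i in range(1, 6145)]
    chromosomes = [f"chr{i}" for i in range(1, 13)]
    jobs = plan_jobs(samples, chromosomes, "ref.fasta")
    print(len(jobs), "CombineGVCFs jobs planned")
```

Each job opens at most 384 input files at once, which is the batch size that already worked on our cluster.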