GATK4 joint genotyping for an exome pipeline: CombineGVCFs or GenomicsDBImport?
I want to use 386 exomes as a normalization group for joint genotyping in an exome diagnostic pipeline. Until now this was done with a “giant combined gvcf” split per chromosome, but I wanted to give GenomicsDBImport a try.
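To make the comparison concrete, here is a minimal Python sketch of the two per-chromosome variants. It only builds the command lines and does not run anything. The helper names, file names and output layout are hypothetical placeholders, not my real pipeline; the GATK options used (-R, -V, -L, -O, --genomicsdb-workspace-path) are the standard ones as far as I know.

```python
def combine_gvcfs_cmd(reference, gvcfs, chrom, out_dir):
    """Build a CombineGVCFs command for one chromosome (hypothetical layout)."""
    cmd = ["gatk", "CombineGVCFs", "-R", reference, "-L", chrom]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]  # one -V per input gVCF
    cmd += ["-O", f"{out_dir}/combined.{chrom}.g.vcf.gz"]
    return cmd


def genomicsdb_import_cmd(gvcfs, chrom, out_dir):
    """Build a GenomicsDBImport command for one chromosome (hypothetical layout)."""
    cmd = [
        "gatk", "GenomicsDBImport",
        "-L", chrom,
        "--genomicsdb-workspace-path", f"{out_dir}/genomicsdb_{chrom}",
    ]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]  # one -V per input gVCF
    return cmd


if __name__ == "__main__":
    # Placeholder inputs; the real normalization group has 386 gVCFs.
    gvcfs = ["sample1.g.vcf.gz", "sample2.g.vcf.gz"]
    for chrom in ["chr1", "chr2"]:
        print(" ".join(combine_gvcfs_cmd("ref.fasta", gvcfs, chrom, "out")))
        print(" ".join(genomicsdb_import_cmd(gvcfs, chrom, "out")))
```

In both cases the result for each chromosome would then be passed to GenotypeGVCFs, either as the combined gVCF or as the GenomicsDB workspace.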
I tried it, and I’m quite disappointed. I think I might be doing something wrong, or maybe GenomicsDBImport is not yet suited to my purpose. So I have some questions.
Building a GenomicsDB with GenomicsDBImport takes longer than a traditional CombineGVCFs per chromosome. That wouldn’t be a problem if I could build it once and for all, and then either give the database plus the patient sample gVCFs to GenotypeGVCFs or add new samples to the database. Do you plan to add this feature?
Because you can’t add a new sample to an already built GenomicsDB, I would have to rebuild it with the new samples at every single pipeline execution. So I don’t see why I should use GenomicsDB, or should I perhaps use the Intel library? It seems to add an unwanted extra level of complexity, and I don’t know whether it is worth it.
Am I missing something?