How to get the best performance when merging 6K samples using CombineGVCFs and GenotypeGVCFs (GATK 4.0.9)
I am looking for best practices for merging 6K samples using CombineGVCFs and GenotypeGVCFs to get the best performance. I don't see Spark-suffixed versions of these tools; does that mean they can't make use of Spark?
I also came across some old threads about plans to bring TileDB into the 4.0 version, which could improve performance. How can I make use of TileDB?
PS: I generated the GVCFs using HaplotypeCallerSpark in GATK4.