How to get the best performance when merging 6K samples using CombineGVCFs and GenotypeGVCFs (GATK 4.0.9)

I am looking for best practices for merging 6K samples using CombineGVCFs and GenotypeGVCFs with the best possible performance. I don't see versions of these tools with the Spark suffix. Does that mean they can't make use of Spark?

I also came across some old threads about plans to bring TileDB into version 4.0, which could improve performance. How can I make use of TileDB?

PS: I generated the GVCFs using HaplotypeCallerSpark in GATK4.

Answers
