Advice for running GenotypeGVCFs and recalibration on a couple thousand samples

shubhamsaini (UCSD, Member)
edited April 2017


I am working with a couple thousand gVCFs, and I plan to run GenotypeGVCFs and ApplyRecalibration on them jointly. How well does GATK scale to this many samples, and is it even recommended to run the analysis at this scale?

I tried running GenotypeGVCFs in multi-threaded mode, but it runs into a race condition (or some other issue), so a single thread seems to be the only viable option.


