speed up variant detection by splitting genome into chromosomes
I have really deep (150x coverage) data on which I need to perform variant detection. Which of these two options is more effective for speeding up variant detection?
1. I run the whole dataset in one go and use the -nt and -nct options wherever possible.
2. Or, I split the genome BAM files into 3 or 4 sets of chromosomes and then run those sets in parallel (with lower -nt and -nct values).
If I go with option 2, can I merge the VCF files from all the parallel runs (one per set of chromosomes) right after running HaplotypeCaller? Is that the recommended way to make sure the variant set is not too small for recalibration (which is the issue I am facing right now)?
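To make option 2 concrete, here is a minimal Python sketch of how the chromosomes might be divided into roughly balanced sets, with one HaplotypeCaller argument list built per set. It assumes a GATK 3 style command line in which -L restricts a run to the named intervals, so the BAM would not need to be physically split. The chromosome lengths, file names, and both helper functions are hypothetical placeholders for illustration, not part of GATK.

```python
import heapq


def partition_chromosomes(lengths, n_groups):
    """Greedily assign chromosomes to n_groups, balancing total length.

    lengths: dict mapping chromosome name -> length (any consistent unit).
    Returns a list of n_groups lists of chromosome names.
    """
    # Min-heap of (running total length, group index), so the lightest
    # group is always the next one to receive a chromosome.
    heap = [(0, i) for i in range(n_groups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    # Placing the longest chromosomes first gives a better balance.
    for name, length in sorted(lengths.items(), key=lambda kv: kv[1], reverse=True):
        total, idx = heapq.heappop(heap)
        groups[idx].append(name)
        heapq.heappush(heap, (total + length, idx))
    return groups


def haplotypecaller_args(group, bam, reference, out_vcf, nct):
    """Build a GATK 3 style argument list for one set of chromosomes.

    Assumption: each -L flag limits the run to that chromosome.
    """
    args = ["-T", "HaplotypeCaller", "-R", reference, "-I", bam,
            "-o", out_vcf, "-nct", str(nct)]
    for chrom in group:
        args += ["-L", chrom]
    return args


if __name__ == "__main__":
    # Toy lengths, NOT real chromosome sizes.
    toy_lengths = {"chr1": 250, "chr2": 240, "chr3": 200,
                   "chr4": 190, "chr5": 180, "chr6": 170}
    for i, group in enumerate(partition_chromosomes(toy_lengths, 3)):
        # File names below are placeholders.
        print(" ".join(haplotypecaller_args(
            group, "sample.bam", "ref.fasta", "set%d.vcf" % i, nct=4)))
```

Each printed argument list would correspond to one of the parallel runs; whether and when the resulting per-set VCFs should be merged is exactly the open question above.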