Accelerate HaplotypeCaller step
I am using GATK in a clinical context for NGS diagnosis. The issue is that the HaplotypeCaller step takes too much time (2 h per patient).
I have tried the following:
- reducing the BAM file size by keeping only the genomic regions of my diagnostic genes, but it looks like it still runs over the whole hg19 genome.
- asking for "only variants" with the output_mode option, but the output file is exactly the same as the default one.
- using several CPU threads: 1 thread = 147 min, 2 threads = 89 min, 3 threads = 80 min. I don't have many CPUs available, so going above 2 threads is not worthwhile, and it is still not fast enough.
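To put the thread timings above in perspective, here is a minimal sketch that computes the speedup and the per-thread efficiency relative to the single-thread run. The numbers are the ones quoted in the list; the function name `scaling_table` is only illustrative.

```python
# Wall-clock times from the post, in minutes, keyed by number of CPU threads.
timings_min = {1: 147, 2: 89, 3: 80}


def scaling_table(timings):
    """Return (threads, speedup, efficiency) tuples relative to the 1-thread run."""
    baseline = timings[1]
    rows = []
    for threads in sorted(timings):
        speedup = baseline / timings[threads]   # how many times faster than 1 thread
        efficiency = speedup / threads          # 1.0 would be perfect scaling
        rows.append((threads, speedup, efficiency))
    return rows


for threads, speedup, efficiency in scaling_table(timings_min):
    print(f"{threads} thread(s): speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
```

The second thread gives roughly a 1.65x speedup, while the third adds very little on top of that, which matches the impression that going beyond 2 threads is not worth it here.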
I can't use the data thread option right now. Would it save more time than the CPU thread option?
There is also the interval option, but I don't think it would save enough time, since I have genes of interest on almost all chromosomes.
I would appreciate your guidance on this problem. How would you make the HaplotypeCaller step faster?
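For reference, an interval list does not have to be one region per chromosome: it can hold many small regions spread across the genome. The sketch below builds such a list in BED format (0-based, half-open) from a set of gene regions, padding and merging overlaps. The coordinates are made-up placeholders, not real gene positions, and the helper names `merge_regions` and `to_bed` are hypothetical; whether the resulting file saves enough time would need to be tested on real data.

```python
from collections import defaultdict


def merge_regions(regions, padding=0):
    """Pad (chrom, start, end) regions and merge overlaps within each chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((max(0, start - padding), end + padding))

    merged = []
    for chrom, spans in by_chrom.items():  # chromosomes kept in first-seen order
        current_start, current_end = None, None
        for start, end in sorted(spans):
            if current_end is None:
                current_start, current_end = start, end
            elif start > current_end:
                # Gap before this span: close the running interval, start a new one.
                merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
            else:
                # Overlapping or touching span: extend the running interval.
                current_end = max(current_end, end)
        merged.append((chrom, current_start, current_end))
    return merged


def to_bed(regions):
    """Format regions as tab-separated BED lines."""
    return "".join(f"{chrom}\t{start}\t{end}\n" for chrom, start, end in regions)


# Placeholder coordinates for illustration only (not real gene positions).
gene_regions = [
    ("chr1", 1000, 2000),
    ("chr1", 1900, 3000),
    ("chr7", 500, 800),
    ("chr17", 10000, 12000),
]

merged = merge_regions(gene_regions, padding=50)
print(to_bed(merged), end="")
```

A file written this way can then be supplied to the interval option, so that only the listed regions are processed rather than the whole genome.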
Many thanks in advance.