GenomicsDBImport too slow on local server

dbecker (Munich, Member)

Hi,

I tried using GenomicsDBImport for our data. In my test case I tried importing chromosome 1 for 223 samples. Since most samples are panels and we have only a few genomes and exomes, I thought it would be best to always call everything together.
My command line:

opt/gatk/4.0.0.0/gatk --java-options "-Xmx8G -Xms8G" GenomicsDBImport \
    --sample-name-map [...]/all_samples.sample_map \
    --genomicsdb-workspace-path [...]/germline_snp_database_1 \
    --batch-size 50 \
    -L NC_000001 \
    --reader-threads 5

I only use 5 reader threads because I plan on parallelizing with scatter-gather later on. The command has been running for 14 hours on a local server. Is there something wrong, or is there something I can do to make it reasonably fast? So far the GATK 3.8 pipeline is way faster.
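
A minimal sketch of the scatter step mentioned above, assuming the contig is split into fixed-size intervals and each interval is imported into its own workspace. The helper names (make_intervals, print_import_commands), the SAMPLE_MAP and WORKSPACE_PREFIX variables, and the chunk size are hypothetical and not taken from the command above; the functions only print commands (a dry run) and do not invoke GATK.

```shell
#!/usr/bin/env bash
# Sketch only: split one contig into fixed-size intervals and print one
# GenomicsDBImport command per interval. Nothing is executed here.

# Placeholder paths (assumptions); replace with the real locations.
SAMPLE_MAP="${SAMPLE_MAP:-all_samples.sample_map}"
WORKSPACE_PREFIX="${WORKSPACE_PREFIX:-germline_snp_database_1}"

make_intervals() {
    # usage: make_intervals <contig> <contig_length> <chunk_size>
    # Prints 1-based, inclusive intervals as contig:start-end.
    local contig=$1 length=$2 chunk=$3 start=1 end
    while [ "$start" -le "$length" ]; do
        end=$((start + chunk - 1))
        if [ "$end" -gt "$length" ]; then
            end=$length
        fi
        echo "${contig}:${start}-${end}"
        start=$((end + 1))
    done
}

print_import_commands() {
    # usage: print_import_commands <contig> <contig_length> <chunk_size>
    # Prints one import command per interval, each with its own workspace.
    local i=0 interval
    for interval in $(make_intervals "$1" "$2" "$3"); do
        echo "gatk --java-options \"-Xmx8G -Xms8G\" GenomicsDBImport" \
             "--sample-name-map ${SAMPLE_MAP}" \
             "--genomicsdb-workspace-path ${WORKSPACE_PREFIX}_part${i}" \
             "--batch-size 50 -L ${interval} --reader-threads 5"
        i=$((i + 1))
    done
}
```

For example, print_import_commands NC_000001 "$CONTIG_LENGTH" 50000000 would list one command per 50 Mb slice, where CONTIG_LENGTH is the contig length read from the reference index. The printed commands could then be run in parallel, as far as memory and disk throughput on the server allow.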

Thanks & best regards,
Daniel
