I'm trying to call genotypes on ~160 S. cerevisiae genomes by joint calling. When I tried to do it on the whole genome with a single command, it ran out of memory (even with 48 GB provided). Now I'm doing it one chromosome at a time:
gatk-22.214.171.124/gatk --java-options "-Xmx48G" GenotypeGVCFs -R Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fasta -V combined_variants.vcf.gz -O called_genotypes_II.vcf.gz -ploidy 1 -L II
It appears to make progress for the first 25 minutes, but there has been no console activity for the last 90 minutes. The CPU monitor shows only periodic spikes, not sustained activity. I even ran it overnight with exactly the same behavior: 25 minutes of console updates, then nothing. Chromosome I worked fine, but II never completes. Are there other similar reports?
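For reference, the per-chromosome runs could be scripted roughly as below. This is only a sketch under some assumptions: the `build_cmd` helper and the `GATK`, `REF`, `GVCF`, and `CHROMS` variables are hypothetical names introduced here, the chromosome list assumes the reference uses the contig names I through XVI plus Mito, and `gatk` is assumed to be on the PATH. It only prints the commands (a dry run) and uses the same options as the command above.

```shell
#!/bin/sh
# Dry run: print one GenotypeGVCFs command per chromosome.
# All names below are illustrative assumptions, not part of GATK itself.
GATK="${GATK:-gatk}"   # assumed: path to the gatk launcher
REF="Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fasta"
GVCF="combined_variants.vcf.gz"
# Assumed contig names; check them against the reference's .fai or .dict file.
CHROMS="I II III IV V VI VII VIII IX X XI XII XIII XIV XV XVI Mito"

# Hypothetical helper: print the command for the chromosome given as $1.
build_cmd() {
  printf '%s --java-options "-Xmx48G" GenotypeGVCFs -R %s -V %s -O called_genotypes_%s.vcf.gz -ploidy 1 -L %s\n' \
    "$GATK" "$REF" "$GVCF" "$1" "$1"
}

for chrom in $CHROMS; do
  build_cmd "$chrom"
done
```

Piping the output to `sh` would run the commands one after another, and writing each run's output to its own log file would make it easier to see where chromosome II stops reporting progress.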