Undocumented use of CPU resources
I have been trying to familiarize myself with GATK and noticed a behavior that I think is problematic. Specifically, I am trying to call variants from RNA-seq data using this guide: https://www.broadinstitute.org/gatk/guide/article?id=3891
Part of this processing chain is the GATK "module" SplitNCigarReads. According to the documentation, this module does not accept -nt or -nct arguments to increase parallelism. However, on my system it greedily consumes all the CPUs it can see. In a shared environment this is not really OK, since it leads to oversubscription of compute resources. For example, assuming each job needs 1 CPU, I launched a pipeline that runs 10 of these jobs in parallel on the same node, so naturally I am seeing problems related to oversubscribed CPUs.
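To make the scale of the problem concrete, here is a rough sketch of the arithmetic. The 16-core node and the "one thread per visible core" behavior are assumptions for illustration only; I have not confirmed how many threads SplitNCigarReads actually starts, and the function name is made up for this example.

```python
def oversubscription_factor(jobs, threads_per_job, cores):
    """Ratio of busy threads to cores; a value above 1 means oversubscribed."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return (jobs * threads_per_job) / cores

# Hypothetical node size, chosen only for illustration.
cores = 16

# What the pipeline was planned for: 10 jobs, 1 thread each.
planned = oversubscription_factor(jobs=10, threads_per_job=1, cores=cores)

# Assumed worst case: each job starts one busy thread per visible core.
worst_case = oversubscription_factor(jobs=10, threads_per_job=cores, cores=cores)

print(f"planned load factor:    {planned}")
print(f"worst-case load factor: {worst_case}")
```

Under these assumptions the node goes from comfortably under-loaded to ten times more busy threads than cores, which would be consistent with the slowdown I am seeing.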
Is this behavior intended?