Undocumented use of CPU resources
I have been trying to familiarize myself with GATK and noticed a behavior that I think is problematic. Specifically, I am trying to call variants from RNA-seq data using this guide: https://www.broadinstitute.org/gatk/guide/article?id=3891
Part of this processing chain is the GATK "module" SplitNCigarReads. According to the documentation, this module does not accept the -nt or -nct arguments to increase parallelism. However, on my system it greedily consumes all the CPUs it can see. In a shared environment this is not really OK, since it leads to oversubscription of compute resources. For example, assuming each job would use 1 CPU, I launched a pipeline that runs 10 of these jobs in parallel on the same node, so naturally I am seeing problems related to oversubscribed CPUs.
Is this behavior intended?