GATK - [BaseRecalibratorSpark low performance]

Dear GATK team,

I'd like to run Spark-enabled GATK tools on a Spark cluster. Specifically, I launch a Spark cluster in standalone mode and submit the BaseRecalibratorSpark application via Slurm. Before the official release I was running gatk-4.beta.6-17, with the following allocated resources and the following command line for the Spark arguments:

./gatk-launch BaseRecalibratorSpark \ --sparkRunner SPARK --sparkMaster spark://${MASTER} --driver-memory 80g --num-executors 16 --executor-memory 8g

With that setup the run took 3.79 min. However, with the official release GATK-, using the same data files and the same Spark arguments, I no longer see that speed-up: the run takes about 40 min. Am I missing something in the new version, or in the command line I use to invoke it?

Thanks in advance for your time and kind answer.

Best, Giuseppe
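For readers comparing the two runs, here is a minimal Python sketch that only restates the numbers quoted in the post: the memory the Spark arguments request in total, and how much slower the release run is. The flag names and values are copied from the post. The helper `gib`, the variable names, and the choice to treat "~ 40 min" as exactly 40 are assumptions made for illustration. It does not call GATK or Spark and says nothing about the cause of the slowdown.

```python
# Spark arguments exactly as quoted in the post (beta-era flag names).
spark_args = {
    "--sparkRunner": "SPARK",
    "--sparkMaster": "spark://${MASTER}",
    "--driver-memory": "80g",
    "--num-executors": "16",
    "--executor-memory": "8g",
}


def gib(value: str) -> int:
    """Hypothetical helper: parse a memory string such as '8g' into gigabytes."""
    if not value.endswith("g"):
        raise ValueError(f"expected a value like '8g', got {value!r}")
    return int(value[:-1])


# Total memory requested across all executors, then including the driver.
executors = int(spark_args["--num-executors"])
executor_mem_total = executors * gib(spark_args["--executor-memory"])
cluster_mem_total = executor_mem_total + gib(spark_args["--driver-memory"])

# Wall-clock times reported in the post; the release figure is approximate.
beta_minutes = 3.79
release_minutes = 40.0  # assumption: "~ 40 min" taken as 40

slowdown = release_minutes / beta_minutes

print(f"executor memory total: {executor_mem_total}g")
print(f"driver + executors:    {cluster_mem_total}g")
print(f"slowdown:              {slowdown:.1f}x")
```

Stating the regression as a ratio alongside the requested resources makes it easier to check whether the Slurm allocation actually matches what the Spark arguments ask for in both runs.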
