GATK - 4.0.0.0 [BaseRecalibratorSpark low performance]

Dear GATK team,

I'd like to run Spark-enabled GATK tools on a Spark cluster. Specifically, I launch a Spark cluster in standalone mode and submit the BaseRecalibratorSpark application via Slurm.

Before the official release, I was running the gatk-4.beta.6-17 version, with the following allocated resources, and the following command line for the Spark arguments:

    ./gatk-launch BaseRecalibratorSpark \
        --sparkRunner SPARK --sparkMaster spark://${MASTER} --driver-memory 80g --num-executors 16 --executor-memory 8g

With that setup the run time came down to 3.79 min. However, with the official release GATK-4.0.0.0, the same data files, and the same Spark arguments, I no longer see that speed-up: the run now takes about 40 min.

Am I missing something in the new version, or in the command line I use to invoke it?

Thanks in advance for your time and kind answer.

Best,
Giuseppe
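For anyone comparing the two runs, a dry-run sketch like the one below may help confirm exactly which Spark arguments are being passed. It only assembles and prints the command line quoted in the post; it does not launch GATK. The default value for MASTER and the variable names are hypothetical placeholders, and the tool-specific arguments (inputs, outputs, reference) are left out because the post does not show them.

```shell
# Dry run: build the command line from the post and print it, without executing it.
# MASTER default is a hypothetical placeholder; set it to the real Spark master host:port.
MASTER="${MASTER:-master-host:7077}"

num_executors=16
executor_mem_gb=8

# Same launcher, tool name, and Spark arguments as quoted in the post.
set -- ./gatk-launch BaseRecalibratorSpark \
    --sparkRunner SPARK \
    --sparkMaster "spark://${MASTER}" \
    --driver-memory 80g \
    --num-executors "${num_executors}" \
    --executor-memory "${executor_mem_gb}g"

cmd_line="$*"
echo "$cmd_line"

# Total executor memory implied by these settings (driver memory not included).
total_executor_mem_gb=$((num_executors * executor_mem_gb))
echo "Total executor memory requested: ${total_executor_mem_gb}g"
```

Printing the assembled command for both versions makes it easy to diff the two invocations and to check that the Slurm allocation covers the memory the executors ask for.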

Issue · GitHub, by Sheila
Issue Number: 2881
State: closed
Closed By: chandrans
