Attention:
The frontline support team will be unavailable to answer questions on April 15th and 17th 2019. We will be back soon after. Thank you for your patience and we apologize for any inconvenience!

GATK - 4.0.0.0 [BaseRecalibratorSpark low performance]

Dear GATK_team, I'd like to run Spark-enabled GATK tools on a Spark cluster. Precisely I am launching a Spark cluster in the standalone mode submitting the BaseRecalibratorSpark application via Slurm. Before the official release, I was running the gatk-4.beta.6-17 version, with the following allocated resources, and the following command line for the Spark arguments: ./gatk-launch BaseRecalibratorSpark \ --sparkRunner SPARK --sparkMaster spark://${MASTER} --driver-memory 80g --num-executors 16 --executor-memory 8g. The speed-up achieved was 3.79 min. However, with the official release GATK-4.0.0.0, with the same datafiles and the same Spark arguments I don't see the same nice speed-up anymore (~ 40 min). Am I missing something with the new version? Or with the invoking command line? Thanks in advance for your time and kind answer. Best, Giuseppe

Issue · Github
by Sheila

Issue Number
2881
State
closed
Last Updated
Assignee
Array
Closed By
chandrans

Answers

Sign In or Register to comment.