What causes BaseRecalibratorSpark to run for a long time and end up failing with memory errors?

Luobin (Idaho State University), Member

Hi, GATK team,

I am testing BaseRecalibrator in GATK 4.5 beta. When run in LOCAL mode, it finishes quickly. However, when I run BaseRecalibratorSpark in SPARK mode, it runs for a long time and eventually fails with memory errors such as:

'java.lang.OutOfMemoryError: GC overhead limit exceeded'

When I look at the stdout of the executors, it contains many messages like this:

14:17:19.753 INFO KnownSitesCache - Number of variants read: 37000001

I also tested HaplotypeCallerSpark on the same SPARK cluster, and it finishes quickly.

Issue · GitHub
Filed by Sheila
Issue Number: 2593
State: open