Memory error when using scatter-gather (HaplotypeCaller + GenotypeGVCFs) on 100 WES samples?

theomarker (qingdao) Member
edited March 2018 in Ask the GATK team

Dear GATK team,
I have prepared 100 processed WES BAMs and am trying to call variants and write them to a single VCF. I used the scatter-gather WDL scripts following https://software.broadinstitute.org/wdl/documentation/article?id=7614 .
I submit the job to one node with 40 cores. I tried using more cores (from 2 or more nodes), but it seems HaplotypeCaller only runs on one node. One node has about 126 GB of memory.
The problem is that the job seems to need too much memory, and the node becomes unreachable.
Does scatter-gather need a lot of memory?

Part of my log file:
2018-03-28 21:20:13,921 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:11:1]: Status change from - to WaitingForReturnCodeFile
2018-03-28 21:20:13,921 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:53:1]: job id: 4800
2018-03-28 21:20:13,922 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:63:1]: Status change from - to WaitingForReturnCodeFile
2018-03-28 21:20:13,923 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:25:1]: Status change from - to WaitingForReturnCodeFile
2018-03-28 21:20:13,924 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:53:1]: Status change from - to WaitingForReturnCodeFile
2018-03-28 21:20:13,945 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:84:1]: job id: 4937
2018-03-28 21:20:13,955 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:84:1]: Status change from - to WaitingForReturnCodeFile

stderr from one shard (shard-0/execution):
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/work/home/jiahuan/Heart_WES/HTX/cromwell-executions/jointCallingGenotypes/d71d6520-d369-43c8-bfce-f0521358f8e9/call-HaplotypeCallerERC/shard-0/execution/tmp.2jWDLA

Note: GATK 3.7

Thanks!

Answers

  • Sheila (Broad Institute) Member, Broadie, Moderator, admin

    @theomarker
    Hi,

    At first I thought this might be related to this issue, but I see you are using version 3. It may still be worth trying to add -newQual to your command.

    -Sheila
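
    A minimal sketch of where that flag would go on a GATK 3.x HaplotypeCaller command line (the reference, BAM, heap size, and output names are placeholders, and -newQual is assumed to be available in the 3.x build in use):

        java -Xmx8g -jar GenomeAnalysisTK.jar \
            -T HaplotypeCaller \
            -R reference.fasta \
            -I sample.bam \
            --emitRefConfidence GVCF \
            -newQual \
            -o sample.g.vcf.gz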

  • theomarker (qingdao) Member
    Accepted Answer

    @Sheila said:
    @theomarker
    Hi,

    At first I thought this might be related to this issue, but I see you are using version 3. It may still be worth trying to add -newQual to your command.

    -Sheila

    Hi @Sheila, thanks a lot for your reply. I just figured out the problem by setting up a config file for Cromwell and adding a runtime {} block to the WDL script. I set the memory and cpu for each job in the scatter step. Besides, the concurrent-job-limit setting in the Cromwell config file can also help limit the number of parallel jobs so that every job has enough memory. Moreover, I added a bsub command to my Cromwell config file since I use an HPC LSF cluster.
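
    To make the fix concrete: each scattered HaplotypeCallerERC shard starts its own JVM, so with roughly 100 shards landing on one 126 GB node, even a modest per-shard heap adds up quickly unless Cromwell is told how much each call needs and how many calls may run at once. A minimal sketch of a runtime {} block inside the scattered task (the values are assumptions, not the poster's actual settings):

        task HaplotypeCallerERC {
          # ... inputs and command as in the joint-calling tutorial ...
          runtime {
            memory: "8 GB"   # per-shard memory request
            cpu: 1           # per-shard core count
          }
        }

    And a minimal sketch of a matching Cromwell backend configuration for an LSF cluster, with concurrent-job-limit capping how many shards run at once (the limit, memory values, and runtime-attribute names are placeholders):

        backend {
          default = LSF
          providers {
            LSF {
              actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
              config {
                # cap the number of shards Cromwell submits at the same time
                concurrent-job-limit = 10
                runtime-attributes = """
                Int cpu = 1
                Int memory_mb = 8192
                """
                # submit each shard through bsub with its declared cpu and memory
                submit = """
                bsub -J ${job_name} -cwd ${cwd} -n ${cpu} -M ${memory_mb} \
                     -o ${out} -e ${err} /usr/bin/env bash ${script}
                """
                kill = "bkill ${job_id}"
                check-alive = "bjobs ${job_id}"
                job-id-regex = "Job <(\\d+)>.*"
              }
            }
          }
        }

    Attributes declared in runtime-attributes are substituted into the bsub line for each shard, so the per-shard memory and cpu requested in the WDL runtime block can be passed through to LSF, while concurrent-job-limit keeps the total footprint on the node bounded.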

  • Sheila (Broad Institute) Member, Broadie, Moderator, admin

    @theomarker
    Hi,

    Thanks for reporting back.

    -Sheila
