Memory error when using scatter-gather (HaplotypeCaller + GenotypeGVCFs) on 100 WES samples?

theomarker (Qingdao), Member
edited March 2018 in Ask the GATK team

Dear GATK team,
I have prepared 100 processed WES BAMs and am trying to call variants on them and output a single joint VCF. I used the scatter-gather WDL scripts following https://software.broadinstitute.org/wdl/documentation/article?id=7614 .
I submit the job to one node with 40 cores. I tried requesting more cores (from 2 or more nodes), but it seems HaplotypeCaller only runs on one node. Each node has about 126 GB of memory.
The problem is that the job seems to need too much memory, and the node becomes unreachable.
Does scatter-gather need a lot of memory?
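
For readers following along, below is a rough sketch (draft-2 WDL) of the scatter-gather layout that article describes. The workflow and task names match the log output further down, but the inputs, paths, and flags are illustrative rather than taken from the actual scripts, and ancillary inputs such as the reference index, dictionary, and BAM indexes are omitted for brevity.

workflow jointCallingGenotypes {
  File gatk                              # GenomeAnalysisTK.jar
  File refFasta
  Array[Array[String]] inputSamples      # one row per sample: [name, bam]

  # Scatter: one HaplotypeCaller GVCF job per sample, run in parallel by Cromwell.
  scatter (sample in inputSamples) {
    call HaplotypeCallerERC {
      input: gatk=gatk, refFasta=refFasta, sampleName=sample[0], bamFile=sample[1]
    }
  }

  # Gather: joint-genotype all per-sample GVCFs into a single VCF.
  call GenotypeGVCFs {
    input: gatk=gatk, refFasta=refFasta, gvcfs=HaplotypeCallerERC.gvcf
  }
}

task HaplotypeCallerERC {
  File gatk
  File refFasta
  String sampleName
  File bamFile
  command {
    java -Xmx8g -jar ${gatk} -T HaplotypeCaller \
      -R ${refFasta} -I ${bamFile} -ERC GVCF -o ${sampleName}.g.vcf
  }
  output { File gvcf = "${sampleName}.g.vcf" }
}

task GenotypeGVCFs {
  File gatk
  File refFasta
  Array[File] gvcfs
  command {
    java -Xmx8g -jar ${gatk} -T GenotypeGVCFs \
      -R ${refFasta} -V ${sep=" -V " gvcfs} -o joint.vcf
  }
  output { File jointVCF = "joint.vcf" }
}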

Part of my log file:
2018-03-28 21:20:13,921 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:11:1]: Status change from - to WaitingForReturnCodeFile
2018-03-28 21:20:13,921 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:53:1]: job id: 4800
2018-03-28 21:20:13,922 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:63:1]: Status change from - to WaitingForReturnCodeFile
2018-03-28 21:20:13,923 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:25:1]: Status change from - to WaitingForReturnCodeFile
2018-03-28 21:20:13,924 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:53:1]: Status change from - to WaitingForReturnCodeFile
2018-03-28 21:20:13,945 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:84:1]: job id: 4937
2018-03-28 21:20:13,955 INFO - BackgroundConfigAsyncJobExecutionActor [UUID(d71d6520)jointCallingGenotypes.HaplotypeCallerERC:84:1]: Status change from - to WaitingForReturnCodeFile

stderr of one shard (shard-0/execution):
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/work/home/jiahuan/Heart_WES/HTX/cromwell-executions/jointCallingGenotypes/d71d6520-d369-43c8-bfce-f0521358f8e9/call-HaplotypeCallerERC/shard-0/execution/tmp.2jWDLA

Note: GATK 3.7

Thanks!



Answers

  • Sheila (Broad Institute), Member, Broadie admin

    @theomarker
    Hi,

    At first I thought this may be related to this issue, but I see you are using version 3. It may still be worth trying to add -newQual to your command.

    -Sheila
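
    For anyone wondering where that flag goes, here is a minimal sketch of a GATK 3-style HaplotypeCaller command with -newQual added. The paths and sample name are placeholders, not the original poster's command, and the flag (shorthand for the newer quality model) is assumed to be accepted by recent 3.x builds.

    # Illustrative only: a per-sample HaplotypeCaller GVCF call with -newQual.
    java -jar GenomeAnalysisTK.jar -T HaplotypeCaller \
        -R reference.fasta \
        -I sample1.bam \
        -ERC GVCF \
        -newQual \
        -o sample1.g.vcf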

  • theomarker (Qingdao), Member
    Accepted Answer

    @Sheila said:
    @theomarker
    Hi,

    At first I thought this may be related to this issue, but I see you are using version 3. It may still be worth trying to add -newQual to your command.

    -Sheila

    Hi @Sheila, thanks a lot for your reply. I just figured out the problem by setting up a config file for Cromwell and adding a runtime {} block to the WDL script, so that memory and cpu are set for each job in the scatter step. In addition, the concurrent-job-limit in the Cromwell config file helps limit the number of parallel jobs so that every job has enough memory. Finally, I added a bsub command to my Cromwell config file since I run on an LSF HPC cluster.
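
    For anyone hitting the same problem, here is a rough sketch of the two pieces described above; the attribute names and values are illustrative and not copied from the poster's actual files. First, a runtime {} block inside each scattered WDL task so Cromwell knows the per-shard resources:

    runtime {
      cpu: 1
      memory: "8 GB"
    }

    And a Cromwell configuration excerpt (HOCON) that caps concurrency and submits each shard through LSF's bsub; the submit/kill/check-alive lines follow the pattern of Cromwell's documented config-backend examples:

    include required(classpath("application"))

    backend {
      default = "LSF"
      providers {
        LSF {
          actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
          config {
            # Limit how many shards run at once so each one gets enough memory.
            concurrent-job-limit = 15
            runtime-attributes = """
            Int cpu = 1
            Int memory_mb = 8000
            """
            # Hand each shard to LSF instead of running it on the Cromwell host.
            submit = """
            bsub -J ${job_name} -cwd ${cwd} -o ${out} -e ${err} \
                 -n ${cpu} -M ${memory_mb} \
                 /bin/bash ${script}
            """
            kill = "bkill ${job_id}"
            check-alive = "bjobs ${job_id}"
            job-id-regex = "Job <(\\d+)>.*"
          }
        }
      }
    }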

  • Sheila (Broad Institute), Member, Broadie admin

    @theomarker
    Hi,

    Thanks for reporting back.

    -Sheila
