Can you give some suggestions on running gatk4-germline-joint-discovery pipeline on 799 human WES sa

Hi, GATK team.
I have a cohort of 799 human WES samples and have generated the g.vcf files. Now I want to run the germline-joint-discovery pipeline here.

My local cluster runs Grid Engine, so I run Cromwell with the sge.conf file below. The command is: java -Dconfig.file=$cromwell/sge.conf -jar $cromwell/cromwell-32.jar server >cromwell.log 2>&1

include required(classpath("application"))

system {
  input-read-limits {
    lines = 100000000
  }
}

backend {
  default = SGE

  providers {
    SGE {
      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
      config {
        concurrent-job-limit = 50

        runtime-attributes = """
        Int cpu = 1
        Float? memory_gb
        String? sge_queue
        String? sge_project
        String? docker
        String? docker_user
        """

        submit = """
        qsub \
        -terse \
        -V \
        -b y \
        -N ${job_name} \
        -wd ${cwd} \
        -o ${out} \
        -e ${err} \
        -pe mpi ${cpu} \
        ${"-l mem_free=" + memory_gb + "g"} \
        ${"-q " + sge_queue} \
        ${"-P " + sge_project} \
        /usr/bin/env bash ${script}
        """

        job-id-regex = "(\\d+)"

        kill = "qdel ${job_id}"
        check-alive = "qstat -j ${job_id}"
      }
    }
  }
}

I raised input-read-limits because my BED file is the Agilent SureSelect human exome v6 BED file, which contains 243190 intervals and is about 46 MB in size.

I submitted the job with the proper WDL file and inputs file. Everything seemed OK.

The DynamicallyCombineIntervals task ran very quickly, but it took about half an hour before the call-ImportGVCFs folder appeared. I assume the ImportGVCFs call submits one job per interval; in my case the DynamicallyCombineIntervals output contains 239964 intervals, so it will submit 239964 jobs to SGE. In practice, submission is very slow: it takes about 4-5 minutes on average to submit a job, while an ImportGVCFs job itself runs for only 1-2 minutes. Sometimes dozens of jobs are submitted at once; sometimes only 1 or 2. The shard numbers also seem to be picked at random. From 2018-07-01 14:11 to 2018-07-02 00:57, only 2750 ImportGVCFs intervals finished. Finally, the Cromwell service failed with a "GC overhead limit exceeded" error and exited.

I think the GC error occurred because the machine running the Cromwell service has little memory, about 32G. I will try another machine with 188G of memory.

As this is my first time running GATK with WDL and Cromwell on so many samples, I have no idea why jobs are submitted to SGE so slowly. Is this normal? Do you have any suggestions?

The attached file is my workflow log. I provided the DynamicallyCombineIntervals output intervals file in the inputs file, so you will not see the DynamicallyCombineIntervals task in the log.
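To put the reported throughput in perspective, here is a minimal Python sketch that projects how long the ImportGVCFs scatter alone would take at the rate observed in the first run. It assumes the rate stays constant and ignores the downstream tasks; the function name `projected_days` is hypothetical and used only for illustration.

```python
from datetime import datetime

def projected_days(done, total, start, end):
    """Return (shards per hour, days needed for `total` shards) at the observed rate."""
    elapsed_hours = (end - start).total_seconds() / 3600
    rate_per_hour = done / elapsed_hours
    return rate_per_hour, total / rate_per_hour / 24

# Figures quoted in the post: 2750 shards finished in the observed window,
# out of 239964 intervals in total.
start = datetime(2018, 7, 1, 14, 11)
end = datetime(2018, 7, 2, 0, 57)
rate, days = projected_days(2750, 239964, start, end)
print(f"{rate:.0f} shards/hour, about {days:.0f} days for ImportGVCFs alone")
# prints: 255 shards/hour, about 39 days for ImportGVCFs alone
```

At that pace the scatter would need more than five weeks, so the shard count and the submission rate matter far more than the 1-2 minute runtime of each job.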

Best Answer


  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)


    I am asking someone on the team to get back to you.


  • Ruchi (Member, Broadie, Moderator, Dev, admin)

    Hey @conanyangqun,

    Is there any chance you can give the java process 4-8GB of memory?
    java -Dconfig.file=$cromwell/sge.conf -Xmx8g -Xms8g -jar $cromwell/cromwell-32.jar server >cromwell.log 2>&1

  • conanyangqun (china; Member)

    Hi @Ruchi.
    Thanks for the reply.
    I have now set up a MySQL service to store the Cromwell info, and I run the Cromwell service on a machine with 188GB of memory. From 20180704-18:00 to 20180706-09:00, the call-ImportGVCFs step processed about 13759 intervals, the call-GenotypeGVCFs step 520 intervals, and the call-HardFilterAndMakeSitesOnlyVcf step 100 intervals. The speed is about 350 intervals per hour, and the GC overhead limit exceeded error has not occurred.
    If the -Xmx8g -Xms8g arguments will help increase the processing speed, I will give them a try.

  • Ruchi (Member, Broadie, Moderator, Dev, admin)

    @conanyangqun -- just noticed you mention 239964-- that seems quite large. Is this a specially curated interval list?

  • conanyangqun (china; Member)

    @Ruchi This interval list is a BED file provided by Agilent; you can find info here.

    I do not know how it differs from Broad.human.exome.b37.interval_list; I will check later. The Broad.human.exome.b37.interval_list file has 189894 intervals before the DynamicallyCombineIntervals step, so the difference does not seem big to me; that is, both files are large. :#

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    FWIW those are two different exome targets files; you should use the one that matches the exome capture kit used to prep the libraries for sequencing.

  • conanyangqun (china; Member)

    @Geraldine_VdAuwera The Agilent one is the right one for me, and I use it for the analysis. :D However, it seems it will take more than a month to finish the pipeline on my local cluster. Could you please tell me about your local environment? Maybe I should use cloud computing, although there are some difficulties.

  • conanyangqun (china; Member)

    @Geraldine_VdAuwera I have read a lot about WDL, Cromwell and FireCloud. As I am in China, Google Cloud is not a suitable option for me.

    I attended the workshop in Beijing in April this year. I was excited to see you! :#

    I know AliCloud has set up a Cromwell service, but it does not have a frontend like FireCloud. I cannot estimate the cost, either.

    Anyway, cloud computing seems to be the proper way to process so many samples. I will give it a try sooner or later. :)

    Thank you for your help!

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Ah right, that’s too bad. Well, we are developing a Cromwell job management interface that will make using Cromwell easier, so look out for that.

    We had a great time in Beijing, thanks for being one of the participants there!
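Ruchi's remark that 239964 intervals seems quite large points at the number of scatter shards as the likely bottleneck. As an illustration only, the Python sketch below groups consecutive intervals on the same chromosome into batches so that each batch could become one shard. This is not the pipeline's DynamicallyCombineIntervals implementation; `batch_intervals` and the batch size are hypothetical, and whether a given GATK tool accepts several intervals per invocation depends on the GATK version, which should be checked before trying anything like this.

```python
import math

def batch_intervals(intervals, batch_size):
    """Group consecutive (chrom, start, end) tuples into batches.

    A new batch starts when the current one is full or the chromosome changes.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = []
    for interval in intervals:
        current = batches[-1] if batches else None
        if (current is None
                or len(current) >= batch_size
                or current[-1][0] != interval[0]):
            batches.append([interval])
        else:
            current.append(interval)
    return batches

# Toy input, not taken from the real BED file.
toy = [("1", 100, 200), ("1", 300, 400), ("1", 500, 600),
       ("2", 100, 200), ("2", 300, 400)]
batches = batch_intervals(toy, 2)
print(len(batches))  # 3

# Lower bound on the shard count for 239964 intervals at 100 per batch
# (chromosome boundaries can only add a few more batches).
min_shards = math.ceil(239964 / 100)
print(min_shards)  # 2400
```

Under these assumptions the scheduler would handle a few thousand jobs instead of a few hundred thousand, at the cost of longer individual jobs.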
