Can you give some suggestions on running gatk4-germline-joint-discovery pipeline on 799 human WES sa

Hi, GATK team.
I have a cohort of 799 human WES samples and have generated the g.vcf files. Now I want to run the germline-joint-discovery pipeline on them.

My local cluster runs Grid Engine, so I run Cromwell with the sge.conf file below. The command is: java -Dconfig.file=$cromwell/sge.conf -jar $cromwell/cromwell-32.jar server >cromwell.log 2>&1

include required(classpath("application"))

system {
  input-read-limits {
    lines = 100000000
  }
}
backend {
  default = SGE

  providers {
    SGE {
      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
      config {
        concurrent-job-limit = 50

        runtime-attributes = """
        Int cpu = 1
        Float? memory_gb
        String? sge_queue
        String? sge_project
        String? docker
        String? docker_user
        """

        submit = """
        qsub \
        -terse \
        -V \
        -b y \
        -N ${job_name} \
        -wd ${cwd} \
        -o ${out} \
        -e ${err} \
        -pe mpi ${cpu} \
        ${"-l mem_free=" + memory_gb + "g"} \
        ${"-q " + sge_queue} \
        ${"-P " + sge_project} \
        /usr/bin/env bash ${script}
        """

        job-id-regex = "(\\d+)"

        kill = "qdel ${job_id}"
        check-alive = "qstat -j ${job_id}"
      }
    }
  }
}

I modified input-read-limits because my BED file is the Agilent SureSelect Human Exome V6 BED file, which contains 243190 intervals and is about 46M in size.
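
As a quick sanity check on that setting, here is a minimal Python sketch that counts the records in a BED file and compares the file size with the configured limit. It assumes, as I understand the Cromwell configuration, that input-read-limits values are compared against the file size in bytes. The function names are only illustrative and are not part of Cromwell or GATK.

```python
import os

# Value of system.input-read-limits.lines from the config above.
READ_LINES_LIMIT = 100_000_000


def fits_read_limit(path, limit=READ_LINES_LIMIT):
    """Return True if the file is no larger than `limit`.

    Assumption: the limit is compared against the file size in bytes.
    """
    return os.path.getsize(path) <= limit


def count_intervals(path):
    """Count BED records, skipping blank lines and header lines."""
    n = 0
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped or stripped.startswith(("#", "track", "browser")):
                continue
            n += 1
    return n
```

Under that assumption, a file of about 46M fits comfortably under a limit of 100000000.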

I submitted the job with the proper WDL file and inputs file. Everything seemed OK.

The DynamicallyCombineIntervals task ran very quickly, but it took about half an hour before the call-ImportGVCFs folder appeared. I assume the ImportGVCFs task is submitted once per interval; in my case the DynamicallyCombineIntervals output contains 239964 intervals, so it will submit 239964 jobs to SGE. In practice, submission is very slow. It takes about 4-5 min on average to submit a job, while an ImportGVCFs task itself runs for only 1-2 min. Sometimes it submits dozens of jobs at once; sometimes it submits only 1 or 2. Besides, the order of the shard numbers seems random. From 2018-07-01 14:11 to 2018-07-02 00:57, only 2750 ImportGVCFs intervals finished. Finally, the Cromwell service failed with a GC overhead limit exceeded error and exited.
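
To put these numbers in perspective, here is a rough back-of-the-envelope estimate in Python. It uses only the figures quoted above and assumes the completion rate stays constant, which it may well not. The function name is only illustrative.

```python
from datetime import datetime


def throughput_and_eta(start, end, finished, total):
    """Return (intervals per hour, days needed for all intervals).

    Assumption: the observed rate stays constant for the whole run.
    """
    elapsed_hours = (end - start).total_seconds() / 3600
    rate_per_hour = finished / elapsed_hours
    total_days = total / rate_per_hour / 24
    return rate_per_hour, total_days


# Figures from the first run described above.
rate, total_days = throughput_and_eta(
    start=datetime(2018, 7, 1, 14, 11),
    end=datetime(2018, 7, 2, 0, 57),
    finished=2750,     # ImportGVCFs intervals completed
    total=239964,      # intervals after DynamicallyCombineIntervals
)
print(f"{rate:.0f} intervals/hour, about {total_days:.0f} days for ImportGVCFs")
```

At this rate, that is roughly 255 intervals per hour, or about 39 days for the ImportGVCFs step alone.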

I think the GC error occurred because the machine running the Cromwell service has little memory, about 32G. I will switch to another machine with 188G of memory and try again.

As this is my first time running GATK with WDL and Cromwell on so many samples, I have no idea why jobs are submitted to SGE so slowly. Is this normal? Do you have any suggestions?

The attached file is my workflow log. I provided the DynamicallyCombineIntervals output intervals file in the inputs file, so you cannot see the DynamicallyCombineIntervals task in the log.

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @conanyangqun
    Hi,

    I am asking someone on the team to get back to you.

    -Sheila

  • Ruchi (Member, Broadie, Moderator, Dev, admin)

    Hey @conanyangqun,

    Is there any chance you can give the java process 4-8GB of memory?
    java -Dconfig.file=$cromwell/sge.conf -Xmx8g -Xms8g -jar $cromwell/cromwell-32.jar server >cromwell.log 2>&1

  • conanyangqun, China (Member)

    @Ruchi said:
    Hey @conanyangqun,

    Is there any chance you can give the java process 4-8GB of memory?
    java -Dconfig.file=$cromwell/sge.conf -Xmx8g -Xms8g -jar $cromwell/cromwell-32.jar server >cromwell.log 2>&1

    Hi @Ruchi.
    Thanks for the reply.
    I have now set up a MySQL service to store the Cromwell data, and I run the Cromwell service on a machine with 188GB of memory. From 20180704-18:00 to 20180706-09:00, the call-ImportGVCFs step processed about 13759 intervals, the call-GenotypeGVCFs step 520 intervals, and the call-HardFilterAndMakeSitesOnlyVcf step 100 intervals. The speed is about 350 intervals per hour, and the GC overhead limit exceeded error has not occurred.
    If the -Xmx8g -Xms8g arguments will help improve the processing speed, I will give them a try.

  • Ruchi (Member, Broadie, Moderator, Dev, admin)

    @conanyangqun -- just noticed you mentioned 239964 intervals -- that seems quite large. Is this a specially curated interval list?

  • conanyangqun, China (Member)

    @Ruchi This interval list is a BED file provided by Agilent; you can find info here.

    I do not know the difference between it and Broad.human.exome.b37.interval_list; I will check later. The Broad.human.exome.b37.interval_list file has 189894 intervals before the DynamicallyCombineIntervals step, so I see no big difference between them; that is, both files are large. :#

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)

    FWIW those are two different exome targets files; you should use the one that matches the exome capture kit used to prep the libraries for sequencing.

  • conanyangqun, China (Member)

    @Geraldine_VdAuwera The Agilent one is the right one for me, and I use it for the analysis. :D However, it looks like the pipeline will take more than a month to finish on my local cluster. Could you please tell me about your local environment? Maybe I should use cloud computing, although there are some difficulties.

  • conanyangqun, China (Member)

    @Geraldine_VdAuwera I have read a lot about WDL, Cromwell and FireCloud. As I am in China, Google Cloud is not a suitable option for me.

    I attended the workshop in Beijing in April this year. I was excited to see you! :#

    I know alicloud has set up a Cromwell service, but it does not have a frontend like FireCloud. I cannot estimate the cost, either.

    Anyway, cloud computing seems to be the proper way to process so many samples. I will give it a try sooner or later. :)

    Thank you for your help!

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie, admin)

    Ah right, that’s too bad. Well, we are developing a Cromwell job management interface that will make using Cromwell easier, so look out for that.

    We had a great time in Beijing, thanks for being one of the participants there!
