MuTect2 with Queue

eeuzc, Switzerland, Member
edited August 2016 in Ask the GATK team


I'm trying to run MuTect2 on matched normal-tumor whole genome samples. I know it will take a long time with WGS data, so I want to run it through Queue, as suggested here: http://gatkforums.broadinstitute.org/gatk/discussion/6559/mutect2-runs-much-slower-than-mutect-1-17.

Mutect2 runs fine directly with my samples, and Queue works with the example files that came with the .bz2 file I downloaded. However, I cannot get the two to run together.

This is the error I get:

INFO 19:41:50,429 QScriptManager - Compiling 1 QScript
ERROR 19:41:51,062 QScriptManager - Mutect2.scala:20: type mismatch;
found : java.io.File
required: Seq[java.io.File]
ERROR 19:41:51,064 QScriptManager - mutect2.cosmic = new File("mutect/b37_cosmic_v54_120711.vcf")
ERROR 19:41:51,065 QScriptManager - ^
ERROR 19:41:51,142 QScriptManager - two errors found

ERROR stack trace

org.broadinstitute.gatk.queue.QException: Compile of ../Analysis/mutect/Mutect2.scala failed with 2 errors
at org.broadinstitute.gatk.queue.QScriptManager.loadScripts(QScriptManager.scala:79)
at org.broadinstitute.gatk.queue.QCommandLine.org$broadinstitute$gatk$queue$QCommandLine$$qScriptPluginManager$lzycompute(QCommandLine.scala:94)
at org.broadinstitute.gatk.queue.QCommandLine.org$broadinstitute$gatk$queue$QCommandLine$$qScriptPluginManager(QCommandLine.scala:92)
at org.broadinstitute.gatk.queue.QCommandLine.getArgumentSources(QCommandLine.scala:229)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:213)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:157)
at org.broadinstitute.gatk.queue.QCommandLine$.main(QCommandLine.scala:61)
at org.broadinstitute.gatk.queue.QCommandLine.main(QCommandLine.scala)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.6-0-g89b7209):
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://www.broadinstitute.org/gatk
ERROR MESSAGE: Compile of ../Analysis/mutect/Mutect2.scala failed with 2 errors
ERROR ------------------------------------------------------------------------------------------

INFO 19:41:51,235 QCommandLine - Shutting down jobs. Please wait...

There seems to be some problem with Java (I'm running Java 1.8). This is the Scala code I wrote after reading different discussions here; I haven't really worked with Scala.

    import org.broadinstitute.gatk.queue.QScript
    import org.broadinstitute.gatk.queue.extensions.gatk._
    class Mutect2 extends QScript {

      // Script arguments passed from the command line
      @Input(doc="Normal or unaffected sample", shortName="normal", required=true)
      var normalIn: File = _
      @Input(doc="Tumor or affected sample", shortName="tumor", required=true)
      var tumorIn: File = _
      @Argument(shortName="L", required=false, doc="Intervals file")
      var intervalsFile: List[File] = Nil
      @Argument(shortName="o", required=true, doc="Output file")
      var outputFile: File = _

      // Add functions hard-coded in the script
      def script() {
        val mutect2 = new MuTect2
        //mutect2.jarFile = new File("/software/UHTS/Analysis/GenomeAnalysisTK/3.6/bin/GenomeAnalysisTK")
        mutect2.R = new File("/scratch/cluster/monthly/fsantoni/index/hg19/human_g1k_v37.fasta")
        mutect2.cosmic = new File("/data3/unige/dmg/cmorey/Analysis/mutect/b37_cosmic_v54_120711.vcf")
        mutect2.dbsnp = new File("/data3/unige/dmg/cmorey/Analysis/mutect/dbsnp_132_b37.leftAligned.vcf.gz")
        mutect2.intervalsString = intervalsFile
        mutect2.input_file = List(new TaggedFile(tumorIn, "tumor"), new TaggedFile(normalIn, "normal"))
        mutect2.out = outputFile

And I run it with
java -jar Queue.jar -S ../Analysis/mutect/Mutect2.scala -normal normal.bam -tumor tumor.bam -o Output_mutect2.vcf
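
For readers hitting the same compile error: the message reports that line 20 assigns a single `java.io.File` where a `Seq[java.io.File]` is required, and line 20 of the script is the `mutect2.cosmic` assignment. A minimal sketch of that mismatch and one way to resolve it follows. It is plain Scala that runs without Queue.jar; `MuTect2Stub` and `CosmicFix` are hypothetical stand-ins, not the real Queue extension, and only the `Seq[File]` field type is taken from the compiler output above.

```scala
import java.io.File

// Hypothetical stand-in for the generated MuTect2 Queue extension class.
// Only the field type comes from the compiler message
// (required: Seq[java.io.File]); the rest is illustrative.
class MuTect2Stub {
  var cosmic: Seq[File] = Nil
}

object CosmicFix {
  // Wrap the single COSMIC VCF in a Seq so the assignment type-checks.
  def configure(cosmicPath: String): MuTect2Stub = {
    val mutect2 = new MuTect2Stub
    // mutect2.cosmic = new File(cosmicPath)  // rejected: File is not a Seq[File]
    mutect2.cosmic = Seq(new File(cosmicPath))
    mutect2
  }
}
```

In the QScript itself, the analogous change would presumably be to wrap the COSMIC file in `Seq(...)` on the `mutect2.cosmic` line. The log says two errors were found but shows only this one, so a second assignment may need a similar type fix.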


Best Answer


  • stoneWang (shicheng wang), Member

    I am new to MuTect2 and would like to know how long a run takes without Queue.
    My command is as follows:

    java -jar GenomeAnalysisTK-3.6/GenomeAnalysisTK.jar \
    -T MuTect2 \
    -R human_g1k_v37.fasta \
    -I:normal TCGA1.bam \
    -I:tumor TCGA2.bam \
    --dbsnp dbsnp_132_b37.leftAligned.vcf \
    --cosmic b37_cosmic_v54_120711.vcf \
    -o output1.vcf

    The run is estimated to need a few weeks to finish, but I think it should not take that long even without Queue. I do not know what is wrong.
    Here is the relevant part of the log:

    INFO  22:54:58,582 MuTect2 - Using global mismapping rate of 45 => -4.5 in log10 likelihood units
    WARN  22:55:15,266 PairHMMLikelihoodCalculationEngine$1 - Failed to load native library for VectorLoglessPairHMM - using Java implementation of LOGLESS_CACHING
    INFO  22:57:58,065 ProgressMeter -         1:69812              0.0     3.0 m          297.7 w        0.0%    13.2 w      13.2 w
    INFO  22:58:58,100 ProgressMeter -         1:69812              0.0     4.0 m          397.0 w        0.0%    17.6 w      17.6 w
    INFO  22:59:58,125 ProgressMeter -         1:69812              0.0     5.0 m          496.3 w        0.0%    22.0 w      22.0 w

    The machine has about 2 GB of vmem.
    Thank you very much.

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie admin
    We don't have precise recommendations for runtime, but I can tell you that you will need to parallelize the run if you want to get done anytime soon.

    If you don't know how to do this I would recommend you look into the FireCloud platform.
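
One simple way to parallelize by hand is to restrict each MuTect2 run to a single contig with `-L` and launch the runs as separate jobs. The sketch below only builds the command lines; it does not run them. The file names are taken from the command posted above, while the contig list (1 to 22, X, Y, assumed from the b37 reference), the `output.<contig>.vcf` naming and the `PerContigCommands` object are illustrative assumptions. Merging the per-contig VCFs afterwards is not shown.

```scala
object PerContigCommands {
  // Contig names assumed for the b37 reference (human_g1k_v37.fasta).
  val contigs: Seq[String] = (1 to 22).map(_.toString) ++ Seq("X", "Y")

  // Build one MuTect2 command line restricted to a single contig via -L.
  // The output file name pattern is hypothetical.
  def command(contig: String): String =
    Seq(
      "java -jar GenomeAnalysisTK-3.6/GenomeAnalysisTK.jar",
      "-T MuTect2",
      "-R human_g1k_v37.fasta",
      "-I:normal TCGA1.bam",
      "-I:tumor TCGA2.bam",
      "--dbsnp dbsnp_132_b37.leftAligned.vcf",
      "--cosmic b37_cosmic_v54_120711.vcf",
      s"-L $contig",
      s"-o output.$contig.vcf"
    ).mkString(" ")

  // Print one command per contig, ready to submit as independent jobs.
  def main(args: Array[String]): Unit =
    contigs.map(command).foreach(println)
}
```

Each printed line can be submitted as its own cluster job. Splitting by contig is a coarse choice: chromosome 1 will take far longer than chromosome 21, so finer interval lists would balance the load better.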