Total missing values: 2

I am new to GATK development. Below is my script. When I run it, the following error occurs:
ERROR 10:46:19,225 QGraph - Total missing values: 2
INFO 10:46:19,227 QCommandLine - Writing final jobs report...
INFO 10:46:19,227 QJobsReporter - Writing JobLogging GATKReport to file /home/lxd/文档/queue/testscala.jobreport.txt
INFO 10:46:19,266 QCommandLine - Done with errors
INFO 10:46:19,267 QCommandLine - Script failed with 8 total jobs

The problem seems to come from "add(realignedbam)".
When I delete that line, the dry run completes without errors.
So what is wrong with "var realignedbam"? Where is my mistake?

import org.broadinstitute.sting.gatk.walkers.variantrecalibration
import org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorArgumentCollection
import org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorArgumentCollection.Mode
import org.broadinstitute.sting.queue.extensions.gatk.CountLoci
import org.broadinstitute.sting.queue.extensions.gatk._
import org.broadinstitute.sting.queue.function.JavaCommandLineFunction

class MyScript extends org.broadinstitute.sting.queue.QScript {
@Input(doc="The reference file.", shortName="R")
val reference:File= null
@Input(doc="The left reads.", shortName="lr")
val leftreads:File= null
@Input(doc="The right reads.", shortName="rr")
val rightreads:File= null

@Argument(doc="RGID",shortName="rgid",required = false)
val rgid= "id"
@Argument(doc="RGLB",shortName="rglb",required = false)
val rglb="solexa-123"
@Argument(doc="RGPL",shortName = "rgpl",required=false)
val rgpl="illumina"
@Argument(doc="RGPU",shortName ="rgpu",required = false)
val rgpu="AXL2342"
@Argument(doc="RGSM",shortName ="rgsm",required = false)
val rgsm="NA12878"
@Argument(doc="RGCN",shortName ="rgcn",required = false)
val rgcn="China"
@Argument(doc="RGDT",shortName ="rgdt",required = false)
val rgdt="11/11/2013"

def script ={

add(new CommandLineFunction { def commandLine = "bwa aln " + reference + " " +leftreads +" > leftsai.sai"  })
add(new CommandLineFunction { def commandLine = "bwa aln " + reference + " " +rightreads +" > rightsai.sai"  })
add(new CommandLineFunction { def commandLine = "bwa sampe "+ reference + " leftsai.sai  rightsai.sai "+
  leftreads+" "+ rightreads+ " > combined.sam"})
add(new CommandLineFunction { def commandLine = "java -jar SortSam.jar I=combined.sam O=sorted.bam SO=coordinate"})
add(new CommandLineFunction { def commandLine = "java -jar MarkDuplicates.jar I=sorted.bam O=dedupped.bam"})
add(new CommandLineFunction { def commandLine = "java -jar AddOrReplaceReadGroups.jar I=dedupped.bam" +
  " O=decorated.bam RGID=" +rgid +" RGLB="+rglb+" RGPL="+ rgpl+" RGPU="+rgpu+" RGSM="+rgsm+" RGCN="+rgcn+" RGDT="+rgdt})

var realignerIntervals=new RealignerTargetCreator
realignerIntervals.reference_sequence=reference
val bamfile=new File("decorated.bam")
realignerIntervals.input_file=List(bamfile)
val vcf1000g=new File("1000G_phase1.indels.b37.vcf")
val vcfmills=new File("Mills_and_1000G_gold_standard.indels.b37.vcf")

realignerIntervals.known=List(vcf1000g,vcfmills)
var realignedbam=new IndelRealigner    


add(realignerIntervals)
add(realignedbam)

}
}

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    It looks like you're not defining the reference sequence and input to the realignedbam step. Those are required values that you must define in your script. Have a look at the presentations here for more details:

    http://www.broadinstitute.org/gatk/guide/events?id=3391

  • zhangrui9x (China; Member)
    edited November 2013

    Thank you very much for your attention. However, I actually did set the reference sequence and the input for the realignedbam step in the first place, and the same error occurred. I only deleted those two lines because I wanted to figure out which step was wrong.
    The same error occurs with the code below:

    import org.broadinstitute.sting.gatk.walkers.variantrecalibration 
    import org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorArgumentCollection 
    import org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorArgumentCollection.Mode 
    import org.broadinstitute.sting.queue.extensions.gatk.CountLoci 
    import org.broadinstitute.sting.queue.extensions.gatk._ 
    import org.broadinstitute.sting.queue.function.JavaCommandLineFunction
    
    class MyScript extends org.broadinstitute.sting.queue.QScript { 
    
        @Input(doc="The reference file.", shortName="R") 
        val reference:File= null 
    
        @Input(doc="The left reads.", shortName="lr") 
        val leftreads:File= null 
    
        @Input(doc="The right reads.", shortName="rr") 
        val rightreads:File= null
    
        @Argument(doc="RGID",shortName="rgid",required = false) 
        val rgid= "id" 
    
        @Argument(doc="RGLB",shortName="rglb",required = false) 
        val rglb="solexa-123" 
    
        @Argument(doc="RGPL",shortName = "rgpl",required=false) 
        val rgpl="illumina" 
    
        @Argument(doc="RGPU",shortName ="rgpu",required = false) 
        val rgpu="AXL2342" 
    
        @Argument(doc="RGSM",shortName ="rgsm",required = false) 
        val rgsm="NA12878" 
    
        @Argument(doc="RGCN",shortName ="rgcn",required = false) 
        val rgcn="China" 
    
        @Argument(doc="RGDT",shortName ="rgdt",required = false) 
        val rgdt="11/11/2013"
    
        def script ={
    
            add(new CommandLineFunction { def commandLine = "bwa aln " + reference + " " +leftreads +" > leftsai.sai"  })
            add(new CommandLineFunction { def commandLine = "bwa aln " + reference + " " +rightreads +" > rightsai.sai"  })
            add(new CommandLineFunction { def commandLine = "bwa sampe "+ reference + " leftsai.sai  rightsai.sai "+
              leftreads+" "+ rightreads+ " > combined.sam"})
            add(new CommandLineFunction { def commandLine = "java -jar SortSam.jar I=combined.sam O=sorted.bam SO=coordinate"})
            add(new CommandLineFunction { def commandLine = "java -jar MarkDuplicates.jar I=sorted.bam O=dedupped.bam"})
            add(new CommandLineFunction { def commandLine = "java -jar AddOrReplaceReadGroups.jar I=dedupped.bam" +
              " O=decorated.bam RGID=" +rgid +" RGLB="+rglb+" RGPL="+ rgpl+" RGPU="+rgpu+" RGSM="+rgsm+" RGCN="+rgcn+" RGDT="+rgdt})
    
            var realignerIntervals=new RealignerTargetCreator
            realignerIntervals.reference_sequence=reference
            val bamfile=new File("decorated.bam")
            realignerIntervals.input_file=List(bamfile)
            val vcf1000g=new File("1000G_phase1.indels.b37.vcf")
            val vcfmills=new File("Mills_and_1000G_gold_standard.indels.b37.vcf")
    
            realignerIntervals.known=List(vcf1000g,vcfmills)
            var realignedbam=new IndelRealigner    
            realignedbam.reference_sequence=reference
            realignedbam.input_file=List(bamfile)
            realignedbam.known=List(vcf1000g,vcfmills)
            realignedbam.targetIntervals=realignerIntervals.out
    
            add(realignerIntervals)
            add(realignedbam)
    
    } }
    

    I am really confused!

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Can you post the command line you ran the Qscript with, and the full console output?

  • zhangrui9x (China; Member)

    java -jar Queue.jar -S testscala.scala -R human_g1k_v37.fasta -lr C166_Modified_1.fq -rr C166_Modified_2.fq -startFromScratch

    INFO 09:54:53,588 QScriptManager - Compiling 1 QScript
    INFO 09:54:57,871 QScriptManager - Compilation complete
    INFO 09:54:57,979 HelpFormatter - ----------------------------------------------------------------------
    INFO 09:54:57,979 HelpFormatter - Queue v2.7-2-g6bda569, Compiled 2013/08/28 16:33:34
    INFO 09:54:57,979 HelpFormatter - Copyright (c) 2012 The Broad Institute
    INFO 09:54:57,980 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 09:54:57,980 HelpFormatter - Program Args: -S testscala.scala -R human_g1k_v37.fasta -lr C166_Modified_1.fq -rr C166_Modified_2.fq -startFromScratch
    INFO 09:54:57,980 HelpFormatter - Date/Time: 2013/11/21 09:54:57
    INFO 09:54:57,980 HelpFormatter - ----------------------------------------------------------------------
    INFO 09:54:57,980 HelpFormatter - ----------------------------------------------------------------------
    INFO 09:54:57,987 QCommandLine - Scripting MyScript
    INFO 09:54:58,081 QCommandLine - Added 8 functions
    INFO 09:54:58,081 QGraph - Generating graph.
    ERROR 09:54:58,104 QGraph - Missing 2 values for function: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/home/lxd/文档/queue/.queue/tmp' '-cp' '/home/lxd/文档/queue/Queue.jar' 'org.broadinstitute.sting.gatk.CommandLineGATK' '-T' 'IndelRealigner' '-I' '/home/lxd/文档/queue/decorated.bam' '-R' '/home/lxd/文档/queue/human_g1k_v37.fasta' '-known' '/home/lxd/文档/queue/1000G_phase1.indels.b37.vcf' '-known' '/home/lxd/文档/queue/Mills_and_1000G_gold_standard.indels.b37.vcf'
    ERROR 09:54:58,105 QGraph - @Argument: targetIntervalsString - Intervals file output from RealignerTargetCreator
    ERROR 09:54:58,106 QGraph - @Input: targetIntervals - Intervals file output from RealignerTargetCreator
    INFO 09:54:58,113 QGraph - Will remove outputs from previous runs.
    INFO 09:54:58,122 QGraph - -------
    INFO 09:54:58,122 QGraph - Pending: bwa aln human_g1k_v37.fasta C166_Modified_1.fq > leftsai.sai
    INFO 09:54:58,123 QGraph - Log: /home/lxd/文档/queue/testscala-1.out
    INFO 09:54:58,123 QGraph - -------
    INFO 09:54:58,123 QGraph - Pending: bwa aln human_g1k_v37.fasta C166_Modified_2.fq > rightsai.sai
    INFO 09:54:58,123 QGraph - Log: /home/lxd/文档/queue/testscala-2.out
    INFO 09:54:58,124 QGraph - -------
    INFO 09:54:58,124 QGraph - Pending: bwa sampe human_g1k_v37.fasta leftsai.sai rightsai.sai C166_Modified_1.fq C166_Modified_2.fq > combined.sam
    INFO 09:54:58,124 QGraph - Log: /home/lxd/文档/queue/testscala-3.out
    INFO 09:54:58,124 QGraph - -------
    INFO 09:54:58,125 QGraph - Pending: java -jar SortSam.jar I=combined.sam O=sorted.bam SO=coordinate
    INFO 09:54:58,125 QGraph - Log: /home/lxd/文档/queue/testscala-4.out
    INFO 09:54:58,125 QGraph - -------
    INFO 09:54:58,125 QGraph - Pending: java -jar MarkDuplicates.jar I=sorted.bam O=dedupped.bam
    INFO 09:54:58,126 QGraph - Log: /home/lxd/文档/queue/testscala-5.out
    INFO 09:54:58,126 QGraph - -------
    INFO 09:54:58,126 QGraph - Pending: java -jar AddOrReplaceReadGroups.jar I=dedupped.bam O=decorated.bam RGID=id RGLB=solexa-123 RGPL=illumina RGPU=AXL2342 RGSM=NA12878 RGCN=China RGDT=11/11/2013
    INFO 09:54:58,126 QGraph - Log: /home/lxd/文档/queue/testscala-6.out
    INFO 09:54:58,127 QGraph - -------
    INFO 09:54:58,130 QGraph - Pending: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/home/lxd/文档/queue/.queue/tmp' '-cp' '/home/lxd/文档/queue/Queue.jar' 'org.broadinstitute.sting.gatk.CommandLineGATK' '-T' 'RealignerTargetCreator' '-I' '/home/lxd/文档/queue/decorated.bam' '-R' '/home/lxd/文档/queue/human_g1k_v37.fasta' '-known' '/home/lxd/文档/queue/1000G_phase1.indels.b37.vcf' '-known' '/home/lxd/文档/queue/Mills_and_1000G_gold_standard.indels.b37.vcf'
    INFO 09:54:58,131 QGraph - Log: /home/lxd/文档/queue/testscala-7.out
    INFO 09:54:58,131 QGraph - -------
    INFO 09:54:58,133 QGraph - Pending: 'java' '-Xmx2048m' '-XX:+UseParallelOldGC' '-XX:ParallelGCThreads=4' '-XX:GCTimeLimit=50' '-XX:GCHeapFreeLimit=10' '-Djava.io.tmpdir=/home/lxd/文档/queue/.queue/tmp' '-cp' '/home/lxd/文档/queue/Queue.jar' 'org.broadinstitute.sting.gatk.CommandLineGATK' '-T' 'IndelRealigner' '-I' '/home/lxd/文档/queue/decorated.bam' '-R' '/home/lxd/文档/queue/human_g1k_v37.fasta' '-known' '/home/lxd/文档/queue/1000G_phase1.indels.b37.vcf' '-known' '/home/lxd/文档/queue/Mills_and_1000G_gold_standard.indels.b37.vcf'
    INFO 09:54:58,133 QGraph - Log: /home/lxd/文档/queue/testscala-8.out
    ERROR 09:54:58,133 QGraph - Total missing values: 2
    INFO 09:54:58,135 QCommandLine - Writing final jobs report...
    INFO 09:54:58,135 QJobsReporter - Writing JobLogging GATKReport to file /home/lxd/文档/queue/testscala.jobreport.txt
    INFO 09:54:58,176 QCommandLine - Done with errors
    INFO 09:54:58,177 QCommandLine - Script failed with 8 total jobs
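
Both missing values reported in the log above (targetIntervalsString and targetIntervals) belong to the IndelRealigner step. In the script, targetIntervals is assigned from realignerIntervals.out, but that field is never given a value, so it is presumably still null when Queue builds the graph. The sketch below shows one possible fix under that assumption: give RealignerTargetCreator an explicit output file before passing it on. The class name RealignSketch and the output file names realigner.intervals and realigned.bam are hypothetical, and the bwa and Picard steps are left out for brevity.

```scala
import org.broadinstitute.sting.queue.QScript
import org.broadinstitute.sting.queue.extensions.gatk._

// Sketch only: covers just the two GATK steps from the thread.
class RealignSketch extends QScript {

  @Input(doc="The reference file.", shortName="R")
  var reference: File = _

  def script() = {
    // Same inputs as in the original script.
    val bamfile = new File("decorated.bam")
    val vcf1000g = new File("1000G_phase1.indels.b37.vcf")
    val vcfmills = new File("Mills_and_1000G_gold_standard.indels.b37.vcf")

    val realignerIntervals = new RealignerTargetCreator
    realignerIntervals.reference_sequence = reference
    realignerIntervals.input_file = List(bamfile)
    realignerIntervals.known = List(vcf1000g, vcfmills)
    // Assumption: "out" is not filled in automatically, so set it here.
    // The file name is hypothetical.
    realignerIntervals.out = new File("realigner.intervals")

    val realignedbam = new IndelRealigner
    realignedbam.reference_sequence = reference
    realignedbam.input_file = List(bamfile)
    realignedbam.known = List(vcf1000g, vcfmills)
    // No longer null, so the targetIntervals requirement is satisfied.
    realignedbam.targetIntervals = realignerIntervals.out
    // Hypothetical name for the realigned BAM.
    realignedbam.out = new File("realigned.bam")

    add(realignerIntervals)
    add(realignedbam)
  }
}
```

If the assumption holds, a dry run with the same command line should no longer report "Total missing values: 2", and because the intervals file is now both an output of one step and an input of the next, Queue can order the two jobs.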
