
Qscript of Picard tools

dklevebring Member
edited November 2013 in Ask the GATK team


So I've finally taken the plunge and migrated our analysis pipeline to Queue. With some great feedback from @johandahlberg, I have gotten to a state where most of the stuff is running smoothly on the cluster.

I'm trying to add Picard's CalculateHSMetrics to the pipeline, but am having some issues. This code:

case class hsmetrics(inBam: File, baitIntervals: File, targetIntervals: File, outMetrics: File) extends CalculateHsMetrics with ExternalCommonArgs with SingleCoreJob with OneDayJob {
    @Input(doc="Input BAM file") val bam: File = inBam
    @Output(doc="Metrics file") val metrics: File = outMetrics
    this.input :+= bam
    this.targets = targetIntervals
    this.baits = baitIntervals
    this.output = metrics
    this.reference = refGenome
    this.isIntermediate = false
}

Gives the following error message:

ERROR 06:56:25,047 QGraph - Missing 2 values for function:  'java'  '-Xmx2048m'  '-XX:+UseParallelOldGC'  '-XX:ParallelGCThreads=4'  '-XX:GCTimeLimit=50'  '-XX:GCHeapFreeLimit=10'  '' null 'INPUT=/Users/dankle/tmp/autoseqscala/exampleIND2/exampleIND2.panel.bam'  'TMP_DIR=/Users/dankle/IdeaProjects/eclipse/AutoSeq/.queue/tmp'  'VALIDATION_STRINGENCY=SILENT'  'OUTPUT=/Users/dankle/tmp/autoseqscala/exampleIND2/exampleIND2.panel.preMarkDupsHsMetrics.metrics'  'BAIT_INTERVALS=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/exampleINTERVAL.intervals'  'TARGET_INTERVALS=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/exampleINTERVAL.intervals'  'REFERENCE_SEQUENCE=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/bwaindex0.6/exampleFASTA.fasta'  'METRIC_ACCUMULATION_LEVEL=SAMPLE'  
ERROR 06:56:25,048 QGraph -   @Argument: jarFile - jar 
ERROR 06:56:25,049 QGraph -   @Argument: javaMainClass - Main class to run from javaClasspath 

And yes, it seems that the jar file is currently set to null in the command line. However, MarkDuplicates runs fine without setting the jar:

case class dedup(inBam: File, outBam: File, metricsFile: File) extends MarkDuplicates with ExternalCommonArgs with SingleCoreJob with OneDayJob {
    @Input(doc = "Input bam file") var inbam = inBam
    @Output(doc = "Output BAM file with dups removed") var outbam = outBam
    this.REMOVE_DUPLICATES = true
    this.input :+= inBam
    this.output = outBam
    this.metrics = metricsFile
    this.memoryLimit = 3
    this.isIntermediate = false
}

Why does CalculateHSMetrics need the jar, but not MarkDuplicates? Both are imported with import org.broadinstitute.sting.queue.extensions.picard._.
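
For anyone who hits the same "Missing 2 values" error, a possible workaround along the lines of what is suggested in the answers below is to supply the missing main class explicitly in the case class body. This is only a sketch: the javaMainClass argument name is taken from the error output above and the class name from the Picard log further down in this thread, and the exact field names may differ between Queue versions.

case class hsmetrics(inBam: File, baitIntervals: File, targetIntervals: File, outMetrics: File) extends CalculateHsMetrics with ExternalCommonArgs with SingleCoreJob with OneDayJob {
    @Input(doc="Input BAM file") val bam: File = inBam
    @Output(doc="Metrics file") val metrics: File = outMetrics
    this.input :+= bam
    this.targets = targetIntervals
    this.baits = baitIntervals
    this.output = metrics
    this.reference = refGenome
    this.isIntermediate = false
    // Assumed workaround: tell Queue which Picard main class to run. The class name
    // below is copied from the Picard log output later in this thread.
    this.javaMainClass = "net.sf.picard.analysis.directed.CalculateHsMetrics"
    // Alternatively, the jarFile @Argument named in the error could be pointed at a
    // local Picard jar instead (path here is purely illustrative):
    // this.jarFile = new File("/path/to/picard/CalculateHsMetrics.jar")
}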

Best Answer


  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie

    Huh, so it is indeed missing the class line. I'll patch that in the codebase; in the meantime @pdexheimer's suggestion should work to get your script up and working. (thanks Phil!)

  • Hmm… I'm now getting this:

    [Wed Nov 20 16:37:13 CET 2013] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/exampleINTERVAL.intervals TARGET_INTERVALS=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/exampleINTERVAL.intervals INPUT=/Users/dankle/tmp/autoseqscala/exampleIND2/exampleIND2.panel.bam OUTPUT=/Users/dankle/tmp/autoseqscala/exampleIND2/exampleIND2.panel.hsMetricsPreMarkDups.metrics METRIC_ACCUMULATION_LEVEL=[SAMPLE, ALL_READS] REFERENCE_SEQUENCE=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/bwaindex0.6/exampleFASTA.fasta TMP_DIR=[/Users/dankle/IdeaProjects/eclipse/AutoSeq/.queue/tmp] VALIDATION_STRINGENCY=SILENT    VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
    [Wed Nov 20 16:37:13 CET 2013] Executing as dankle@LM0004MEB.local on Mac OS X 10.8.5 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b43; Picard version: 1.96(1534)
    [Wed Nov 20 16:37:13 CET 2013] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0,00 minutes.
    To get help, see
    Exception in thread "main" java.lang.NullPointerException
    at net.sf.picard.metrics.MultiLevelCollector$Distributor.acceptRecord(
    at net.sf.picard.metrics.MultiLevelCollector.acceptRecord(
    at net.sf.picard.analysis.directed.CollectTargetedMetrics.doWork(
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(

    Any ideas?

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie

    Can you try running the Picard command (using the same inputs, parameters, etc.) directly from the command line? That would tell us whether it's the Picard tool bugging out or Queue misbehaving.

  • Got the same error, sigh. It seems this is on Picard's side. Thanks.

  • For future reference: the latter error happens when METRIC_ACCUMULATION_LEVEL=SAMPLE is set but no read groups are present in the BAM file. If the metric accumulation level is unset, or read groups are added (a hedged sketch of adding them in Queue is at the end of this thread), it runs fine. Sorry about the non-GATK-related part of this post, and thanks for the help identifying the initial issue.

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie

    Ah, thanks for reporting your solution. Feel free to tell the Picard devs they need to add more graceful handling for that error case.

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie

    Update: we reported the issue to the Picard team, and they have developed a fix to handle this error case. Now, for any reads that are missing a read group, there will be a row at whatever level of accumulation is requested, with "unknown" in the appropriate columns.

  • Great work, both teams!
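
As referenced above, here is a hedged sketch of adding read groups in the same Queue pipeline before running CalculateHsMetrics, so that METRIC_ACCUMULATION_LEVEL=SAMPLE has a sample to accumulate over. It assumes the org.broadinstitute.sting.queue.extensions.picard package also provides an AddOrReplaceReadGroups wrapper exposing the standard RGID/RGLB/RGPL/RGPU/RGSM arguments; check the extension source for your Queue version, since those field names are assumptions here.

// Sketch only: assumes an AddOrReplaceReadGroups extension exists in the same picard
// extensions package and exposes the standard read-group fields. Adjust field names
// to match the actual extension source for your Queue version.
case class addReadGroups(inBam: File, outBam: File, sampleId: String) extends AddOrReplaceReadGroups with ExternalCommonArgs with SingleCoreJob with OneDayJob {
    @Input(doc="Input BAM file without read groups") val bam: File = inBam
    @Output(doc="Output BAM file with read groups attached") val rgBam: File = outBam
    this.input :+= bam
    this.output = rgBam
    this.RGID = sampleId       // read group ID (assumed field name)
    this.RGLB = sampleId       // library (assumed)
    this.RGPL = "illumina"     // platform (assumed)
    this.RGPU = "unknown"      // platform unit (assumed)
    this.RGSM = sampleId       // sample name picked up by METRIC_ACCUMULATION_LEVEL=SAMPLE
    this.isIntermediate = true
}

With a step like this in front of hsmetrics, the BAM carries sample-level read groups and CalculateHsMetrics can accumulate per-sample metrics; simply leaving the metric accumulation level unset avoids the crash as well, as noted above.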
