Qscript of Picard tools

dklevebring Member
edited November 2013 in Ask the GATK team

Hi,

So I've finally taken the plunge and migrated our analysis pipeline to Queue. With some great feedback from @johandahlberg, I have gotten to a state where most of the stuff is running smoothly on the cluster.

I'm trying to add Picard's CalculateHsMetrics to the pipeline, but am having some issues. This code:

case class hsmetrics(inBam: File, baitIntervals: File, targetIntervals: File, outMetrics: File) extends CalculateHsMetrics with ExternalCommonArgs with SingleCoreJob with OneDayJob {
    @Input(doc="Input BAM file") val bam: File = inBam
    @Output(doc="Metrics file") val metrics: File = outMetrics
    this.input :+= bam
    this.targets = targetIntervals
    this.baits = baitIntervals
    this.output = metrics
    this.reference =  refGenome
    this.isIntermediate = false
}

Gives the following error message:

ERROR 06:56:25,047 QGraph - Missing 2 values for function:  'java'  '-Xmx2048m'  '-XX:+UseParallelOldGC'  '-XX:ParallelGCThreads=4'  '-XX:GCTimeLimit=50'  '-XX:GCHeapFreeLimit=10'  '-Djava.io.tmpdir=/Users/dankle/IdeaProjects/eclipse/AutoSeq/.queue/tmp' null 'INPUT=/Users/dankle/tmp/autoseqscala/exampleIND2/exampleIND2.panel.bam'  'TMP_DIR=/Users/dankle/IdeaProjects/eclipse/AutoSeq/.queue/tmp'  'VALIDATION_STRINGENCY=SILENT'  'OUTPUT=/Users/dankle/tmp/autoseqscala/exampleIND2/exampleIND2.panel.preMarkDupsHsMetrics.metrics'  'BAIT_INTERVALS=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/exampleINTERVAL.intervals'  'TARGET_INTERVALS=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/exampleINTERVAL.intervals'  'REFERENCE_SEQUENCE=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/bwaindex0.6/exampleFASTA.fasta'  'METRIC_ACCUMULATION_LEVEL=SAMPLE'  
ERROR 06:56:25,048 QGraph -   @Argument: jarFile - jar 
ERROR 06:56:25,049 QGraph -   @Argument: javaMainClass - Main class to run from javaClasspath 

And yeah, it seems that the jar file is currently set to null in the command line. However, MarkDuplicates runs fine without setting the jar:

case class dedup(inBam: File, outBam: File, metricsFile: File) extends MarkDuplicates with ExternalCommonArgs with SingleCoreJob with OneDayJob {
    @Input(doc = "Input bam file") var inbam = inBam
    @Output(doc = "Output BAM file with dups removed") var outbam = outBam
    this.REMOVE_DUPLICATES = true
    this.input :+= inBam
    this.output = outBam
    this.metrics = metricsFile
    this.memoryLimit = 3
    this.isIntermediate = false
}

Why does CalculateHsMetrics need the jar, but not MarkDuplicates? Both are imported with import org.broadinstitute.sting.queue.extensions.picard._.

Best Answer

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    Huh, so it is indeed missing the class line. I'll patch that in the codebase; in the meantime @pdexheimer's suggestion should work to get your script up and working. (thanks Phil!)

  • Hmm… I'm now getting this:

    [Wed Nov 20 16:37:13 CET 2013] net.sf.picard.analysis.directed.CalculateHsMetrics BAIT_INTERVALS=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/exampleINTERVAL.intervals TARGET_INTERVALS=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/exampleINTERVAL.intervals INPUT=/Users/dankle/tmp/autoseqscala/exampleIND2/exampleIND2.panel.bam OUTPUT=/Users/dankle/tmp/autoseqscala/exampleIND2/exampleIND2.panel.hsMetricsPreMarkDups.metrics METRIC_ACCUMULATION_LEVEL=[SAMPLE, ALL_READS] REFERENCE_SEQUENCE=/Users/dankle/IdeaProjects/eclipse/AutoSeq/resources/bwaindex0.6/exampleFASTA.fasta TMP_DIR=[/Users/dankle/IdeaProjects/eclipse/AutoSeq/.queue/tmp] VALIDATION_STRINGENCY=SILENT    VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
    [Wed Nov 20 16:37:13 CET 2013] Executing as dankle@LM0004MEB.local on Mac OS X 10.8.5 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b43; Picard version: 1.96(1534)
    [Wed Nov 20 16:37:13 CET 2013] net.sf.picard.analysis.directed.CalculateHsMetrics done. Elapsed time: 0,00 minutes.
    Runtime.totalMemory()=128974848
    To get help, see http://picard.sourceforge.net/index.shtml#GettingHelp
    Exception in thread "main" java.lang.NullPointerException
    at net.sf.picard.metrics.MultiLevelCollector$Distributor.acceptRecord(MultiLevelCollector.java:146)
    at net.sf.picard.metrics.MultiLevelCollector.acceptRecord(MultiLevelCollector.java:277)
    at net.sf.picard.analysis.directed.CollectTargetedMetrics.doWork(CollectTargetedMetrics.java:123)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
    at net.sf.picard.analysis.directed.CalculateHsMetrics.main(CalculateHsMetrics.java:74)
    

    Any ideas?

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    Can you try running the Picard command (using the same inputs, parameters, etc.) directly from the command line? This is to check whether it's the Picard tool bugging out or Queue misbehaving.

  • Got the same error. Sigh. It seems this is on Picard. Thanks.

  • For future reference: The latter error happens when METRIC_ACCUMULATION_LEVEL=SAMPLE is set but no read groups are present in the BAM file. If the metric accumulation level is unset, or read groups are added, it runs fine. Sorry about the non-GATK-related part of this post, and thanks for the help identifying the initial issue.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    Ah, thanks for reporting your solution. Feel free to tell the Picard devs they need to add more graceful handling for that error case.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    Update: we reported the issue to the Picard team, and they have developed a fix to handle this error case. Now, for any reads that are missing a read group, there will be a row at whatever level of accumulation is requested, with "unknown" in the appropriate columns.

  • Great work, both teams!
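
The second error in this thread came from running with METRIC_ACCUMULATION_LEVEL=SAMPLE on a BAM without read groups. A quick pre-flight check can catch that before the job is submitted. The following is only a minimal sketch, not part of Queue or Picard: the object name `ReadGroupCheck` and its method are hypothetical, and it assumes the SAM header text has already been obtained (for example, from `samtools view -H`). It inspects only the header's @RG lines, so it will not detect individual reads that lack an RG tag.

```scala
// Hypothetical helper (not part of Queue or Picard): scans SAM header text
// for @RG lines and returns the read group IDs it finds.
object ReadGroupCheck {

  /** Returns the ID field of every @RG line in the given SAM header text. */
  def readGroupIds(headerText: String): Seq[String] =
    headerText.split("\n").toSeq
      .filter(_.startsWith("@RG"))
      .flatMap { line =>
        // Header fields are tab-separated TAG:VALUE pairs after the record type.
        line.split("\t").drop(1).find(_.startsWith("ID:")).map(_.drop(3))
      }

  def main(args: Array[String]): Unit = {
    // Reads the header from standard input.
    val header = scala.io.Source.stdin.mkString
    val ids = readGroupIds(header)
    if (ids.isEmpty)
      println("No @RG lines found; add read groups or leave METRIC_ACCUMULATION_LEVEL unset.")
    else
      println("Read groups: " + ids.mkString(", "))
  }
}
```

Once compiled, the header can be piped into it, assuming samtools is installed. If no read groups are reported, either add them to the BAM or leave the metric accumulation level unset, as described above.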
