Picard CalculateHsMetrics for targeted region larger than 2147483647bp.

mnw21cam (Exeter University, Member)

I recently tried to obtain coverage metrics for a whole genome sequencing project (regular WGS, no hybrid selection), using Picard CalculateHsMetrics. Command line:

java -Xmx30g -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4 -jar picard.jar CalculateHsMetrics I=something.bam O=something.HS_metrics BAIT_INTERVALS=genome.interval_list TARGET_INTERVALS=genome.interval_list VALIDATION_STRINGENCY=SILENT METRIC_ACCUMULATION_LEVEL=SAMPLE

This spends two hours calculating, and then fails in picard/analysis/directed/TargetMetricsCollector.java line 423:

final short[] depths = new short[(int) this.metrics.TARGET_TERRITORY]; // may not use entire array

Unfortunately, in this case this.metrics.TARGET_TERRITORY covers the whole human genome, which is larger than Integer.MAX_VALUE (2147483647). The cast to int therefore wraps around to a negative number, causing a java.lang.NegativeArraySizeException.
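To illustrate the wrap-around, here is a minimal standalone sketch. It is not Picard code: the class name, the narrow helper, and the territory of 3,100,000,000 bp (a rough stand-in for a whole human genome) are assumptions made for the example.

```java
public class TerritoryOverflow {

    /** Same narrowing cast as in the failing line: only the low 32 bits survive. */
    static int narrow(long territory) {
        return (int) territory;
    }

    public static void main(String[] args) {
        // Assumed value, roughly the size of a human genome in bp.
        long territory = 3_100_000_000L;

        int size = narrow(territory);
        // 3,100,000,000 - 2^32 = -1,194,967,296
        System.out.println("array size after cast: " + size);

        try {
            // Mirrors "new short[(int) this.metrics.TARGET_TERRITORY]".
            short[] depths = new short[size];
        } catch (NegativeArraySizeException e) {
            System.out.println("allocation failed: " + e);
        }
    }
}
```

Any territory between 2^31 and 2^32 - 1 becomes negative after the cast, so the allocation fails immediately rather than silently producing a short array.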

Now, the limit on the size of a Java array is fixed by the fact that arrays are indexed by an int, not a long. One possible fix is to split the depths array into multiple sub-arrays.
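A rough sketch of what I mean by sub-arrays is below. This is not a patch against Picard; ChunkedShortArray, its methods, and the chunk size of 2^20 elements are all hypothetical names and choices made for illustration. Chunks are allocated lazily here so that untouched regions cost no memory, which is an extra design choice rather than something the fix requires.

```java
public class ChunkedShortArray {

    // 2^20 elements per chunk; an arbitrary choice for this sketch.
    private static final int CHUNK_BITS = 20;
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final short[][] chunks;
    private final long length;

    public ChunkedShortArray(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        this.length = length;
        // Round up so a partial final chunk is still covered.
        long nChunks = (length + CHUNK_SIZE - 1) >>> CHUNK_BITS;
        this.chunks = new short[Math.toIntExact(nChunks)][];
    }

    public long length() {
        return length;
    }

    public short get(long index) {
        checkIndex(index);
        short[] chunk = chunks[(int) (index >>> CHUNK_BITS)];
        // An unallocated chunk behaves like a zero-filled one.
        return chunk == null ? 0 : chunk[(int) (index & CHUNK_MASK)];
    }

    public void set(long index, short value) {
        checkIndex(index);
        int c = (int) (index >>> CHUNK_BITS);
        if (chunks[c] == null) {
            chunks[c] = new short[CHUNK_SIZE];
        }
        chunks[c][(int) (index & CHUNK_MASK)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }

    public static void main(String[] args) {
        ChunkedShortArray depths = new ChunkedShortArray(3_100_000_000L);
        depths.set(3_000_000_000L, (short) 20);
        System.out.println(depths.get(3_000_000_000L));
    }
}
```

The high bits of the long index select the chunk and the low 20 bits select the position inside it, so every individual array stays well below the int limit.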

Are there any plans to implement a fix to remove this limitation on CalculateHsMetrics? I'm only looking for mean read depth and coverage at 20X. Currently, I'm running two separate Picard runs with two halves of the genome, and then combining the results, which is a little bit of a shame.
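For reference, the combining step can be sketched as a territory-weighted average, assuming the two interval lists do not overlap. The class and method names are hypothetical, the numbers in main are made up for illustration, and the sketch assumes each half's metrics file reports TARGET_TERRITORY, MEAN_TARGET_COVERAGE and PCT_TARGET_BASES_20X.

```java
public class CombineHalves {

    /**
     * Combines a per-base metric (mean coverage, or fraction of bases at 20X)
     * from two non-overlapping halves, weighting each by its target territory.
     */
    static double weightedMean(long territoryA, double valueA,
                               long territoryB, double valueB) {
        return (territoryA * valueA + territoryB * valueB)
                / (double) (territoryA + territoryB);
    }

    public static void main(String[] args) {
        // Illustrative values only, not real results.
        long territoryA = 1_500_000_000L;
        long territoryB = 1_600_000_000L;

        double meanCoverage = weightedMean(territoryA, 31.0, territoryB, 29.0);
        double pct20x = weightedMean(territoryA, 0.95, territoryB, 0.93);

        System.out.println("combined mean coverage: " + meanCoverage);
        System.out.println("combined fraction at 20X: " + pct20x);
    }
}
```

A plain average of the two runs would only be correct if both halves had exactly the same territory, which is why the weights are needed.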

Thanks,

Matthew

Answers

  • mnw21cam (Exeter University, Member)

    Thanks for the answer. The reason I am trying to use Picard is in order for the results to be completely comparable to the results from other targeted sequencing projects. Also, Picard runs a lot quicker than DepthOfCoverage.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Oh I see. Well if you just want mean coverage, I would recommend looking at other tools that are more appropriate for that purpose. We have no plans to modify this tool to enable doing something that it's not designed to do.
