Problem encountered when running MarkDuplicatesSpark using GATK 4.0.6

Hi Board folks,

I am trying to run MarkDuplicatesSpark on my Spark cluster and keep getting the following exception:

Caused by: java.lang.NullPointerException
at org.broadinstitute.hellbender.utils.read.markduplicates.sparkrecords.Fragment.<init>(Fragment.java:37)
at org.broadinstitute.hellbender.utils.read.markduplicates.sparkrecords.MarkDuplicatesSparkRecord.newFragment(MarkDuplicatesSparkRecord.java:32)
at org.broadinstitute.hellbender.tools.spark.transforms.markduplicates.MarkDuplicatesSparkUtils.lambda$null$0(MarkDuplicatesSparkUtils.java:115)
at java.util.stream.ReferencePipeline$11$1.accept(ReferencePipeline.java:372)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.broadinstitute.hellbender.tools.spark.transforms.markduplicates.MarkDuplicatesSparkUtils.lambda$transformToDuplicateNames$17d832cf$1(MarkDuplicatesSparkUtils.java:123)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$3$1.apply(JavaRDDLike.scala:143)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$3$1.apply(JavaRDDLike.scala:143)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:199)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

I have tried this both on my remote cluster:
--spark-master spark://{my_cluster_ip}:7077 --spark-runner SPARK
and with local Spark:
--spark-master local[16]

Both give me the same error.
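For context, the full invocation looks roughly like this (the input/output paths here are placeholders, not my exact file names):

gatk MarkDuplicatesSpark \
    -I input.bam \
    -O input.markdup.bam \
    -- \
    --spark-runner SPARK \
    --spark-master spark://{my_cluster_ip}:7077

For the local run I only change the part after the -- separator to --spark-master local[16].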

The version I am using is the latest, 4.0.6.
