
HaplotypeCallerSpark throws error Unable to find class: htsjdk.samtools.reference.AbstractFastaSequenceFile

Hi,

I am trying to run HaplotypeCallerSpark on an Apache Spark cluster, but the job failed with the error below. When I added htsjdk-2.14.0.jar to Spark, it threw a different error: Exception in thread "main" java.lang.NoSuchMethodError: htsjdk.samtools.util.IOUtil.isBlockCompressed(Ljava/nio/file/Path;Z)Z

Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: htsjdk.samtools.reference.AbstractFastaSequenceFile$$Lambda$85/2028177366
Serialization trace:
initializer (htsjdk.samtools.util.Lazy)
dictionary (htsjdk.samtools.reference.IndexedFastaSequenceFile)
sequenceFile (org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile)
val$taskReferenceSequenceFile (org.broadinstitute.hellbender.tools.HaplotypeCallerSpark$1)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:246)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$8.apply(TorrentBroadcast.scala:293)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1337)
at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:294)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:226)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
... 30 more
Caused by: java.lang.ClassNotFoundException: htsjdk.samtools.reference.AbstractFastaSequenceFile$$Lambda$85/2028177366
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
... 50 more
19/12/16 07:06:57 Thread-2 INFO ShutdownHookManager: Shutdown hook called


After adding htsjdk-2.14.0.jar to the Spark classpath, I received the error below.

[December 16, 2019 8:40:38 AM UTC] org.broadinstitute.hellbender.tools.HaplotypeCallerSpark done. Elapsed time: 0.15 minutes.
Runtime.totalMemory()=3104309248
Exception in thread "main" java.lang.NoSuchMethodError: htsjdk.samtools.util.IOUtil.isBlockCompressed(Ljava/nio/file/Path;Z)Z
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.checkFastaPath(CachingIndexedFastaSequenceFile.java:180)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:129)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:111)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.<init>(CachingIndexedFastaSequenceFile.java:96)
at org.broadinstitute.hellbender.tools.HaplotypeCallerSpark.assemblyRegionEvaluatorSupplierBroadcast(HaplotypeCallerSpark.java:229)
at org.broadinstitute.hellbender.engine.spark.AssemblyRegionWalkerSpark.getAssemblyRegions(AssemblyRegionWalkerSpark.java:130)
at org.broadinstitute.hellbender.engine.spark.AssemblyRegionWalkerSpark.runTool(AssemblyRegionWalkerSpark.java:138)
at org.broadinstitute.hellbender.tools.HaplotypeCallerSpark.runTool(HaplotypeCallerSpark.java:150)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:533)
at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:31)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
19/12/16 08:40:38 Thread-2 INFO ShutdownHookManager: Shutdown hook called

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)
    edited December 2019

    Hi @sjbosco

    Please post the exact commands and the version of GATK you are using.

  • sjbosco (India; Member)
    The command executed:

    ./gatk HaplotypeCallerSpark --input /data2/gatkBaseRecall/sorted.deduped.bqsr.bam -R /data2/gatk/gatk-4.1.4.1/Homo_sapiens_assembly19.fasta --emit-ref-confidence GVCF --output /data2/gatkBaseRecall/part/genome.vcf -- --spark-runner SPARK --spark-master spark://<myhost>:7077 --total-executor-cores 80 --executor-memory 10g

    Also, doesn't HaplotypeCallerSpark support the --dbsnp argument?
  • sjbosco (India; Member)
    hi @bhanuGandham

    Below are the details

    gatk-4.1.4.1
    Apache Spark-2.2.1
    hadoop-2.7.3
    java version "1.8.0_112"
    Ubuntu 16.04.4 LTS
  • LouisB (Broad Institute; Member, Broadie, Dev)

    Hi @sjbosco. This looks like a bug on our end, although I'm not sure why we don't see it in our tests. We test against Spark 2.4.3, so I wonder if there is an incompatibility there.

    You shouldn't need to specify a separate Htsjdk jar since the correct one is included in the gatk jar we distribute.
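    As a quick sanity check (not an official GATK procedure), you can ask the JVM which jar a class was actually loaded from; pointing it at htsjdk.samtools.util.IOUtil on the Spark driver/executor classpath would reveal which copy of htsjdk wins when duplicates are present. A minimal sketch, defaulting to java.lang.String so it runs anywhere:

```java
import java.security.CodeSource;

public class WhichJar {

    // Returns a human-readable description of where a class was loaded from.
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        return c.getName() + " loaded from "
                + (src == null ? "the bootstrap class loader" : src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Default to java.lang.String so the snippet runs anywhere; on a
        // GATK/Spark classpath you would pass htsjdk.samtools.util.IOUtil
        // (the class named in the NoSuchMethodError above) as the argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(locationOf(Class.forName(name)));
    }
}
```

    A mismatch between the jar reported here and the htsjdk version GATK was built against would explain a NoSuchMethodError like the one in the second trace.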

    I'll see if I can reproduce with a similar command line on my machine and get back to you.

  • sjbosco (India; Member)

    @LouisB Thanks for the update.

    I can share the list of existing jars in my current Spark classpath if it helps with debugging the incompatibility.

    Also, could you confirm that the --dbsnp argument is not supported in the Spark version?

    Thanks.

  • LouisB (Broad Institute; Member, Broadie, Dev)

    @sjbosco Sorry for the slow update; I'm traveling to see family and have ended up having less working time than I expected. I reproduced a similar error locally when I ran a command like yours. It's not the exact same failure, but it seems to be a related issue. I think the underlying problem is that we're accidentally trying to serialize something which is fundamentally not serializable. I haven't tracked down the cause yet, and I also don't yet understand why we don't see this issue when we run our tests.
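    For illustration only (this is not GATK's code): the failing field in the first trace is a lambda held by htsjdk's Lazy initializer. Java lambdas are not serializable unless their target type is explicitly Serializable, and their synthetic class names (like AbstractFastaSequenceFile$$Lambda$85/2028177366) are assigned per JVM, so a deserializing executor cannot resolve a name minted on the driver. A minimal, self-contained sketch of the serializability half of that problem:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Supplier;

public class LambdaSerialization {

    // Returns true if standard Java serialization accepts the object.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            // A plain lambda fails here with NotSerializableException.
            return false;
        }
    }

    public static void main(String[] args) {
        // A plain lambda: its synthetic class does not implement Serializable.
        Supplier<String> plain = () -> "hello";
        // The same lambda with an intersection cast forcing Serializable.
        Supplier<String> marked = (Supplier<String> & Serializable) () -> "hello";

        System.out.println(isSerializable(plain));  // false
        System.out.println(isSerializable(marked)); // true
    }
}
```

    Kryo has its own machinery for serializable lambdas, but the same constraint applies; the likely fix on the GATK side is to avoid broadcasting the object that captures the lambda in the first place.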

    I'll continue to look into it but I probably won't have much time until January.

  • sjbosco (India; Member)

    @LouisB Thanks for the update.

    I can certainly wait for the next update; I am also trying to take some time off for vacation.

    Wishing you a Merry Christmas and a Happy New Year.

  • LouisB (Broad Institute; Member, Broadie, Dev)

    I've opened a GitHub ticket, since this is definitely a bug on our end and not a problem with your configuration.

  • LouisB (Broad Institute; Member, Broadie, Dev)

    Thank you! Merry Christmas and New Year to you as well. I hope you get some time to relax without any spark bugs interrupting you!
