
Error in ReadsPipelineSpark version 4.1.4

Hi,
I just downloaded version 4.1.4 to test the performance of the ReadsPipelineSpark tool, and it is very good indeed.
However, at the end of the pipeline, when the tool concatenates all the VCF parts, it throws the following exception:

A USER ERROR has occurred: Couldn't write file hdfs://cloudera08/gatk-test2/WES2019-022_S4_out.vcf because writing failed with exception concat: target file /gatk-test2/WES2019-022_S4_out.vcf.parts/output is empty
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concatInternal(FSNamesystem.java:2303)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concatInt(FSNamesystem.java:2257)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concat(FSNamesystem.java:2219)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.concat(NameNodeRpcServer.java:829)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.concat(AuthorizationProviderProxyClientProtocol.java:285)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.concat(ClientNamenodeProtocolServerSideTranslatorPB.java:580)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)


org.broadinstitute.hellbender.exceptions.UserException$CouldNotCreateOutputFile: Couldn't write file hdfs://cloudera08/gatk-test2/WES2019-022_S4_out.vcf because writing failed with exception concat: target file /gatk-test2/WES2019-022_S4_out.vcf.parts/output is empty
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concatInternal(FSNamesystem.java:2303)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.concatInt(FSNamesystem.java:2257)
...
...

This is the command I used to run the job:
nohup /opt/gatk/gatk-4.1.4.0/gatk ReadsPipelineSpark \
    --spark-runner SPARK --spark-master yarn --spark-submit-command spark2-submit \
    -I hdfs://cloudera08/gatk-test2/WES2019-022_S4.bam \
    -O hdfs://cloudera08/gatk-test2/WES2019-022_S4_out.vcf \
    -R hdfs://cloudera08/gatk-test1/ucsc.hg19.fasta \
    --known-sites hdfs://cloudera08/gatk-test1/dbsnp_150_hg19.vcf.gz \
    --known-sites hdfs://cloudera08/gatk-test1/Mills_and_1000G_gold_standard.indels.hg19.vcf.gz \
    --align true --emit-ref-confidence GVCF --standard-min-confidence-threshold-for-calling 50.0 \
    --conf deploy-mode=cluster \
    --conf "spark.driver.memory=2g" \
    --conf "spark.executor.memory=18g" \
    --conf "spark.storage.memoryFraction=1" \
    --conf "spark.akka.frameSize=200" \
    --conf "spark.default.parallelism=100" \
    --conf "spark.core.connection.ack.wait.timeout=600" \
    --conf "spark.yarn.executor.memoryOverhead=4096" \
    --conf "spark.yarn.driver.memoryOverhead=400" \
    > WES2019-022_S4.out

-bash-4.1$ hdfs dfs -ls /gatk-test2/
Found 7 items
-rw-r--r-- 3 hdfs supergroup 39673964 2019-09-19 15:45 /gatk-test2/RefGene_exons.bed
-rw-r--r-- 3 hdfs supergroup 38516963 2019-09-19 15:45 /gatk-test2/RefGene_exons.interval_list
-rw-r--r-- 3 hdfs supergroup 13569684570 2019-10-02 11:49 /gatk-test2/WES2019-022_S4.bam
-rw-r--r-- 3 hdfs supergroup 16 2019-10-02 11:58 /gatk-test2/WES2019-022_S4.bam.bai
drwxr-xr-x - hdfs supergroup 0 2019-10-15 16:21 /gatk-test2/WES2019-022_S4_out.vcf.parts

-bash-4.1$ hdfs dfs -ls /gatk-test2/WES2019-022_S4_out.vcf.parts/
Found 105 items
-rw-r--r-- 3 hdfs supergroup 0 2019-10-15 16:21 /gatk-test2/WES2019-022_S4_out.vcf.parts/_SUCCESS
-rw-r--r-- 3 hdfs supergroup 10632 2019-10-15 16:21 /gatk-test2/WES2019-022_S4_out.vcf.parts/header
-rw-r--r-- 3 hdfs supergroup 0 2019-10-15 16:21 /gatk-test2/WES2019-022_S4_out.vcf.parts/output
-rw-r--r-- 3 hdfs supergroup 21498665 2019-10-15 14:43 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00000
-rw-r--r-- 3 hdfs supergroup 25489817 2019-10-15 15:10 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00001
-rw-r--r-- 3 hdfs supergroup 35599315 2019-10-15 14:44 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00002
-rw-r--r-- 3 hdfs supergroup 25185088 2019-10-15 14:41 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00003
-rw-r--r-- 3 hdfs supergroup 70456674 2019-10-15 14:43 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00004
-rw-r--r-- 3 hdfs supergroup 41305463 2019-10-15 14:52 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00005
...
...
-rw-r--r-- 3 hdfs supergroup 41022593 2019-10-15 16:08 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00097
-rw-r--r-- 3 hdfs supergroup 46040755 2019-10-15 16:03 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00098
-rw-r--r-- 3 hdfs supergroup 63441406 2019-10-15 15:57 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00099
-rw-r--r-- 3 hdfs supergroup 44377853 2019-10-15 15:55 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00100
-rw-r--r-- 3 hdfs supergroup 22847475 2019-10-15 16:21 /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-00101
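Since the header and all the part-r-* files look intact, only the final merge into the empty "output" placeholder seems to have failed. As a possible manual workaround (just my own guess, not an official GATK recipe, and assuming the header plus the parts concatenated in order form a valid plain-text GVCF), I could probably stitch them together with standard HDFS commands:

# Stream header + parts, in order, into a single VCF on HDFS.
# Assumption on my part: the parts are uncompressed text and the
# zero-padded part-r-* names expand in the correct order.
hdfs dfs -cat /gatk-test2/WES2019-022_S4_out.vcf.parts/header \
              /gatk-test2/WES2019-022_S4_out.vcf.parts/part-r-* \
    | hdfs dfs -put - /gatk-test2/WES2019-022_S4_out.vcf

Even if that works, though, the pipeline itself should of course produce the merged VCF, hence this report.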

It seems like a bug in the final merge step: all the per-partition VCF files are written, but the empty "output" placeholder is apparently passed as the target of the HDFS concat, which the NameNode refuses.
Could you please verify and let me know?

Thanks a lot.
Alessandro
