
Mutect 2 "Cannot read non-existent {bam} file"

Tintest (France, Member)

Hello,

I built several pipelines using Singularity images and Nextflow with GATK tools.

I got a very silly error with Mutect2:

Running:
      java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /opt/conda/share/gatk4-4.0.9.0-0/gatk-package-4.0.9.0-local.jar Mutect2 -R /bettik/tintest/PROJECTS/Test_nextflow_OAR/REF/hg38/hg38.fasta -I S668_D4B_C000F4B_MD_BSQR2.bam -I S668_D4C_C000F4C_MD_BSQR2.bam -tumor S668_D4B -normal S668_D4C -L 1 -pon SPARK_GATK_pon.vcf.gz --germline-resource SPARK_GATK_gnomad_hg38.vcf.gz --af-of-alleles-not-in-resource 0.0000025 --disable-read-filter MateOnSameContigOrNoMappedMateReadFilter -O patient1_S668_D4B_S668_D4C_1.vcf.gz -bamout patient1_S668_D4B_S668_D4C_1.bam
  15:11:26.144 WARN  GATKReadFilterPluginDescriptor - Disabled filter (MateOnSameContigOrNoMappedMateReadFilter) is not enabled by this tool
  15:11:26.235 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/conda/share/gatk4-4.0.9.0-0/gatk-package-4.0.9.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
  15:11:28.747 INFO  Mutect2 - ------------------------------------------------------------
  15:11:28.747 INFO  Mutect2 - The Genome Analysis Toolkit (GATK) v4.0.9.0
  15:11:28.747 INFO  Mutect2 - For support and documentation go to https://software.broadinstitute.org/gatk/
  15:11:28.747 INFO  Mutect2 - Executing as [email protected] on Linux v4.9.0-8-amd64 amd64
  15:11:28.748 INFO  Mutect2 - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
  15:11:28.748 INFO  Mutect2 - Start Date/Time: November 6, 2018 3:11:26 PM UTC
  15:11:28.748 INFO  Mutect2 - ------------------------------------------------------------
  15:11:28.748 INFO  Mutect2 - ------------------------------------------------------------
  15:11:28.748 INFO  Mutect2 - HTSJDK Version: 2.16.1
  15:11:28.748 INFO  Mutect2 - Picard Version: 2.18.13
  15:11:28.749 INFO  Mutect2 - HTSJDK Defaults.COMPRESSION_LEVEL : 2
  15:11:28.749 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
  15:11:28.749 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
  15:11:28.749 INFO  Mutect2 - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
  15:11:28.749 INFO  Mutect2 - Deflater: IntelDeflater
  15:11:28.749 INFO  Mutect2 - Inflater: IntelInflater
  15:11:28.749 INFO  Mutect2 - GCS max retries/reopens: 20
  15:11:28.749 INFO  Mutect2 - Requester pays: disabled
  15:11:28.749 INFO  Mutect2 - Initializing engine
  15:11:28.835 INFO  Mutect2 - Shutting down engine
  [November 6, 2018 3:11:28 PM UTC] org.broadinstitute.hellbender.tools.walkers.mutect.Mutect2 done. Elapsed time: 0.04 minutes.
  Runtime.totalMemory()=1962934272
  ***********************************************************************

  A USER ERROR has occurred: Couldn't read file. Error was: S668_D4B_C000F4B_MD_BSQR2.bam with exception: Cannot read non-existent file: file://S668_D4B_C000F4B_MD_BSQR2.bam

  ***********************************************************************
  Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.

However, a symlink to my BAM file is present in my work directory:

ll /bettik/tintest/PROJECTS/Test_nextflow_OAR/work/5d/a31934453d8db9ebfffcc2809a8da4
total 16K
drwxr-xr-x 2 tintest l-iab   14 Nov  6 16:11 .
drwxr-xr-x 3 tintest l-iab    1 Nov  6 16:11 ..
-rw-r--r-- 1 tintest l-iab    0 Nov  6 16:11 .command.begin
-rw-r--r-- 1 tintest l-iab 3.2K Nov  6 16:11 .command.err
-rw-r--r-- 1 tintest l-iab    0 Nov  6 16:11 .command.out
-rwx------ 1 tintest l-iab 3.6K Nov  6 16:11 .command.run
-rw-r--r-- 1 tintest l-iab  485 Nov  6 16:11 .command.sh
-rw-r--r-- 1 tintest l-iab    1 Nov  6 16:11 .exitcode
-rw-r--r-- 1 tintest l-iab 3.2K Nov  6 16:11 OAR.nf-mutect2_1.8458699.stderr
-rw-r--r-- 1 tintest l-iab    0 Nov  6 16:11 OAR.nf-mutect2_1.8458699.stdout
lrwxrwxrwx 1 tintest l-iab   60 Nov  6 16:11 S668_D4B_C000F4B_MD_BSQR2.bam -> /bettik/tintest/SPARK/illumina/S668_D4B_C000F4B_MD_BSQR2.bam
lrwxrwxrwx 1 tintest l-iab   60 Nov  6 16:11 S668_D4C_C000F4C_MD_BSQR2.bam -> /bettik/tintest/SPARK/illumina/S668_D4C_C000F4C_MD_BSQR2.bam
lrwxrwxrwx 1 tintest l-iab   60 Nov  6 16:11 SPARK_GATK_gnomad_hg38.vcf.gz -> /bettik/tintest/SPARK/illumina/SPARK_GATK_gnomad_hg38.vcf.gz
lrwxrwxrwx 1 tintest l-iab   64 Nov  6 16:11 SPARK_GATK_gnomad_hg38.vcf.gz.tbi -> /bettik/tintest/SPARK/illumina/SPARK_GATK_gnomad_hg38.vcf.gz.tbi
lrwxrwxrwx 1 tintest l-iab   52 Nov  6 16:11 SPARK_GATK_pon.vcf.gz -> /bettik/tintest/SPARK/illumina/SPARK_GATK_pon.vcf.gz
lrwxrwxrwx 1 tintest l-iab   56 Nov  6 16:11 SPARK_GATK_pon.vcf.gz.tbi -> /bettik/tintest/SPARK/illumina/SPARK_GATK_pon.vcf.gz.tbi

I know the cluster I'm using has separate file systems for the front nodes and the archive nodes. All my data is on the archive nodes. This didn't cause any problems for a "standard germline pipeline" using tools like MarkDuplicates, BaseRecalibrator, HaplotypeCaller, etc.

Do you have any solution?
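
One way to narrow this down would be to check, from the same environment that runs Mutect2, whether each symlink in the work directory actually resolves to a readable file. The following is only a minimal sketch, assuming Python 3 is available there; the function name `find_unreadable_links` is hypothetical and is not part of GATK, Nextflow, or Singularity.

```python
import os


def find_unreadable_links(directory):
    """Return (link_name, target) pairs for symlinks whose target cannot be read.

    Hypothetical helper for illustration only.
    """
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.islink(path):
            continue  # only symlinks are of interest here
        target = os.readlink(path)
        # os.path.exists follows the link, so it is False when the target is
        # missing in this environment (for example, on a file system that is
        # not mounted here). os.access then checks read permission.
        if not os.path.exists(path) or not os.access(path, os.R_OK):
            broken.append((name, target))
    return broken


# Report any symlink in the current directory that cannot be read from here.
for link_name, link_target in find_unreadable_links("."):
    print(f"{link_name} -> {link_target}: target not readable from here")
```

Running this in the work directory once on the host and once inside the container (for example via `singularity exec`) would show whether the targets under /bettik/tintest/SPARK/illumina are visible in both places. If they turn out to be missing only inside the container, the bind paths would be one thing to look at, but that is an assumption and not something the log above confirms.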

Thank you.
