
How do I reach my data when running GATK in docker?

Hi! I would like to try out GATK, but I am new to Linux; I normally use Windows. I installed Ubuntu 18.04.3 LTS in Oracle VirtualBox 6.0 on a laptop running Windows 10. I followed the instructions at https://software.broadinstitute.org/gatk/documentation/article?id=11090 to install Docker and download the GATK container image. When I try to run the FastqToSam tool, I get the following:

[email protected]:~$ sudo docker run -v ~/home/lmi/NGS:/gatk/my_data -it broadinstitute/gatk:
[sudo] password for lmi:
(gatk) [email protected]:/gatk# gatk FastqToSam -F1 F1.fastq -F2 F2.fastq -O uBAM.bam -SM sample001 -RG rg0013
Using GATK jar /gatk/gatk-package-
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package- FastqToSam -F1 F1.fastq -F2 F2.fastq -O uBAM.bam -SM sample001 -RG rg0013
09:40:34.696 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/gatk/gatk-package-!/com/intel/gkl/native/libgkl_compression.so
[Thu Oct 10 09:40:34 UTC 2019] FastqToSam --FASTQ F1.fastq --FASTQ2 F2.fastq --OUTPUT uBAM.bam --READ_GROUP_NAME rg0013 --SAMPLE_NAME sample001 --USE_SEQUENTIAL_FASTQS false --SORT_ORDER queryname --MIN_Q 0 --MAX_Q 93 --STRIP_UNPAIRED_MATE_NUMBER false --ALLOW_AND_IGNORE_EMPTY_LINES false --VERBOSITY INFO --QUIET false --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 2 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX false --CREATE_MD5_FILE false --GA4GH_CLIENT_SECRETS client_secrets.json --help false --version false --showHidden false --USE_JDK_DEFLATER false --USE_JDK_INFLATER false
Oct 10, 2019 9:40:41 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
[Thu Oct 10 09:40:41 UTC 2019] Executing as [email protected] on Linux 5.0.0-31-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_191-8u191-b12-0ubuntu0.16.04.1-b12; Deflater: Intel; Inflater: Intel; Provider GCS is available; Picard version: Version:
[Thu Oct 10 09:40:41 UTC 2019] picard.sam.FastqToSam done. Elapsed time: 0.12 minutes.
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
htsjdk.samtools.SAMException: Cannot read non-existent file: file:///gatk/F1.fastq
at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:483)
at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:470)
at picard.sam.FastqToSam.doWork(FastqToSam.java:312)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:25)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:162)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:205)
at org.broadinstitute.hellbender.Main.main(Main.java:291)
(gatk) [email protected]:/gatk#

If I understand the "Cannot read non-existent file" error correctly, GATK can't find the input FASTQ files that I have in my NGS directory. I would like my working directory to be /home/lmi/NGS. How should I start GATK to resolve this?

Thank you!
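For anyone who lands here with the same error, two things in the transcript above look like probable causes. This is a sketch under assumptions, not a confirmed answer. First, the shell expands `~/home/lmi/NGS` to `$HOME/home/lmi/NGS` (most likely /home/lmi/home/lmi/NGS), which is probably not where the FASTQ files are, and Docker's `-v` option creates a missing host directory as an empty one. Second, the tool was started in /gatk, so the relative name F1.fastq resolved to /gatk/F1.fastq rather than a file under the mount point /gatk/my_data. The snippet below builds a `docker run` command with an absolute host path and uses `-w` to set the working directory inside the container. The variable names are hypothetical, and the image tag is left off because it is cut off in the post; append the tag you actually pulled.

```shell
#!/bin/sh
# Host directory holding the FASTQ files (path taken from the post).
HOST_DATA="/home/lmi/NGS"
# Mount point inside the container (same one used in the post).
CONTAINER_DATA="/gatk/my_data"
# Image name only; the tag is elided in the post, so add your own (assumption).
GATK_IMAGE="broadinstitute/gatk"

# Show what the original mount source really pointed at:
# "~/" expands to "$HOME/", so this is NOT /home/lmi/NGS.
echo ~/home/lmi/NGS

# -v HOST:CONTAINER bind-mounts the host directory into the container.
# -w sets the working directory inside the container, so relative
#    names such as F1.fastq are looked up under the mounted directory.
DOCKER_CMD="sudo docker run -v ${HOST_DATA}:${CONTAINER_DATA} -w ${CONTAINER_DATA} -it ${GATK_IMAGE}"

# Print the command for review; run it once the paths look right.
echo "$DOCKER_CMD"
```

After starting the container with the printed command, `ls` inside it should list F1.fastq and F2.fastq, and the FastqToSam command from the post can then be run unchanged. Output written to a relative path such as uBAM.bam would also land in the mounted directory, so it remains on the host after the container exits.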
