
Unsure if GenomicsDBImport worked properly

sp580 (Germany, Member)


I ran three instances of the tool (one per chromosome) and they all finished within 1.5 days. The log indicates that the import was successful, and since the tool is called GenomicsDBImport, it seems that this is all I need to know to be sure the execution completed properly. However, I am not sure my assumption is correct. Maybe I am being too paranoid, but I want to make sure. Here is the log of one of the runs:

Using GATK jar path/to/gatk-
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar path/to/gatk- GenomicsDBImport --sample-name-map path/to/sample_map_with_path.txt --genomicsdb-workspace-path db_10 -L 10
16:38:09.910 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/fb4/palma-vera/FBN_HOME/Tools/gatk-!/com/intel/gkl/native/libgkl_compression.so
16:38:35.128 INFO  GenomicsDBImport - ------------------------------------------------------------
16:38:35.128 INFO  GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.0.6.0
16:38:35.128 INFO  GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
16:38:35.131 INFO  GenomicsDBImport - Executing as [email protected] on Linux v4.4.143-94.47-default amd64
16:38:35.131 INFO  GenomicsDBImport - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_181-b13
16:38:35.132 INFO  GenomicsDBImport - Start Date/Time: 13. September 2018 16:38:09 MESZ
16:38:35.132 INFO  GenomicsDBImport - ------------------------------------------------------------
16:38:35.132 INFO  GenomicsDBImport - ------------------------------------------------------------
16:38:35.133 INFO  GenomicsDBImport - HTSJDK Version: 2.16.0
16:38:35.133 INFO  GenomicsDBImport - Picard Version: 2.18.7
16:38:35.133 INFO  GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:38:35.133 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:38:35.133 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:38:35.133 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:38:35.134 INFO  GenomicsDBImport - Deflater: IntelDeflater
16:38:35.134 INFO  GenomicsDBImport - Inflater: IntelInflater
16:38:35.134 INFO  GenomicsDBImport - GCS max retries/reopens: 20
16:38:35.134 INFO  GenomicsDBImport - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
16:38:35.134 INFO  GenomicsDBImport - Initializing engine
16:38:36.378 INFO  IntervalArgumentCollection - Processing 130694993 bp from intervals
16:38:36.384 INFO  GenomicsDBImport - Done initializing engine
Created workspace /projekte/I2-SOS-FERT/06_GenomicsDBImport/my_database/db_10
16:38:36.650 INFO  GenomicsDBImport - Vid Map JSON file will be written to db_10/vidmap.json
16:38:36.650 INFO  GenomicsDBImport - Callset Map JSON file will be written to db_10/callset.json
16:38:36.650 INFO  GenomicsDBImport - Complete VCF Header will be written to db_10/vcfheader.vcf
16:38:36.650 INFO  GenomicsDBImport - Importing to array - db_10/genomicsdb_array
16:38:36.650 INFO  ProgressMeter - Starting traversal
16:38:36.651 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Batches Processed   Batches/Minute
16:38:44.245 INFO  GenomicsDBImport - Importing batch 1 with 60 samples
03:40:16.466 INFO  ProgressMeter -                 10:1           2101.7                     1              0.0
03:40:16.467 INFO  GenomicsDBImport - Done importing batch 1/1
03:40:16.469 INFO  ProgressMeter -                 10:1           2101.7                     1              0.0
03:40:16.469 INFO  ProgressMeter - Traversal complete. Processed 1 total batches in 2101.7 minutes.
03:40:16.469 INFO  GenomicsDBImport - Import completed!
03:40:16.469 INFO  GenomicsDBImport - Shutting down engine
[15. September 2018 03:40:16 MESZ] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 2,102.11 minutes.
Tool returned:

One particular point I am unsure about is that the tool prints, for example, "Callset Map JSON file will be written to db_10/callset.json", but there is no follow-up message confirming that the file was actually written.
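A minimal sketch of how the announced outputs could be checked on disk, since the log never confirms them. The file names are the ones printed in the log above; the function name `check_workspace` is hypothetical, and treating `genomicsdb_array` as a non-empty directory is an assumption rather than something the log states. Passing this check only shows that the files exist and that the JSON parses, not that the import is complete.

```python
import json
from pathlib import Path


def check_workspace(workspace):
    """Report whether the outputs announced in the log exist in the workspace.

    Hypothetical helper: it only tests presence and basic readability.
    """
    ws = Path(workspace)
    report = {}

    # The two JSON files named in the log should exist and parse as JSON.
    for name in ("vidmap.json", "callset.json"):
        try:
            with (ws / name).open() as handle:
                json.load(handle)
            report[name] = "ok"
        except OSError:
            report[name] = "missing or unreadable"
        except json.JSONDecodeError:
            report[name] = "not valid JSON"

    # The combined VCF header should be a non-empty file.
    header = ws / "vcfheader.vcf"
    header_ok = header.is_file() and header.stat().st_size > 0
    report["vcfheader.vcf"] = "ok" if header_ok else "missing or empty"

    # Assumption: the array is stored as a directory with some content.
    array = ws / "genomicsdb_array"
    array_ok = array.is_dir() and any(array.iterdir())
    report["genomicsdb_array"] = "ok" if array_ok else "missing or empty"

    return report


if __name__ == "__main__":
    # "db_10" is the workspace path used in the command line above.
    for item, status in check_workspace("db_10").items():
        print(f"{item}: {status}")
```

Run from the directory that contains db_10, this prints one status line per expected output.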

Essentially my question is whether or not this output corresponds to a successful execution of GenomicsDBImport.
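The same question can be put as a small log check. This is a sketch only: the marker strings are copied from the log above, the function name `summarize_import_log` is hypothetical, and whether these markers are sufficient evidence of success is exactly what is being asked. The error scan is a crude heuristic that looks for the words ERROR or Exception.

```python
import re


def summarize_import_log(log_text):
    """Collect the completion markers that appear in a GenomicsDBImport log."""
    # Each match is a (batch_number, total_batches) pair of strings.
    batches = re.findall(r"Done importing batch (\d+)/(\d+)", log_text)
    return {
        "import_completed": "Import completed!" in log_text,
        # True only if the last reported batch is also the final one.
        "last_batch_is_final": bool(batches) and batches[-1][0] == batches[-1][1],
        # Heuristic: any line mentioning ERROR or an exception.
        "has_error_lines": bool(re.search(r"ERROR|Exception", log_text)),
    }


if __name__ == "__main__":
    # Excerpt of the log lines shown above.
    excerpt = (
        "16:38:44.245 INFO  GenomicsDBImport - Importing batch 1 with 60 samples\n"
        "03:40:16.467 INFO  GenomicsDBImport - Done importing batch 1/1\n"
        "03:40:16.469 INFO  GenomicsDBImport - Import completed!\n"
    )
    print(summarize_import_log(excerpt))
```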

Thanks in advance!

