

"Failed to create reader" error in GenomicsDBImport

I ran GenomicsDBImport and got the error below. It may be worth noting that I'm running this through Nextflow (nextflow.io), since I didn't have this problem outside of Nextflow.

15:54:24.801 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/olavur/miniconda3/envs/exolink/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
15:54:26.895 INFO  GenomicsDBImport - ------------------------------------------------------------
15:54:26.896 INFO  GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.1.0.0
15:54:26.896 INFO  GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
15:54:26.897 INFO  GenomicsDBImport - Executing as [email protected] on Linux v4.4.0-101-generic amd64
15:54:26.897 INFO  GenomicsDBImport - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_192-b01
15:54:26.898 INFO  GenomicsDBImport - Start Date/Time: 02 May 2019 15:54:24 WEST
15:54:26.898 INFO  GenomicsDBImport - ------------------------------------------------------------
15:54:26.898 INFO  GenomicsDBImport - ------------------------------------------------------------
15:54:26.899 INFO  GenomicsDBImport - HTSJDK Version: 2.18.2
15:54:26.899 INFO  GenomicsDBImport - Picard Version: 2.18.25
15:54:26.899 INFO  GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:54:26.899 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:54:26.899 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:54:26.900 INFO  GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:54:26.900 INFO  GenomicsDBImport - Deflater: IntelDeflater
15:54:26.900 INFO  GenomicsDBImport - Inflater: IntelInflater
15:54:26.900 INFO  GenomicsDBImport - GCS max retries/reopens: 20
15:54:26.900 INFO  GenomicsDBImport - Requester pays: disabled
15:54:26.900 INFO  GenomicsDBImport - Initializing engine
15:54:27.029 INFO  GenomicsDBImport - Shutting down engine
[02 May 2019 15:54:27 WEST] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=205800865792
***********************************************************************

A USER ERROR has occurred: Failed to create reader from file://data/results/gvcf/FN000119.gvcf

***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Using GATK jar /home/olavur/miniconda3/envs/exolink/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx200g -Xms200g -jar /home/olavur/miniconda3/envs/exolink/share/gatk4-4.1.0.0-0/gatk-package-4.1.0.0-local.jar GenomicsDBImport -V data/results/gvcf/FN000119.gvcf -V data/results/gvcf/FN000103.gvcf -V data/results/gvcf/FN000105.gvcf -L resources/sureselect_human_all_exon_v6_utr_grch38/S07604624_Padded.bed --genomicsdb-workspace-path genomicsdb/run --merge-input-intervals --tmp-dir=tmp

The command I ran was:

export TILEDB_DISABLE_FILE_LOCKING=1
gatk GenomicsDBImport \
    -V data/results/gvcf/FN000119.gvcf \
    -V data/results/gvcf/FN000103.gvcf \
    -V data/results/gvcf/FN000105.gvcf \
    -L resources/sureselect_human_all_exon_v6_utr_grch38/S07604624_Padded.bed \
    --genomicsdb-workspace-path "genomicsdb/run" \
    --merge-input-intervals \
    --tmp-dir=tmp \
    --java-options "-Xmx200g -Xms200g"

The command that produced the GVCF in question is:

gatk HaplotypeCaller \
    -I recalibrated.bam \
    -O "FN000119.gvcf" \
    -R resources/reference_10x_Genomics/refdata-GRCh38-2.1.0/fasta/genome.fa \
    -L resources/sureselect_human_all_exon_v6_utr_grch38/S07604624_Padded.bed \
    --dbsnp resources/gatk_bundle/Homo_sapiens_assembly38.dbsnp138/Homo_sapiens_assembly38.dbsnp138.vcf \
    -ERC GVCF \
    --create-output-variant-index \
    --annotation MappingQualityRankSumTest \
    --annotation QualByDepth \
    --annotation ReadPosRankSumTest \
    --annotation RMSMappingQuality \
    --annotation FisherStrand \
    --annotation Coverage \
    --verbosity INFO \
    --tmp-dir=tmp \
    --java-options "-Xmx100g -Xms100g"

Answers

  • olavur Member

    By running a SelectVariants command on the same file, I figured out that the problem was that the GVCF was missing its index file (HaplotypeCaller had of course produced this file originally). It would be great if GenomicsDBImport could report that the index is missing, as SelectVariants does. In the meantime, when you get a "Failed to create reader" error, it's worth running some sort of sanity check like SelectVariants on the offending file yourself.
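    For what it's worth, that pre-flight check can be sketched in plain shell. The helper names here are mine, not part of GATK; the extension convention (uncompressed .vcf/.g.vcf pairs with .idx, bgzipped .gz pairs with .tbi) matches what the rest of this thread describes.

```shell
# expected_index: print the index path GATK expects to find next to a (G)VCF.
# Uncompressed .vcf/.g.vcf files pair with a .idx; bgzipped .gz files with a .tbi.
expected_index() {
  case "$1" in
    *.gz) echo "$1.tbi" ;;
    *)    echo "$1.idx" ;;
  esac
}

# check_inputs: verify every GVCF passed as an argument has its index on disk,
# so a missing index is caught before GenomicsDBImport starts.
check_inputs() {
  for gvcf in "$@"; do
    idx=$(expected_index "$gvcf")
    if [ ! -f "$idx" ]; then
      echo "missing index for $gvcf (expected $idx)" >&2
      return 1
    fi
  done
  echo "all indexes present"
}
```

    If an index turns out to be missing, GATK's IndexFeatureFile tool (or tabix, for bgzipped files) can regenerate it.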

  • evetc Member
    Hello Olavur,

    Thank you for your question and subsequent answer.
    I have received this same error, but I was wondering how to make an index for my GVCFs. Are the index files ".g.cf.gz.tai"? If so, then I have these, and it must be another problem causing this issue.

    Thanks!
  • olavur Member

    @evetc I think the correct extension for GVCFs is .g.vcf, and if you compress the GVCF then your extension is probably .g.vcf.gz. The way I've used GATK, HaplotypeCaller outputs an uncompressed GVCF and produces the index file itself, giving you a .g.vcf and a .g.vcf.idx file. When I compress a normal VCF file to .vcf.gz, I usually use tabix to index it, giving me a .vcf.gz.tbi. I don't know if that answers your question; my advice is to not compress the GVCF, and to check that HaplotypeCaller produces the .idx file.
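    Those conventions can be summarised as a small sketch. The function name is mine; the IndexFeatureFile flag shown is the GATK 4.1.x spelling (-F), which newer GATK releases replace with -I, so treat the exact command strings as assumptions to check against your installed version.

```shell
# choose_index_cmd: print (without running) the indexing command that matches
# the file's compression state, per the conventions described above.
choose_index_cmd() {
  case "$1" in
    *.vcf.gz) echo "tabix -p vcf $1" ;;             # bgzipped -> .tbi index
    *.vcf)    echo "gatk IndexFeatureFile -F $1" ;; # plain -> .idx index
    *)        echo "unrecognised extension: $1" >&2; return 1 ;;
  esac
}
```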

  • evetc Member
    Hello @olavur - thank you so much for your speedy response!

    I am very new to GATK so thank you for your help.
    OK, perhaps working on a compressed file is confusing matters. I will run through my test samples, make sure I am not creating .g.vcf.gz files from HaplotypeCaller, and see whether it creates the .idx file.

    Another question: GenomicsDBImport seems to work only on particular intervals. Do you know if you can simply run it on the whole genome, as the CombineGVCFs tool does?

    Thank you so much
  • olavur Member

    @evetc You can just supply the entire genome. I'm not sure of the exact syntax, but I believe it's something like one contig per -L argument (-L chr1 -L chr2 ...), or an intervals file.
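    For reference, GATK's -L argument can indeed be repeated, or it can point at a plain-text intervals file with one interval per line. A rough sketch that builds such a file for a GRCh38-style reference (the contig names are an assumption; they must match your reference's sequence dictionary):

```shell
# Build an intervals file covering the primary contigs, one per line.
# Intended use (a sketch, not a verified pipeline):
#   gatk GenomicsDBImport ... -L wgs.list --merge-input-intervals
# Repeating the flag also works: -L chr1 -L chr2 ...
: > wgs.list
for n in $(seq 1 22) X Y; do
  echo "chr$n" >> wgs.list
done
```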

  • evetc Member
    @olavur - I have just re-run HaplotypeCaller without compressing the output, and it's worked and has an index! :D I will now work on using the whole genome.

    Thank you so much!
  • LaboBernard Ri-MUHC Member

    Hi,
    I am having the same problem. I created the idx file using IGV and IndexFeatureFile but kept getting the same error. I don't know what to do anymore.

    Thank you

    java -jar gatk4/gatk-package-4.0.12.0-local.jar GenomicsDBImport --genomicsdb-workspace-path wes_raw/A-negative --intervals chr21 --batch-size 3 --sample-name-map A-negative/cohort_sample_map2.txt --reader-threads 5
    17:28:32.405 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/labobernard/gatk4/gatk-package-4.0.12.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    17:28:34.053 INFO GenomicsDBImport - ------------------------------------------------------------
    17:28:34.053 INFO GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.0.12.0
    17:28:34.053 INFO GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
    17:28:34.053 INFO GenomicsDBImport - Executing as [email protected] on Linux v4.15.0-54-generic amd64
    17:28:34.054 INFO GenomicsDBImport - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_201-b09
    17:28:34.054 INFO GenomicsDBImport - Start Date/Time: July 15, 2019 5:28:32 PM EDT
    17:28:34.054 INFO GenomicsDBImport - ------------------------------------------------------------
    17:28:34.054 INFO GenomicsDBImport - ------------------------------------------------------------
    17:28:34.055 INFO GenomicsDBImport - HTSJDK Version: 2.18.1
    17:28:34.055 INFO GenomicsDBImport - Picard Version: 2.18.16
    17:28:34.055 INFO GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    17:28:34.055 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    17:28:34.055 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    17:28:34.055 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    17:28:34.056 INFO GenomicsDBImport - Deflater: IntelDeflater
    17:28:34.056 INFO GenomicsDBImport - Inflater: IntelInflater
    17:28:34.056 INFO GenomicsDBImport - GCS max retries/reopens: 20
    17:28:34.056 INFO GenomicsDBImport - Requester pays: disabled
    17:28:34.056 INFO GenomicsDBImport - Initializing engine
    17:28:34.073 INFO GenomicsDBImport - Shutting down engine
    [July 15, 2019 5:28:34 PM EDT] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 0.03 minutes.
    Runtime.totalMemory()=364380160


    A USER ERROR has occurred: Failed to create reader from file:///ext.vcf


    Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
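    One thing worth checking in this situation: the --sample-name-map file is expected to be tab-separated, one "sample_name<TAB>path_to_gvcf" line per sample, and a malformed path in it can surface as exactly this kind of "Failed to create reader" URI. A rough sanity-check sketch (the function name is mine, not part of GATK) that reports map entries whose GVCF path does not exist on disk:

```shell
TAB="$(printf '\t')"

# check_sample_map: scan a GenomicsDBImport sample-name map and report any
# line whose GVCF path is missing on disk; succeed only if all paths exist.
check_sample_map() {
  bad=0
  while IFS="$TAB" read -r sample path; do
    [ -z "$sample" ] && continue
    if [ ! -f "$path" ]; then
      echo "bad path for sample $sample: $path" >&2
      bad=1
    fi
  done < "$1"
  [ "$bad" -eq 0 ] && echo "all sample-map paths exist"
}
```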
