GenomicsDBImport file opening error

Dear GATK Team

I run this command for GenomicsDBImport:

```
{params.gatk4} \
GenomicsDBImport \
--sample-name-map {output.samp_map} \
--genomicsdb-workspace-path {output.db_dir} \
-L {params.chr} \
--reader-threads 5 \
-R {params.ref}
```
and parallelize it across 4 populations (10 samples each) and 29 intervals, so in total I run ~120 GenomicsDBImport jobs, with at most 50 jobs running at the same time. I got this error in some of the GenomicsDBImport jobs:

```
[TileDB::utils] Error: Cannot write to file '/cluster/work/variant_breeds/db_imp/FV/db_10_FV/10$1$103308737/.__18453f18-a485-4f32-8137-83c9622b62fa47772786001664_1550197922508/MQRankSum.tdb'; File opening error.
terminate called after throwing an instance of 'VariantStorageManagerException'
what(): VariantStorageManagerException exception : Error while writing to TileDB array
```

Is this error related to heavy file I/O from the program? What is the best way to deal with it? I am working on a Lustre filesystem.

I use GATK version gatk-4.0.6.0.

Regards, and many thanks for the help

Best Answer

  • AdelaideR admin
    Accepted Answer

    Hello @danangcrysnanto - There is a known issue with GenomicsDBImport and file mounts that are not mounted on NFS.

    Take a look at the GitHub issue here to see if there are any helpful suggestions on adjusting how the filesystem has been mounted.

    Also, please provide some error logs if you choose to post to the GitHub issue tracker; that will help the developers troubleshoot this error.

Answers

  • danangcrysnanto Edinburgh Member
    Hi @AdelaideR

    Many thanks for the reply. It is indeed a problem with running GenomicsDBImport on the Lustre filesystem: after I re-ran it using local scratch storage attached directly to the compute nodes, it ran without any problems.
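
The workaround described in the reply above can be sketched as a small shell wrapper: write the GenomicsDB workspace to node-local scratch, then copy the finished workspace to the shared filesystem. This is only a sketch under assumptions that are not from the thread: the function name `import_via_local_scratch`, the `GATK` and `LOCAL_SCRATCH` variables, and the use of `TMPDIR` as the node-local scratch location are all hypothetical and will differ between clusters. The GenomicsDBImport arguments are the ones from the original command.

```shell
#!/usr/bin/env bash
# Sketch: build the GenomicsDB workspace on node-local scratch, then copy it
# to the shared (e.g. Lustre) filesystem once the import has finished.
set -euo pipefail

# Assumptions (not from the thread): GATK is the gatk launcher, and
# LOCAL_SCRATCH is a directory on storage local to the compute node.
GATK="${GATK:-gatk}"
LOCAL_SCRATCH="${LOCAL_SCRATCH:-${TMPDIR:-/tmp}}"

# Hypothetical helper: import_via_local_scratch SAMPLE_MAP FINAL_DB INTERVAL REF
import_via_local_scratch() {
    local sample_map="$1" final_db="$2" interval="$3" ref="$4"
    local stage

    # Refuse to copy into an existing path, so cp cannot nest the workspace.
    if [ -e "$final_db" ]; then
        echo "error: $final_db already exists" >&2
        return 1
    fi

    # Private staging directory on local scratch.
    stage="$(mktemp -d "${LOCAL_SCRATCH}/gdb_import.XXXXXX")"

    # Let GenomicsDBImport create the workspace itself at a path that does
    # not exist yet; clean up the staging directory if the import fails.
    "$GATK" GenomicsDBImport \
        --sample-name-map "$sample_map" \
        --genomicsdb-workspace-path "${stage}/workspace" \
        -L "$interval" \
        --reader-threads 5 \
        -R "$ref" || { rm -rf "$stage"; return 1; }

    # Copy the finished workspace to the shared filesystem in one pass.
    mkdir -p "$(dirname "$final_db")"
    cp -r "${stage}/workspace" "$final_db"
    rm -rf "$stage"
}
```

In a workflow like the one in the question, each job would call the helper with that job's sample map, final database directory, interval, and reference. The copy at the end writes to the shared filesystem only once per job, rather than during the import itself.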