GenomicsDBImport file opening error

Dear GATK Team

I am running GenomicsDBImport with this command:

```
{params.gatk4} \
GenomicsDBImport \
--sample-name-map {output.samp_map} \
--genomicsdb-workspace-path {output.db_dir} \
-L {params.chr} \
--reader-threads 5 \
-R {params.ref}
```
I parallelize it across 4 populations (10 samples each) and 29 intervals, so in total I run ~120 GenomicsDBImport jobs, with at most 50 running at the same time. Some of the GenomicsDBImport jobs fail with this error:

```
[TileDB::utils] Error: Cannot write to file '/cluster/work/variant_breeds/db_imp/FV/db_10_FV/10$1$103308737/.__18453f18-a485-4f32-8137-83c9622b62fa47772786001664_1550197922508/MQRankSum.tdb'; File opening error.
terminate called after throwing an instance of 'VariantStorageManagerException'
what(): VariantStorageManagerException exception : Error while writing to TileDB array
```

Is this error related to heavy file I/O from the program? What is the best way to deal with it? I am working on a Lustre filesystem.

I use GATK version gatk-4.0.6.0.

Regards, and many thanks for the help

Best Answer

  • AdelaideR admin
    Accepted Answer

    Hello @danangcrysnanto - There is a known issue with GenomicsDBImport and file mounts that are not mounted on NFS.

    Take a look at the GitHub issue here to see if there are any helpful suggestions on adjusting how the filesystem has been mounted.

    Also, please provide some error logs if you choose to post to the GitHub issue tracker. That will help them troubleshoot this error.

Answers

  • danangcrysnanto Edinburgh Member
    Hi @AdelaideR

    Many thanks for the reply. It is indeed a problem with running GenomicsDBImport on the Lustre filesystem: after I re-ran the jobs using local scratch storage attached directly to the compute nodes, they completed without any problems.
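
The local-scratch workaround described in the reply above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual setup: the helper name `import_via_local_scratch` is hypothetical, and it assumes `$TMPDIR` points at node-local scratch on the compute node (falling back to `/tmp`). The GenomicsDBImport options in the example call are the ones from the original command; the variable names holding paths are placeholders.

```shell
# Hypothetical helper: run an import command against a workspace on node-local
# scratch, then copy the finished workspace to shared storage.
# Assumption: $TMPDIR points at node-local scratch (falls back to /tmp).
import_via_local_scratch() {
    final_dir=$1    # destination on the shared filesystem; must not exist yet
    shift           # remaining arguments: the import command without its workspace option

    if [ -e "$final_dir" ]; then
        echo "refusing to overwrite existing $final_dir" >&2
        return 1
    fi

    tmp_root=$(mktemp -d "${TMPDIR:-/tmp}/genomicsdb.XXXXXX") || return 1
    local_ws="$tmp_root/workspace"   # left non-existent so the tool creates it

    # Write the workspace locally, then copy it to shared storage in one pass.
    if "$@" --genomicsdb-workspace-path "$local_ws" \
        && mkdir -p "$(dirname "$final_dir")" \
        && cp -r "$local_ws" "$final_dir"; then
        status=0
    else
        status=1
    fi

    rm -rf "$tmp_root"   # always clean up the scratch copy
    return "$status"
}

# Example call (variable names are placeholders):
# import_via_local_scratch "$DB_DIR" gatk GenomicsDBImport \
#     --sample-name-map "$SAMP_MAP" -L "$CHR" --reader-threads 5 -R "$REF"
```

With this layout the many small TileDB writes go to local disk, and the shared filesystem only sees a single recursive copy per job.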