Running GenomicsDBImport: stuck on 'INFO GenomicsDBImport - Importing batch 1 with 62 samples'

Hi,

I am running GenomicsDBImport on 62 human exome samples, and the process gets stuck right at the start. Please see the log output below:

16:24:19.150 INFO  ProgressMeter - Starting traversal
16:24:19.150 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Batches Processed   Batches/Minute
16:24:19.378 INFO  GenomicsDBImport - Starting batch input file preload
16:24:36.900 INFO  GenomicsDBImport - Finished batch preload
16:24:36.900 INFO  GenomicsDBImport - Importing batch 1 with 62 samples

It has been stuck at this point for several hours.

Here is the command I used:

java -jar $GATK GenomicsDBImport -R $hg19 \
     --sampleNameMap sample.map \
     -L chr1 \
     --genomicsdb-workspace-path $output

I wonder if this is normal. Thank you for your help in advance!

Masaki

Answers

  • bhanuGandham · Cambridge MA · Member, Administrator, Broadie, Moderator

    Hi @masaki396

    Are you still facing this issue? If so, could you show me what your sample.map looks like? Please note: using VCFs with incompatible headers may result in silent data corruption.
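
For anyone who wants to sanity-check a map file before posting it, here is a minimal sketch. It assumes the usual layout of one sample per line: the sample name, a tab, then the path to that sample's GVCF. The function name `check_sample_map` is made up for illustration and is not part of GATK, and it only looks at the layout of the map, not at the VCF headers.

```python
from collections import Counter


def check_sample_map(text):
    """Return a list of problems found in sample-map text (empty if none).

    Assumed layout (an assumption, not checked against any GATK code):
    one sample per line, "<sample name><TAB><path to GVCF>".
    """
    problems = []
    names = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append(f"line {lineno}: blank line")
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            problems.append(
                f"line {lineno}: expected 2 tab-separated fields, "
                f"found {len(fields)}"
            )
            continue
        name, path = fields
        if not name.strip() or not path.strip():
            problems.append(f"line {lineno}: empty sample name or path")
            continue
        names.append(name)
    # Report each sample name that occurs more than once.
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"sample name {name!r} appears {count} times")
    return problems
```

To use it, read the map file into a string (for example `open("sample.map").read()`) and pass it in. An empty list means the layout looks as assumed above.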

  • wbsimey · California Academy of Sciences · Member

    I had the very same experience: the import sat at this step until I pressed ENTER, and then it completed instantly. I had been waiting for almost two weeks before I decided to try that. Now I press ENTER right away after launching so the input is already there, and this works for me.
    I am using v4.1.2.0 via conda. I have not tried with my Docker instance. I noticed the latest version is 4.1.3.0. I will update and try again.

    my command was:

    gatk --java-options "-Xmx200g -Xms200g" \
        GenomicsDBImport \
        --genomicsdb-workspace-path ../Tse_scaff_4_database \
        -L scaffold_4 \
        --sample-name-map ../gVCFs/scaffold_4_sample_names_map.txt \
        --tmp-dir=/data/tmp \
        --reader-threads 8
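
Since pressing ENTER released the run, one thing that may be worth ruling out is that something in the launch chain was waiting on standard input. That is only a guess from the behaviour described above, not a confirmed cause. A minimal sketch of launching a command with stdin attached to the null device, so nothing can block on the terminal, follows. `run_detached_stdin` is a hypothetical helper name, and you would pass your own gatk command as the argument list.

```python
import subprocess
import sys


def run_detached_stdin(cmd):
    """Run cmd (a list of arguments) with stdin connected to the null device.

    Any read from stdin in the child sees end-of-file immediately, so the
    child cannot sit waiting for a key press. Output is not captured; it
    goes to the same place as the parent's. Returns the child's exit code.
    """
    return subprocess.run(cmd, stdin=subprocess.DEVNULL).returncode


if __name__ == "__main__":
    # Harmless demonstration: the child exits 0 only if stdin is empty.
    child = "import sys; sys.exit(0 if sys.stdin.read() == '' else 3)"
    print(run_detached_stdin([sys.executable, "-c", child]))
```

If the run still stalls with stdin detached, the ENTER key was probably a coincidence and the cause lies elsewhere.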
    
  • wbsimeywbsimey California Academy of SciencesMember ✭✭

    I reran the same script as above with v4.1.3.0, but with a different interval, and it completed on its own without any user input.
