HaplotypeCaller says dict does not exist, but it does!

Jon (Harvard T.H. Chan School of Public Health; Member)

Hi All,

I am running HaplotypeCaller and getting the error:

ERROR MESSAGE: Fasta dict file /net/rcnfs02/srv/export/duraisingh_lab/share_root/data/Plasmodium_knowlesi/jva/PlasmoDB-26_PknowlesiH_Genome_02.dict for reference /net/rcnfs02/srv/export/duraisingh_lab/share_root/data/Plasmodium_knowlesi/jva/PlasmoDB-26_PknowlesiH_Genome_02.fasta does not exist

BUT... the dictionary DOES exist! I made it with CreateSequenceDictionary.jar and it looks OK.

The reference dict and fasta are symbolically linked into the working directory. I did some googling on this but had no luck.

Best,

Jon

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @Jon
    Hi Jon,

    Can you confirm you are using the latest version of GATK? I think this was a bug that has been fixed in the latest version.

    -Sheila

  • Jon (Harvard T.H. Chan School of Public Health; Member)

    I am using 3.4! I'll ask my support team to update to 3.5, or do a local install if they cannot do so quickly. I'll let you know how it goes.

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @Jon
    Hi Jon,

    Perhaps this is not the bug I was thinking of. Sorry for the confusion. I think the issue is that GATK does not recognize symbolic links. I will test this out and let you know for sure soon.

    -Sheila

  • I am having the same issue as Jon. How did you solve this error? All my fasta files are in the same folder, but HaplotypeCaller can't see them ("fasta file does not exist").

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator, admin)

    Hi @andreacontina

    What version of GATK are you using? Please post the exact command you are running and the full error message here. Also, please send me the output of `ls -l` for the directory containing the fasta file. This will help us figure out the issue.

    Regards
    Bhanu

  • Hi Bhanu,

    The versions that I am using are:

    GATK/3.6-Java-1.8.0_91
    GATK/3.8-0-Java-1.8.0_141

    All my fasta files are in the right work directory:

    -sh-4.2$ pwd
    /work/bunting2016/2018/dupfilter2018/bam2018

    -sh-4.2$ ls -l -t | head -30
    total 34256080
    -rw-r--r--. 1 bunting2016 pabugenome 3276 Oct 17 13:05 s8err18.txt
    -rw-r--r--. 1 bunting2016 pabugenome 0 Oct 17 13:05 s8re18.txt
    -rwxr-xr-x. 1 bunting2016 pabugenome 475 Oct 17 13:03 s8.txt
    -rw-r--r--. 1 bunting2016 pabugenome 81235 Oct 17 13:02 Step7err18.txt
    -rw-r--r--. 1 bunting2016 pabugenome 2761433 Oct 17 13:02 Geospiza_fortis.scaf.noBacterial.fa.fai
    drwxr-xr-x. 2 bunting2016 pabugenome 4096 Oct 17 13:01 SNPCall2018
    -rw-r--r--. 1 bunting2016 pabugenome 0 Oct 17 11:01 Step7res18.txt
    drwxr-xr-x. 2 bunting2016 pabugenome 4096 Oct 17 10:59 SNPCall
    -rw-r--r--. 1 bunting2016 pabugenome 12643452 Oct 17 10:56 Geospiza_fortis.scaf.noBacterial.fa.dict
    -rwxr-xr-x. 1 bunting2016 pabugenome 698 Oct 17 10:54 s7.txt
    -rw-r--r--. 1 bunting2016 pabugenome 1073918729 Oct 17 10:45 Geospiza_fortis.scaf.noBacterial.fa
    -rw-r--r--. 1 bunting2016 pabugenome 246877235 Oct 16 19:09 11S.bam
    -rw-r--r--. 1 bunting2016 pabugenome 205741148 Oct 16 18:15 11JJ.bam
    -rw-r--r--. 1 bunting2016 pabugenome 423225101 Oct 16 17:26 11HB.bam
    -rw-r--r--. 1 bunting2016 pabugenome 441363057 Oct 16 15:49 11BS.bam
    -rw-r--r--. 1 bunting2016 pabugenome 27287415 Oct 16 14:04 10SL.bam
    -rw-r--r--. 1 bunting2016 pabugenome 477708654 Oct 16 13:59 10S.bam
    -rw-r--r--. 1 bunting2016 pabugenome 305879430 Oct 16 12:08 10JJ.bam
    -rw-r--r--. 1 bunting2016 pabugenome 305017630 Oct 16 10:51 10HB.bam
    -rw-r--r--. 1 bunting2016 pabugenome 443188760 Oct 16 09:42 10DF.bam

    This is what I am running (also as a batch job):

    java -jar $EBROOTGATK/GenomeAnalysisTK.jar \
        -T HaplotypeCaller \
        -R Geospiza_fortis.scaf.noBacterial.fa \
        -I SNPCall2018/PABU_2018_mergedT.bam \
        -stand_call_conf 20.0 \
        -o SNPCall2018/PABU2018.vcf \
        --genotyping_mode DISCOVERY

    ERROR MESSAGE: Fasta dict file /work/bunting2016/2018/dupfilter2018/bam2018/Geospiza_fortis.scaf.noBacterial.dict for reference /work/bunting2016/2018/dupfilter2018/bam2018/Geospiza_fortis.scaf.noBacterial.fa does not exist.

    Please note: the BAM file "PABU_2018_mergedT.bam" is saved in a different folder (SNPCall2018), but I gave the path to it in my command ("-I SNPCall2018/PABU_2018_mergedT.bam"), and moving everything into the same folder does not help either.

    Any help? Thanks!

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator, admin)

    Hi @Ana_22

    GATK is looking for Geospiza_fortis.scaf.noBacterial.dict, whereas the file in your folder is Geospiza_fortis.scaf.noBacterial.fa.dict. Rename the file and run again. That should solve the issue.

    The naming convention for a fasta file and its corresponding dict file is reference.fa and reference.dict: the dict name replaces the fasta extension rather than being appended to it.

    Hope this helps.

    Regards
    Bhanu
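
The two checks discussed in this thread (the expected dict file name, and whether a symbolically linked file actually resolves) can be sketched in a few lines of shell. This is only an illustration under stated assumptions: `expected_dict_for` and `describe_path` are hypothetical helper names, not GATK commands, and the naming rule (replace the final extension of the reference with `.dict`) is inferred from the error messages and the answer above. It assumes an uncompressed reference such as `.fa` or `.fasta`.

```shell
#!/bin/sh
# Hypothetical helper (not part of GATK): print the dict name expected for a
# reference, assuming the final extension (.fa, .fasta) is replaced by .dict.
expected_dict_for() {
    printf '%s.dict\n' "${1%.*}"
}

# Hypothetical helper: classify a path using the standard shell tests
# -L (path is a symlink) and -e (path exists after following symlinks).
describe_path() {
    if [ -L "$1" ] && [ ! -e "$1" ]; then
        echo "broken-symlink"
    elif [ -L "$1" ]; then
        echo "symlink"
    elif [ -e "$1" ]; then
        echo "ok"
    else
        echo "missing"
    fi
}

# Reference name taken from the thread; adjust to your own file.
ref="Geospiza_fortis.scaf.noBacterial.fa"
dict="$(expected_dict_for "$ref")"
echo "expected dict: $dict ($(describe_path "$dict"))"

# If the dictionary was written as <reference>.fa.dict, rename it as suggested
# above. Nothing happens when that file is absent or the expected dict exists.
if [ -e "$ref.dict" ] && [ ! -e "$dict" ]; then
    mv "$ref.dict" "$dict"
fi
```

Run it from the directory that holds the reference. A result of `missing` or `broken-symlink` for the expected dict name points to a naming or linking problem rather than a GATK bug. The script does not show whether GATK itself follows symbolic links, which was left unconfirmed in this thread.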
