Homo_sapiens_assembly38.fasta.gz.fai does NOT exist

Hi, I downloaded the GATK bundle folder from the GATK website. It includes the reference genome file Homo_sapiens_assembly38.fasta.gz and its index file Homo_sapiens_assembly38.fasta.fai.

Now when I run "gatk GenotypeGVCFs -R" with this reference, it reports: A USER ERROR has occurred: Fasta index file Homo_sapiens_assembly38.fasta.gz.fai for reference file Homo_sapiens_assembly38.fasta.gz does not exist.

This seems rather silly. In this case, should I rename Homo_sapiens_assembly38.fasta.fai to Homo_sapiens_assembly38.fasta.gz.fai, or should I unzip the Homo_sapiens_assembly38.fasta.gz file?

Best regards,
Jie
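
The two workarounds raised in the question can be sketched in shell roughly as follows. This is only an illustration, assuming the index sits in the same directory as the reference; the function names `link_fai_for_gz` and `decompress_reference` are hypothetical helpers, not part of GATK. The first uses a symlink instead of a rename so the original .fai name keeps working.

```shell
# Option 1: expose the existing .fai under the name the error message asks for.
# Hypothetical helper; assumes <ref>.fasta.fai sits next to <ref>.fasta.gz.
link_fai_for_gz() {
    ref_gz="$1"                     # e.g. Homo_sapiens_assembly38.fasta.gz
    plain_fai="${ref_gz%.gz}.fai"   # e.g. Homo_sapiens_assembly38.fasta.fai
    gz_fai="${ref_gz}.fai"          # e.g. Homo_sapiens_assembly38.fasta.gz.fai

    # Nothing to do if the expected index name already exists.
    if [ -e "$gz_fai" ]; then
        return 0
    fi
    if [ ! -e "$plain_fai" ]; then
        echo "no index found at $plain_fai" >&2
        return 1
    fi
    # Relative symlink within the same directory; the original file is untouched.
    ln -s "$(basename "$plain_fai")" "$gz_fai"
}

# Option 2: decompress the reference so the existing .fai name matches it.
# Writes <ref>.fasta next to <ref>.fasta.gz and keeps the compressed copy.
decompress_reference() {
    ref_gz="$1"
    gunzip -c "$ref_gz" > "${ref_gz%.gz}"
}
```

With the file names from this post, one would call either `link_fai_for_gz Homo_sapiens_assembly38.fasta.gz` or `decompress_reference Homo_sapiens_assembly38.fasta.gz` from the bundle directory, then point `-R` at the corresponding reference file.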

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)
    edited December 2019

    Hi @jiehuang001

    Both should work. The best way to find out is to try it :smile:

  • Yes, I tried both and found that both work. Of course, I would not gunzip a huge file unless I had to.

    What I am trying to say is: why do you provide a bundle of files that does not work out of the box? Instead, users run into an error message and then have to manually rename a file from .fai to .gz.fai. Shouldn't GATK be smart enough to know that abc.fasta.fai is the index for abc.fasta.gz?

    I really wish it had been less painful to get GATK working. I read about the $5 GATK pipeline and a bunch of other documents and blog posts. But then again, you provide a lot of complicated JSON and WDL files and cool ways to run things. Why can't you simply provide a few lines of commands, such as the ones below? To call short SNVs, which is what most users use GATK for, we really just need the following few commands, correct?
    1. gatk MarkDuplicates --blah --blah
    2. gatk BaseRecalibrator --blah --blah
    3. gatk ApplyBQSR --blah --blah
    4. gatk HaplotypeCaller --blah --blah
    5. gatk CombineGVCFs --blah --blah
    6. gatk GenotypeGVCFs --blah --blah
    7. gatk VariantRecalibrator --blah --blah
    8. gatk ApplyVQSR --blah --blah

    Sorry if I sound a little grumpy. I understand that you developed something really cool and vital to everybody, and that we did not pay a penny for it. Actually, I would be happy to pay something if that would reduce my pain...

    Best regards,
    Jie
