Rough file size for a BP_RESOLUTION GVCF on a whole genome

Jverlouw, Erasmus MC, Rotterdam (Member)

Hello,

Does anyone know a rough estimate of the file size of a GVCF produced at BP_RESOLUTION by the HaplotypeCaller for a whole genome sequencing experiment? Perhaps a rather simple question, but I cannot find it elsewhere on the forum or in other places like SEQanswers.

Thanks in advance,

Answers

  • tommycarstensen, United Kingdom (Member)
    edited July 2015

    Human? How many annotations? Compressed? I suggest you run it on a fragment that you know has an average SNP density and that is much larger than the metadata (header) lines, then multiply to extrapolate.

  • Sheila, Broad Institute (Member, Broadie admin)

    @Jverlouw
    Hi,

    Yes, as Tommy suggested, the best thing to do is to test a small portion. I tried 1,000,000 bases, and the BP_RESOLUTION file is 64 MB. The human genome is about 3.1 billion bases, so for the whole genome the BP_RESOLUTION file should be around 200 GB (roughly 0.2 TB).

    I hope I did the math right and that this makes sense! :smile:

    -Sheila

  • Jverlouw, Erasmus MC, Rotterdam (Member)

    @tommycarstensen:
    Should indeed have given that information! It would be a human genome, no annotations, no compression. Due to time and computing constraints we don't really have the time to test like that, but it is a very good idea!

    @Sheila:
    Many thanks! That is actually a lot smaller than we initially thought (we had made a safe bet of 1 TB).
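
For anyone wanting to redo the extrapolation with their own test fragment, here is a minimal Python sketch of the arithmetic discussed in this thread. It is not part of GATK; the function name `estimate_gvcf_bytes`, the `header_bytes` parameter, and the round figure of 3.1 billion bases for the human genome are assumptions made for illustration. It also assumes file size grows linearly with the number of bases, which is reasonable for BP_RESOLUTION (one record per site) but will shift with annotations and compression.

```python
# Assumed round figure for the human genome length; adjust for your reference build.
HUMAN_GENOME_BASES = 3_100_000_000


def estimate_gvcf_bytes(sample_bytes, sample_bases,
                        genome_bases=HUMAN_GENOME_BASES, header_bytes=0):
    """Linearly extrapolate a whole-genome GVCF size from a test fragment.

    sample_bytes: size of the GVCF produced for the test fragment
    sample_bases: number of bases covered by the test fragment
    header_bytes: size of the metadata (header) lines, counted only once
    """
    if sample_bases <= 0:
        raise ValueError("sample_bases must be positive")
    # Bytes per base, excluding the header so it is not multiplied up.
    bytes_per_base = (sample_bytes - header_bytes) / sample_bases
    return header_bytes + bytes_per_base * genome_bases


# Figures from this thread: 64 MB for 1,000,000 bases (header size ignored).
estimate = estimate_gvcf_bytes(64 * 10**6, 1_000_000)
print(f"{estimate / 10**9:.0f} GB")
```

With the numbers above this comes to roughly 200 GB for an uncompressed, unannotated human whole genome. Passing a non-zero `header_bytes` matters mostly when the test fragment is small relative to the header, which is why a larger fragment was suggested.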
