
What are hist.bin files?

I have a PoN (panel of normals) file with a "hist.bin" suffix. I cannot find any information on which tool created this file, or on whether the data can be converted back to VCF format so that I could use it with GATK4 Mutect2.

I understand this file can be used with the maf_pon_filter WDL, but that is not part of the Mutect2 somatic variant calling workflow that I need to use. Can you help me out?


  • SChaluvadi Member, Broadie, Moderator admin

    @registered_user We are taking a look and will get back to you with options to take care of this!

  • SChaluvadi Member, Broadie, Moderator admin

    The "hist.bin" suffix looks like the required suffix for the PoN file to be used with the maf_pon_filter method. This can be seen in the Description of the method.

    This filtering step, however, comes after the somatic variant calling step. If I am not mistaken, you would like to run the Mutect2 somatic variant calling workflow. For this, the Broad Best Practices recommend that you create your own Panel of Normals, because it should be built from normals that are technically similar to your tumor samples and sequenced on the same platform. There are more details on the best practices for Broad Mutation Calling as well as on generating your PoN here. Additionally, here is a tutorial on somatic mutation calling with GATK4 Mutect2 that you might find useful when running your Mutect2 workflow.

    Once you have used Mutect2 in tumor-only mode to generate your somatic PoN, you can run Mutect2 in somatic mode, supplying the PoN you generated as a filter, to get a raw unfiltered somatic callset. (More details on the PoN are also available in the tutorial I linked above.) I believe it is at this point that you can perform filtering with the .hist.bin PoN you have, using the maf_pon_filter method.
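As a rough illustration of the steps described above, here is a minimal Python sketch that only assembles the three command lines and does not run anything. The flags (`-tumor`, `-normal`, `-vcfs`, `-pon`) are written as I recall them from GATK 4.0.x-era documentation and may differ in your release, so please check them against `gatk Mutect2 --help` and `gatk CreateSomaticPanelOfNormals --help`. All file names and sample names are hypothetical placeholders, and the final maf_pon_filter step with the .hist.bin file is not covered.

```python
import shlex
from typing import List


def tumor_only_cmd(ref: str, normal_bam: str, normal_sample: str, out_vcf: str) -> List[str]:
    """Step 1: call each normal on its own in tumor-only mode (one run per normal)."""
    # Assumption: GATK 4.0.x-style flags; later releases changed some of them.
    return ["gatk", "Mutect2", "-R", ref, "-I", normal_bam,
            "-tumor", normal_sample, "-O", out_vcf]


def create_pon_cmd(normal_vcfs: List[str], out_pon: str) -> List[str]:
    """Step 2: combine the per-normal callsets into a panel of normals VCF."""
    cmd = ["gatk", "CreateSomaticPanelOfNormals"]
    for vcf in normal_vcfs:
        cmd += ["-vcfs", vcf]  # assumption: 4.0.x takes one -vcfs per input
    return cmd + ["-O", out_pon]


def somatic_cmd(ref: str, tumor_bam: str, tumor_sample: str,
                normal_bam: str, normal_sample: str,
                pon_vcf: str, out_vcf: str) -> List[str]:
    """Step 3: tumor-normal somatic calling, with the PoN supplied as a filter."""
    return ["gatk", "Mutect2", "-R", ref,
            "-I", tumor_bam, "-tumor", tumor_sample,
            "-I", normal_bam, "-normal", normal_sample,
            "-pon", pon_vcf, "-O", out_vcf]


def render(cmd: List[str]) -> str:
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    normals = {"normal1": "normal1.bam", "normal2": "normal2.bam"}
    per_normal_vcfs = []
    for sample, bam in normals.items():
        vcf = sample + ".vcf.gz"
        per_normal_vcfs.append(vcf)
        print(render(tumor_only_cmd("ref.fasta", bam, sample, vcf)))
    print(render(create_pon_cmd(per_normal_vcfs, "pon.vcf.gz")))
    print(render(somatic_cmd("ref.fasta", "tumor.bam", "tumor1",
                             "matched_normal.bam", "normal1",
                             "pon.vcf.gz", "somatic_unfiltered.vcf.gz")))
```

Building the commands as argument lists keeps them easy to inspect before handing them to a shell or a workflow runner.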

    I hope this points you in the right direction, but please feel free to reply with any clarifications or follow-up questions.

  • @SChaluvadi thank you for the reply. My problem is that I need to use the same PoN that was used in a previous project, but I only have access to this "hist.bin" token PoN file. I'm familiar with the Mutect2 somatic workflow, but I see no way of using this PoN data with it. I was hoping there would be some way to convert the data to another format that I could work with, but it is starting to seem like that is not possible.

  • bshifaw Member, Broadie, Moderator admin
    edited November 2018

    Hi @registered_user ,

    You may want to try posting your question regarding the conversion on biostars and seqanswers.

  • shlee Cambridge Member, Broadie ✭✭✭✭✭

    Hi @registered_user,

    The method here that you refer to, the maf_pon_filter method, appears to be a Broad Cancer group method. Can you tell us whether you were given the "hist.bin token PoN" file from a collaborator or gained access to the file through a shared or public FireCloud workspace? It would be helpful to know the origin of the file to trace back who to ask about it. Thanks.

  • shlee Cambridge Member, Broadie ✭✭✭✭✭

    @registered_user, I will direct-message you. Let us consider this thread closed as CGA methods are out of scope for the GATK team.

  • SChaluvadi Member, Broadie, Moderator admin

    @registered_user As shlee has mentioned, we will now be closing this thread but if you have any other questions for the GATK team, you can always reply back!
