Questions about the GATK hg38 resource bundle reference

jejacobs23 Portland, OR Member

I am using the GATK hg38 resource bundle reference in my workflow, and I was curious whether there is any documentation or literature on how the contigs were chosen. Specifically, where did all the decoy contigs come from? The second part of my question is how decoy contigs are handled in a variant calling workflow. Are they ignored by Mutect2, or do you have to filter them out?


  • AdelaideR Member admin

    Hi @jejacobs23

    A detailed description of what is in the resource bundle can be found here

    These resources are provided as a starting point, but it is up to each researcher to determine whether they are the best resources for their analysis.

  • SChaluvadi Member, Broadie, Moderator admin
    edited May 2019


    As for decoy contigs, you can use MateOnSameContigOrNoMappedMateReadFilter to filter out alternate contigs, should you choose to do so. Mutect2 disables this filter by default.

  • jejacobs23 Portland, OR Member

    Thank you @AdelaideR for your response

    I was more curious about how the decoy contigs were chosen. There are quite a lot of them in the GATK hg38 build compared to other versions of hg38 that I've looked at. I didn't know whether specific criteria were used to decide which decoy contigs to include.

  • jejacobs23 Portland, OR Member

    Thank you @SChaluvadi

    I'm concerned that if I use the MateOnSameContigOrNoMappedMateReadFilter, I will lose variants that are mapped to alternate contigs. After giving it some thought, I was considering using my interval list to exclude decoy contigs while retaining the other alternate contigs. Do you think that is an appropriate way of dealing with the issue?
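
    As a rough illustration of the interval-list idea, here is a minimal Python sketch. It assumes the decoy contigs can be recognized by a "_decoy" suffix in their names and that contig names and lengths come from a FASTA index (.fai) with the name in column 1 and the length in column 2. The function names, the sample lines, and the lengths and offsets in them are hypothetical placeholders, not values taken from the bundle. It writes BED rather than "contig:start-end" strings because some contig names (for example, the HLA ones) contain colons.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[str, int, int]  # (contig, 0-based start, end)


def non_decoy_intervals(fai_lines: Iterable[str],
                        decoy_suffix: str = "_decoy") -> List[Interval]:
    """Return one whole-contig interval per contig, skipping decoys.

    Assumption: decoy contigs are identified purely by name suffix.
    """
    intervals: List[Interval] = []
    for line in fai_lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        name, length = fields[0], int(fields[1])
        if name.endswith(decoy_suffix):
            continue  # drop decoy contigs, keep everything else
        intervals.append((name, 0, length))
    return intervals


def to_bed(intervals: Iterable[Interval]) -> str:
    """Format intervals as tab-separated BED lines."""
    return "".join(f"{name}\t{start}\t{end}\n" for name, start, end in intervals)


# Illustrative .fai-style lines; lengths and offsets are placeholders.
EXAMPLE_FAI = [
    "chr1\t1000\t6\t60\t61\n",
    "chr19_EXAMPLEv1_alt\t500\t1100\t60\t61\n",
    "chrUn_EXAMPLEv1_decoy\t200\t1700\t60\t61\n",
]

if __name__ == "__main__":
    print(to_bed(non_decoy_intervals(EXAMPLE_FAI)), end="")
```

    To run it on a real index, pass an open file handle for the reference's .fai in place of EXAMPLE_FAI. The resulting BED file could then be supplied as the interval list. Whether excluding decoys this way suits a given analysis is still a judgment call, since reads that map to decoys are simply left out of calling rather than remapped.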

