If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our forum members to be more involved: jump in and help out your fellow researchers with their questions. The GATK forum is a community forum, and helping each other with GATK tools and research is the cornerstone of our success as a genomics research community. We appreciate your help!

Discovering singletons with GenotypeGVCFs?


I ran HaplotypeCaller (in normal mode) on several samples, and I am looking to discover germline variants from them. I read that GenotypeGVCFs isn't good at discovering singletons, and it is likely that my samples contain many singletons. Does anyone have a solution to this? I was planning to run GenotypeGVCFs on each sample individually so that singletons are not lost.


  • bhanuGandham · Cambridge MA · Member, Administrator, Broadie, Moderator
    edited October 13


    The GATK support team will primarily focus on resolving questions about GATK tool errors or abnormal results from the tools. For all other questions, such as this one, we are building a backlog to work through when we have the capacity.

    Please continue to post your questions because we will be mining them for improvements to documentation, resources, and the tools.

    We cannot guarantee a reply; however, we ask other community members to help out if they know the answer.

    For more information:

  • gauthier · Member, Broadie, Dev ✭✭✭

    Where did you read that GenotypeGVCFs "isn't good with discovering singletons"? A computational experiment I did years ago shows that there is no loss of singleton sensitivity with increasing cohort size:

    Analysis should never be performed on ungenotyped GVCFs. They contain many low-quality variants that are likely false positives. GenotypeGVCFs removes these because, by default, it requires enough evidence that the chance of a false positive is less than 1/1000. (About one in every thousand bases in the human genome is variant, so we require more confidence than a 1/1000 chance of FP.) Running GenotypeGVCFs on a multi-sample GVCF will increase discovery power for variants with AC > 1, and variants with AC = 1 should be unaffected.
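
To make the 1/1000 figure concrete, here is a minimal sketch of how a false-positive probability maps onto the Phred scale that VCF quality scores use. The function names are hypothetical and purely illustrative; this is not GATK code, and it assumes only the standard Phred definition, Q = -10 * log10(P).

```python
import math


def error_prob_to_phred(p_error: float) -> float:
    """Convert the probability that a call is wrong into a Phred-scaled quality."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("p_error must be in the interval (0, 1]")
    return -10.0 * math.log10(p_error)


def phred_to_error_prob(qual: float) -> float:
    """Convert a Phred-scaled quality back into an error probability."""
    return 10.0 ** (-qual / 10.0)


if __name__ == "__main__":
    # Illustrative probabilities only; 1e-3 is the 1/1000 chance quoted above.
    for p in (1e-1, 1e-2, 1e-3):
        print(f"P(false positive) = {p:g} -> Phred {error_prob_to_phred(p):.1f}")
```

Under that definition, a 1/1000 chance of a false positive corresponds to a Phred-scaled confidence of about 30, which is the scale on which the QUAL column of a VCF is expressed.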
