GenotypeGVCFs: --includeNonVariantSites disappeared?

Testor (Germany; Member)

Hi,
I wanted to use the GenotypeGVCFs tool to genotype some GVCFs at known variant sites, and I am also interested in whether my samples are reference at these positions or whether the sites are not covered. The old GATK 3.7 version had the option --includeNonVariantSites, which is not supported by GATK4... Do you have some hints or a workaround? Currently, I'm rolling back to v3.7, which might lead to difficulties later.
Thanks for your help
Stefan

Answers

  • shlee (Cambridge; Member, Broadie)

    Hi @Testor,

    "The old GATK 3.7 version had the option --includeNonVariantSites, which is not supported by GATK4... Do you have some hints or a workaround?"

    HaplotypeCaller itself has the -ERC BP_RESOLUTION mode, which emits a record for every site. You can feed such results into GenotypeGVCFs but, as you say, I believe a site will not get a record in the output unless at least one sample is variant there.

    One solution, which I learned about from others on the team and tested out yesterday, is to check for specific alleles at specific sites with --genotyping-mode GENOTYPE_GIVEN_ALLELES. You will need to provide the alleles of interest to HaplotypeCaller with --alleles xyz.vcf. This could suit your need to check for ref alleles at particular sites in particular samples.

  • sryan6 (University of Notre Dame; Member)
    edited May 2018

    Hi,

    Is there another way (other than --includeNonVariantSites, which is no longer supported) to output invariant sites using the newest version of GATK's GenotypeGVCFs? I want all of them, so using the solution proposed by shlee would be too cumbersome.

    Well, technically I have RAD data, so I really only want invariant/variant sites for the regions where my reads mapped to the reference, but I am unaware of such an option. Thus I figured I would just call all sites and filter out those with no coverage downstream.

    Thanks,
    Sean

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @sryan6
    Hi Sean,

    Unfortunately, there is no other way I can think of. But it looks like the priority for this to be "fixed" is higher now that users have posted on the issue ticket. Perhaps adding your vote as well will help it get into the code faster :)

    -Sheila
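
The downstream filtering Sean describes (call all sites, then drop those with no coverage) could look roughly like the sketch below. It is only an illustration under stated assumptions, not a GATK feature: it assumes an uncompressed, tab-delimited VCF whose FORMAT column carries GT and DP, and it treats a site as covered when at least one sample has a called genotype and a depth of at least min_dp. How uncovered sites are actually represented (for example ./. versus DP of 0) may depend on the GATK version, so the check looks at both. The function names has_coverage and filter_covered are hypothetical.

```python
def has_coverage(vcf_line, min_dp=1):
    """Return True if any sample has a called genotype with DP >= min_dp.

    Assumes a tab-delimited VCF data line with FORMAT and sample columns.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT/sample columns to inspect
    keys = fields[8].split(":")
    for sample in fields[9:]:
        # Trailing FORMAT values may be omitted; zip stops at the shorter list.
        values = dict(zip(keys, sample.split(":")))
        gt = values.get("GT", ".")
        dp = values.get("DP", ".")
        called = "." not in gt  # "./." or "." means no call
        if called and dp.isdigit() and int(dp) >= min_dp:
            return True
    return False


def filter_covered(lines, min_dp=1):
    """Yield header lines unchanged and only those data lines with coverage."""
    for line in lines:
        if line.startswith("#") or has_coverage(line, min_dp):
            yield line


# Hypothetical example records (two samples each).
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2"
covered = "chr1\t100\t.\tA\t.\t.\t.\tDP=12\tGT:DP\t0/0:12\t./.:0"
uncovered = "chr1\t101\t.\tC\t.\t.\t.\tDP=0\tGT:DP\t./.:0\t0/0:0"

kept = list(filter_covered([header, covered, uncovered]))
```

For RAD data this keeps only sites where at least one sample has reads supporting a call; raising min_dp tightens the threshold.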
