Force output of certain regions in GenotypeGVCFs

APredeus, Saint Petersburg, Russia (Member)

Hello all,

It is a known problem that when you get your calls as a VCF file from GenotypeGVCFs, some medically relevant variants may be missing, and it is impossible to tell whether a position is unreported because it is homozygous reference, or because of low coverage or another sequencing issue.

Is there an option to force certain regions to be output, e.g. from an external BED file? Would such an option be a useful addition to this tool?

Thank you in advance.

Answers

  • APredeus, Saint Petersburg, Russia (Member)

    Hello again,

    Just wanted to make sure the question is not overlooked. Any input would be appreciated, e.g. if you think this option would not be useful. I would also be happy to answer any questions.

  • Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie admin)

    Hi, this can already be done using a combination of -allSites and -L, which can take an interval list, BED, or VCF file.
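
As a rough illustration of the suggestion above, the following Python sketch assembles a GenotypeGVCFs command line that combines -allSites with -L. It assumes GATK 3 style invocation (`java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs`); the helper name `build_genotype_gvcfs_cmd` and all file names are hypothetical placeholders, and the sketch only builds the argument list without running anything.

```python
import shlex


def build_genotype_gvcfs_cmd(reference, gvcf, output, intervals=None,
                             all_sites=False, jar="GenomeAnalysisTK.jar"):
    """Assemble a GATK 3 style GenotypeGVCFs command as an argument list.

    Hypothetical helper for illustration; every path is a placeholder.
    """
    cmd = ["java", "-jar", jar, "-T", "GenotypeGVCFs",
           "-R", reference, "-V", gvcf, "-o", output]
    if all_sites:
        # Emit non-variant sites as well as variant sites.
        cmd.append("-allSites")
    if intervals:
        # Restrict processing to an interval list, BED, or VCF file.
        cmd += ["-L", intervals]
    return cmd


# Forced output restricted to the regions of interest.
forced_cmd = build_genotype_gvcfs_cmd(
    "reference.fasta", "sample.g.vcf", "forced_sites.vcf",
    intervals="regions_of_interest.bed", all_sites=True,
)
print(shlex.join(forced_cmd))
```

The resulting list could be passed to `subprocess.run` once the placeholder paths point at real files.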

  • APredeus, Saint Petersburg, Russia (Member)

    Dear Geraldine, thank you for your answer, and sorry for getting back to you so late.

    The options you suggest would result in the forced output of the interval list variants ONLY.

    What I was looking for, however, is a normal filtered VCF PLUS the forced regions of interest.

    I understand that this could be done by running the program twice and then merging the VCF files, but a single option would be more streamlined. We would consider submitting a pull request to the development version if that is possible.

    Thank you in advance.
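
The two-pass workaround described in this post (one normal run, one forced run over the regions of interest, then merging the two VCF files) could be sketched roughly as below. This is a minimal illustration, not a GATK feature: the function names are hypothetical, records are keyed by (CHROM, POS) only, contigs are sorted as plain strings rather than in reference order, and the header of the normal VCF is reused as-is. In practice a dedicated VCF tool would be the safer choice.

```python
def parse_records(vcf_text):
    """Split VCF text into header lines and records keyed by (CHROM, POS)."""
    header, records = [], {}
    for line in vcf_text.splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            header.append(line)
            continue
        fields = line.split("\t")
        records[(fields[0], int(fields[1]))] = line
    return header, records


def merge_forced_sites(variant_vcf, forced_vcf):
    """Keep every record of the normal VCF and add forced sites it lacks."""
    header, merged = parse_records(variant_vcf)
    _, forced = parse_records(forced_vcf)
    for key, line in forced.items():
        # A record from the normal run wins over the forced-run record.
        merged.setdefault(key, line)
    body = [merged[key] for key in sorted(merged)]
    return "\n".join(header + body) + "\n"


# Tiny made-up inputs: one called variant, plus one forced reference site.
HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
variant_vcf = HEADER + "chr1\t100\t.\tA\tG\t50\tPASS\t.\n"
forced_vcf = HEADER + "chr1\t100\t.\tA\tG\t50\t.\t.\nchr1\t200\t.\tC\t.\t30\t.\t.\n"

merged_vcf = merge_forced_sites(variant_vcf, forced_vcf)
print(merged_vcf, end="")
```

Giving the normal run priority at overlapping positions keeps its filter annotations, while positions that appear only in the forced run show up as explicit reference or low-confidence records.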

  • Sheila, Broad Institute (Member, Broadie admin)

    @APredeus
    Hi,

    I think using -allSites as Geraldine suggested above will give you what you want.

    -Sheila
