
Confusion over newer Mutect2 tutorial

sabaferdous Cancer Research UK Member

Dear Mutect team,

The page at this link states at the top that the tutorial has been deprecated: https://software.broadinstitute.org/gatk/documentation/article?id=11136

Meanwhile, the newer documentation for version 4.1.2.0 (https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_hellbender_tools_walkers_mutect_Mutect2.php) does not recommend the GetPileupSummaries and CalculateContamination steps. This gives the impression that these steps are not needed in version 4.1.2.0. But when I proceed to FilterMutectCalls (4.1.2.0), it still requires a contamination table.

Can you please clarify whether we still need to perform Step 3 from the old tutorial (4.1.0.0) when using 4.1.2.0?

Best regards,
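
The steps the question refers to can be sketched as a small shell function. This is a sketch under assumptions, not the tutorial's exact commands: the function name `estimate_and_filter` and all file names are hypothetical, and passing `-R` to FilterMutectCalls assumes the 4.1.x releases expect a reference there. GetPileupSummaries is invoked the same way as in the command quoted later in this thread.

```shell
#!/usr/bin/env bash
# Sketch: estimate contamination for a tumor sample and pass the estimate
# to FilterMutectCalls. All file names are placeholders.

estimate_and_filter() {
    local tumor_bam=$1       # tumor BAM file
    local common_vcf=$2      # common biallelic germline resource with AF in INFO
    local reference=$3       # reference FASTA (assumed to be needed by FilterMutectCalls)
    local unfiltered_vcf=$4  # unfiltered Mutect2 calls
    local out_prefix=$5      # prefix for all output files

    # Tabulate read counts at the sites listed in the germline resource.
    gatk GetPileupSummaries \
        -I "$tumor_bam" \
        -V "$common_vcf" \
        -L "$common_vcf" \
        -O "${out_prefix}.pileups.table" || return 1

    # Estimate the contamination fraction from the pileup table.
    gatk CalculateContamination \
        -I "${out_prefix}.pileups.table" \
        -O "${out_prefix}.contamination.table" || return 1

    # Filter the Mutect2 calls, supplying the contamination estimate.
    gatk FilterMutectCalls \
        -R "$reference" \
        -V "$unfiltered_vcf" \
        --contamination-table "${out_prefix}.contamination.table" \
        -O "${out_prefix}.filtered.vcf.gz"
}
```

A call such as `estimate_and_filter tumor.bam common_biallelic.vcf.gz reference.fasta unfiltered.vcf.gz sample` would write `sample.filtered.vcf.gz`. The sketch leaves out the optional matched-normal input to CalculateContamination.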

Best Answer

Answers

  • bhaas Broad Institute Member, Broadie

    Great! Why not update the original tutorial page? Having to traverse the gatk documentation labyrinth to get a clear comprehensive picture is a bit of a challenge.

  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin

    Hi @bhaas

    We are looking into this and will fix it shortly.

  • Rishabh India Member
    Hello @bhanuGandham,
    I am working on hg19 sample data. Can you tell me which file I should pass to the -V option? Should I generate a vcf.gz from multiple BAM files with Mutect2, or should I use an annotated VCF file for hg19 from online servers?
    Thank you.
  • bshifaw Member, Broadie, Moderator admin

    @Rishabh ,

    There doesn't seem to be a -V option for Mutect2. What tool are you referring to? It might help to review the documentation for the tool you are trying to use. Also review the new tutorial for the Mutect2 workflow: (How to) Call somatic mutations using GATK4 Mutect2.

  • Rishabh India Member
    @bshifaw,
    Thank you for your response. I am using the latest manual only.
    Below is the step where I am unsure which file to pass to the -V option, since the sample data I am working on is hg19.
    gatk GetPileupSummaries \
    -I tumor.bam \
    -V chr17_small_exac_common_3_grch38.vcf.gz \
    -L chr17_small_exac_common_3_grch38.vcf.gz \
    -O getpileupsummaries.table

    Thank you.
  • bshifaw Member, Broadie, Moderator admin

    -V is reserved for "A VCF file containing variants and allele frequencies".

    The deprecated document has a somewhat more descriptive explanation:

    Use a population germline resource containing only common biallelic variants, e.g. subset by using SelectVariants --restrict-alleles-to BIALLELIC, as well as population AF allele frequencies in the INFO field. The tool tabulates read counts that support reference, alternate and other alleles for the sites in the resource.

    You should be able to use small_exac_common_3.vcf and small_exac_common_3.vcf.idx located in the following google bucket: https://console.cloud.google.com/storage/browser/gatk-best-practices/somatic-b37/

  • Rishabh India Member
    Thank you.
    Can you please tell me which input file you used to generate this VCF file?
  • bshifaw Member, Broadie, Moderator admin

    @Rishabh

    Review the tutorial Call somatic mutations using GATK4 Mutect2 (Deprecated). It answers many of your questions, for example:

    [4] The WDL script mutect_resources.wdl takes a large gnomAD VCF or other typical cohort VCF and from it prepares both a simplified germline resource for use in section 1 and a common biallelic variants resource for use in section 3. The script first generates a sites-only VCF and in the process removes all extraneous annotations except for AF allele frequencies. We recommend this simplification as the unburdened VCF allows Mutect2 to run much more efficiently. To generate the common biallelic variants resource, the script then selects the biallelic sites from the sites-only VCF.

  • Rishabh India Member
    @bshifaw
    Thank you for your reply.
    I looked through the document Call somatic mutations using GATK4 Mutect2 (Deprecated) but didn't find any command for generating this file.
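
The resource preparation described in the quoted footnote can be approximated with two SelectVariants calls. This is a sketch under assumptions, not the contents of mutect_resources.wdl: the function name `make_biallelic_resource` and all file names are hypothetical, the sketch does not strip annotations other than AF, and it applies no allele-frequency cutoff for "common" variants. Whatever input is used, it needs to be on the same reference build and use the same contig names as the BAM files.

```shell
#!/usr/bin/env bash
# Sketch: derive a sites-only, biallelic-only VCF from a large cohort VCF
# (for example a gnomAD VCF). All file names are placeholders.

make_biallelic_resource() {
    local reference=$1   # reference FASTA matching the cohort VCF
    local cohort_vcf=$2  # large population VCF with AF in the INFO field
    local out_prefix=$3  # prefix for the output files

    # Step 1: drop the genotype columns to obtain a sites-only VCF.
    gatk SelectVariants \
        -R "$reference" \
        -V "$cohort_vcf" \
        --sites-only-vcf-output \
        -O "${out_prefix}.sites_only.vcf.gz" || return 1

    # Step 2: keep only biallelic sites, as suggested for GetPileupSummaries.
    gatk SelectVariants \
        -R "$reference" \
        -V "${out_prefix}.sites_only.vcf.gz" \
        --restrict-alleles-to BIALLELIC \
        -O "${out_prefix}.biallelic.vcf.gz"
}
```

For example, `make_biallelic_resource reference.fasta cohort.vcf.gz resource` would write `resource.biallelic.vcf.gz`, which could then be passed to both -V and -L of GetPileupSummaries.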