How can I get a common variant of three samples from multi-sample VCF after joint genotyping?

JW_Lee, South Korea (Member)

Hi. I'm learning sequencing data analysis by following the GATK Best Practices for Germline SNP & Indel Discovery. At the end of the workflow I obtained a multi-sample VCF file from joint genotyping. I now want to get the variants common to just three of the samples. I ran SelectVariants with the "--sample_name" argument and got a VCF file that contains only those three samples. However, the file is not limited to variants in those three samples; it still contains every variant site from the full multi-sample callset. Is there a way to get only the variants shared by the three samples from the multi-sample VCF? Thank you.

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @JW_Lee
    Hi,

    Sorry for the delay. Are you looking for sites where the three samples are variant, or sites where the three samples have the exact same genotype?

    -Sheila

  • javis (Member)

    @Sheila I'm looking for sites where the three samples are variant. How can I use "SelectVariants" to get the common variants?

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator, admin)

    Hi @javis

    Try:

        gatk SelectVariants \
            -V gs://gatk-tutorials/workshop_1702/variant_discovery/data/inputVcfs/trio.vcf.gz \
            -select 'vc.getGenotype("NA12878").isHomRef()' \
            -select 'vc.getGenotype("NA12877").isHomRef()' \
            -select 'vc.getGenotype("NA12882").isHomRef()' \
            --invertSelect true \
            -O /home/jupyter-user/motherSNP.vcf.gz
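To make the selection logic easier to follow, here is a rough plain-Python sketch of the same idea; it is not part of GATK. As far as I understand, multiple -select expressions are combined with OR, so inverting "any of the three samples is hom-ref" leaves the sites where none of them is hom-ref. The helper names (`is_variant_call`, `sites_variant_in_all`) and the toy records are made up for illustration. The sketch assumes a tab-delimited VCF with GT as the first FORMAT key. Unlike the inverted isHomRef() filter, it also drops sites where any of the three samples is a no-call (./.).

```python
def is_variant_call(sample_field):
    """True if the sample's GT carries at least one non-reference allele."""
    gt = sample_field.split(":")[0]            # assumes GT is the first FORMAT key
    alleles = gt.replace("|", "/").split("/")  # treat phased and unphased alike
    return any(a not in ("0", ".") for a in alleles)


def sites_variant_in_all(vcf_lines, samples):
    """Return the VCF records where every listed sample is variant."""
    columns = None
    kept = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                           # skip blank and meta lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Map column names (including sample names) to their positions.
            columns = {name: i for i, name in enumerate(fields)}
            continue
        if columns is None:
            raise ValueError("no #CHROM header line before the first record")
        if all(is_variant_call(fields[columns[s]]) for s in samples):
            kept.append(line)
    return kept


# Toy records (hypothetical data, not taken from the trio VCF above).
HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "NA12878", "NA12877", "NA12882", "OTHER"]
RECORDS = [
    ["chr1", "100", ".", "A", "G", "50", "PASS", ".", "GT", "0/1", "1/1", "0/1", "0/0"],
    ["chr1", "200", ".", "C", "T", "50", "PASS", ".", "GT", "0/0", "0/1", "0/1", "0/1"],
    ["chr1", "300", ".", "G", "A", "50", "PASS", ".", "GT", "0/1", "./.", "1/1", "0/0"],
    ["chr1", "400", ".", "T", "C", "50", "PASS", ".", "GT", "1|1", "0|1", "1|0", "0|0"],
]
EXAMPLE_VCF = "\n".join("\t".join(row) for row in [HEADER] + RECORDS)

TRIO = ["NA12878", "NA12877", "NA12882"]
for record in sites_variant_in_all(EXAMPLE_VCF.splitlines(), TRIO):
    print(record.split("\t")[1])               # position of each kept site
```

If you need sites where the three samples share the exact same genotype instead, you would compare the GT strings of the three samples rather than only checking that each one is non-reference.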
