
How can I get a common variant of three samples from multi-sample VCF after joint genotyping?

JW_Lee (South Korea, Member)

Hi. I'm learning sequencing data analysis by following the GATK Best Practices for Germline SNP & Indel Discovery. After running the full workflow, I have a multi-sample VCF file from joint genotyping. From this file, I want to get the variants common to just three of the samples. I ran SelectVariants with the --sample_name argument and got a VCF containing only those three samples. However, this VCF is not limited to variants carried by those three samples; it still contains every variant site from the multi-sample file. Is there a way to get only the variants shared by the three samples from the multi-sample VCF? Thank you.


  • Sheila (Broad Institute; Member, Broadie)


    Sorry for the delay. Are you looking for sites where the three samples are variant, or sites where the three samples have the exact same genotype?


  • javis (Member)

    @Sheila I'm looking for sites where all three samples are variant. How can I use SelectVariants to get these common variants?

  • bhanuGandham (Cambridge, MA; Member, Administrator, Broadie, Moderator)

    Hi @javis

    Try

        gatk SelectVariants \
            -V gs://gatk-tutorials/workshop_1702/variant_discovery/data/inputVcfs/trio.vcf.gz \
            -select 'vc.getGenotype("NA12878").isHomRef()' \
            -select 'vc.getGenotype("NA12877").isHomRef()' \
            -select 'vc.getGenotype("NA12882").isHomRef()' \
            --invertSelect true \
            -O /home/jupyter-user/motherSNP.vcf.gz
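
To make the logic behind the command above easier to follow, here is a minimal Python sketch of the same idea: keep only the sites where every one of the chosen samples carries at least one non-reference allele. It is not part of GATK. The function names (`alleles`, `is_variant`, `common_variant_lines`) are made up for illustration, and it assumes an uncompressed, tab-delimited VCF with GT as the first FORMAT subfield. One difference to be aware of: this sketch drops sites where a sample is a no-call (`./.`), whereas the inverted `isHomRef()` filter would, as far as I can tell, keep them because a no-call is not hom-ref.

```python
import re


def alleles(gt_field):
    """Return the allele indices of a sample column, e.g. '0/1:35' -> ['0', '1']."""
    gt = gt_field.split(":")[0]  # assumption: GT is the first FORMAT subfield
    return re.split(r"[/|]", gt)


def is_variant(gt_field):
    """True if the genotype is called and has at least one non-reference allele."""
    called = [a for a in alleles(gt_field) if a != "."]
    return bool(called) and any(a != "0" for a in called)


def common_variant_lines(vcf_lines, samples):
    """Yield header lines plus the records where every sample in `samples` is variant."""
    columns = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            yield line  # meta-information lines pass through unchanged
        elif line.startswith("#CHROM"):
            header = line.split("\t")
            columns = [header.index(s) for s in samples]  # locate sample columns
            yield line
        elif line:
            fields = line.split("\t")
            if all(is_variant(fields[i]) for i in columns):
                yield line
```

To use it, pass an open file handle (or any iterable of lines) and the three sample names, then print or write the yielded lines. The sketch filters sites only; it does not remove the other samples' columns, so you would still combine it with sample subsetting if you want a three-sample file. If you instead want sites where the three samples have the exact same genotype, the `all(...)` test would need to compare the allele lists rather than call `is_variant`.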
