COMMON SNPS

I want to get the SNPs common between my files, but I am getting an empty output file. Where am I going wrong? Please advise.
First I made a merged VCF using CombineVariants, and then ran:
java -jar GenomeAnalysisTK.jar \
    -T SelectVariants -R human_hg19.fa -V merged.vcf -o commonSNps.vcf \
    -select Intersection

Answers

  • Sheila, Broad Institute (Member, Broadie admin)

    @h_asif
    Hi,

    Can you explain more in detail what exactly you want? In your command you only input 1 vcf file, but it looks like you want to take the intersection of two files.

    To get the calls present in both files, you can use --concordance. https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_variantutils_SelectVariants.php#--concordance

    -Sheila
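For the two-file case, the --concordance suggestion might look like the sketch below. The input names 1.vcf.gz and 2.vcf.gz are borrowed from later in the thread, while the output name and the GATK_JAR variable are assumptions made for illustration. The script prints the command and runs it only if the jar is found.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed locations; adjust to your setup.
GATK_JAR="${GATK_JAR:-GenomeAnalysisTK.jar}"
REF="human_hg19.fa"

# Keep the records of 1.vcf.gz that are also called in 2.vcf.gz.
conc_cmd=(java -jar "$GATK_JAR" -T SelectVariants
          -R "$REF"
          -V 1.vcf.gz
          --concordance 2.vcf.gz
          -o common_1_2.vcf)

# Show the exact command, then run it only if the jar is present.
printf '%q ' "${conc_cmd[@]}"; echo
if [[ -f "$GATK_JAR" ]]; then
    "${conc_cmd[@]}"
fi
```

Extending this to ten files would mean chaining the step nine times, feeding each output into the next comparison.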

  • h_asif (Member)
    edited July 2015

    Hi Sheila,
    I want the SNPs that are common to these 10 files. I have 10 samples that I combined using:
    java -jar GenomeAnalysisTK.jar \
        -T CombineVariants \
        -R human_hg19.fa \
        --variant 1.vcf.gz \
        --variant 2.vcf.gz \
        --variant 3.vcf.gz \
        --variant 4.vcf.gz \
        --variant 5.vcf.gz \
        --variant 6.vcf.gz \
        --variant 7.vcf.gz \
        --variant 8.vcf.gz \
        --variant 9.vcf.gz \
        --variant 10.vcf.gz \
        -o merged.vcf \
        -genotypeMergeOptions UNIQUIFY
    Then I used the merged VCF file as input to SelectVariants:
    java -jar GenomeAnalysisTK.jar \
        -T SelectVariants -R human_hg19.fa -V merged.vcf -o commonSNps.vcf \
        -select Intersection

  • Sheila, Broad Institute (Member, Broadie admin)

    @h_asif
    Hi,

    I think the best way to do this is to use JEXL expressions. You can select for het and hom-var sites in each sample and select for SNPs. For each sample, you can use -select 'vc.getGenotype("sample").isHet()' and -select 'vc.getGenotype("sample").isHomVar()'. Have a look at this article for more information: https://www.broadinstitute.org/gatk/guide/article?id=1255

    -Sheila
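Writing those expressions out by hand for ten samples is tedious, so one option is to build a single JEXL string in a loop. This is only a sketch: the sample names are hypothetical placeholders (check the #CHROM header line of merged.vcf, since -genotypeMergeOptions UNIQUIFY renames samples), and joining the per-sample clauses with && to require a variant genotype in every sample is one reading of the goal, not something stated in the post.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample names; replace with the names in your merged VCF header.
samples=(sample1 sample2 sample3)

# Require every sample to be het or hom-var at the site.
jexl=""
for s in "${samples[@]}"; do
    clause="(vc.getGenotype(\"$s\").isHet() || vc.getGenotype(\"$s\").isHomVar())"
    # Join clauses with && once the string is non-empty.
    jexl="${jexl:+$jexl && }$clause"
done

echo "$jexl"
```

The resulting string could then be passed to SelectVariants as -select "$jexl", together with -selectType SNP to restrict the output to SNPs.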

  • h_asif (Member)

    @Sheila
    I guess that will give me homozygous SNPs from one file and heterozygous SNPs from another, whereas all I want is the SNPs common to my 10 VCF files.

  • Sheila, Broad Institute (Member, Broadie admin)

    @h_asif
    Oh. So, you want the exact same genotype for each sample for each SNP! I see. You were on the right track with -select Intersection, but it requires the sets to be annotated in the variant records. It doesn't look like you did that in the CombineVariants command. Can you post a few sites from the merged VCF?

    Thanks,
    Sheila

  • h_asif (Member)
    edited July 2015

    @Sheila Let's simplify: how can I find the SNPs common to 10 VCF files? In other words, I want the intersection of the 10 VCF files. If a SNP is absent from any one of the VCF files, it must not be in my final output. Can you suggest how I can do that? I don't want to use another tool. I have already used vcftools, but I want to give GATK a try since my whole pipeline is built with it.

  • Sheila, Broad Institute (Member, Broadie admin)

    @h_asif
    Hi,

    Okay. After some clarification from Geraldine, I think you simply want all the sites that are variant in all your samples. A straightforward approach is to use CombineVariants' ability to annotate sets and SelectVariants' ability to extract records based on the resulting Venn components, as described in the VariantEval method article: https://www.broadinstitute.org/gatk/guide/article?id=48

    I hope this helps!

    -Sheila
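A minimal sketch of the selection step, assuming merged.vcf was written by CombineVariants and carries a set annotation in the INFO field (the key name set is believed to be the default; check a few records for set=...). The output name and the GATK_JAR variable are assumptions, and -selectType SNP is an optional extra to keep SNPs only. Intersection is understood to mark only records that are present and unfiltered in every input, so filtered calls may need separate handling.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed locations; adjust to your setup.
GATK_JAR="${GATK_JAR:-GenomeAnalysisTK.jar}"
REF="human_hg19.fa"

# Keep SNP records whose set annotation says they were seen in all inputs.
select_cmd=(java -jar "$GATK_JAR" -T SelectVariants
            -R "$REF"
            -V merged.vcf
            -selectType SNP
            -select 'set == "Intersection"'
            -o commonSNPs.vcf)

# Show the exact command, then run it only if the jar is present.
printf '%q ' "${select_cmd[@]}"; echo
if [[ -f "$GATK_JAR" ]]; then
    "${select_cmd[@]}"
fi
```

Note that the expression compares the set annotation to a quoted string, whereas the bare -select Intersection used earlier in the thread does not refer to the set annotation at all.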

  • h_asif (Member)

    @Sheila Thank you so much, and thanks to Geraldine too.
    I will give it a try and report back on the output.
