Combine Variants from different sample variant files

MUHAMMADSOHAILRAZA (Beijing Institute of Genomics, CAS)

Hi,

Given:
I have two VCF files of bi-allelic SNPs: one from 1000G and one from my own samples (MYSAMPLE). The two files contain non-overlapping samples.

Problem:
I want to generate a combined VCF file containing all samples (1000G + MYSAMPLE), restricted to the sites that are present in both VCF files.

I tried the SelectVariants --concordance option with the following command lines:

Command-line 1:

java -Xmx8g -jar $GATK -T SelectVariants \
-nt 10 \
-R $REF \
-V $MYSAMPLE/VQSR_PHASE2_snp99.5-Combine_Biallelic-MAF-0.01.recode.vcf \
--concordance $VAR1/ALL.WGS.chr.phase3_biallelic-concordMYSAMPLE.vcf \
-o $OUTPUT1/VQSR_PHASE2_snp99.5_Biallelic-MAF-0.01-CONCORD-1KGPnew.vcf \
-selectType SNP \
-restrictAllelesTo BIALLELIC

Command-line 2:

java -Xmx8g -jar $GATK -T SelectVariants \
-nt 10 \
-R $REF \
-V $VAR/ALL.WGS.chr.phase3_biallelic.vcf \
--concordance $MYSAMPLE1/VQSR_PHASE2_snp99.5_Biallelic-MAF-0.01-CONCORD-1KGPnew.vcf \
-o $OUTPUT/ALL.WGS.chr.phase3_biallelic-concordMYSAMPLE.vcf \
-selectType SNP \
-restrictAllelesTo BIALLELIC

I expected the two resulting concordance files to contain the same number of variants (with different sample genotypes), but the counts differ:
ALL.WGS.chr.phase3_biallelic-concordMYSAMPLE.vcf: 7533554
VQSR_PHASE2_snp99.5_Biallelic-MAF-0.01-CONCORD-1KGPnew.vcf: 7533549
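
To see which sites account for the five-record difference, the site keys of the two concordance outputs can be compared directly. The following is a minimal sketch, not a GATK feature: `site_keys` is a hypothetical helper, the CHROM:POS:REF:ALT key is an assumption about what should count as the same site, and the two tiny VCFs are made-up stand-ins for the real files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one CHROM:POS:REF:ALT key per record, sorted.
site_keys() {
  grep -v '^#' "$1" | awk -F'\t' '{ print $1 ":" $2 ":" $4 ":" $5 }' | LC_ALL=C sort
}

# Made-up demo files standing in for the two concordance outputs.
tmp=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\t.\tA\tG\n1\t200\t.\tC\tT\n1\t200\t.\tC\tG\n' > "$tmp/a.vcf"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\t.\tA\tG\n1\t200\t.\tC\tT\n' > "$tmp/b.vcf"

site_keys "$tmp/a.vcf" > "$tmp/a.keys"
site_keys "$tmp/b.vcf" > "$tmp/b.keys"

# comm needs sorted input; -23 keeps lines unique to the first file,
# -13 keeps lines unique to the second file.
only_a=$(LC_ALL=C comm -23 "$tmp/a.keys" "$tmp/b.keys")
only_b=$(LC_ALL=C comm -13 "$tmp/a.keys" "$tmp/b.keys")

echo "only in a: $only_a"
echo "only in b: $only_b"
rm -rf "$tmp"
```

Replacing the demo files with the two real outputs would list the records that appear in only one of them, for example positions that carry a different ALT allele in each file.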

Later, when I combined the two VCF files with CombineVariants, the number of variants dropped further to 7511112. My command line was:

java -Xmx8g -jar $GATK -T CombineVariants \
-nt 10 \
-R $REF \
-V $VAR/ALL.WGS.chr.phase3_biallelic-concordMYSAMPLE.vcf \
-V $MYSAMPLE/VQSR_PHASE2_snp99.5_Biallelic-MAF-0.01-CONCORD-1KGPnew.vcf \
-genotypeMergeOptions UNIQUIFY \
-o $OUTPUT/ALL_PHASE2_snp99.5_Biallelic-MAF-0.01-COMBINE-1KGP.vcf
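
One possible reason for the drop, stated here as an assumption rather than a confirmed diagnosis, is that an input file has several records at the same position, which a merge may collapse into a single record. A rough check is to compare the number of records with the number of distinct CHROM/POS pairs. In the sketch below, `count_records_and_positions` is a hypothetical helper and the demo VCF is made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<records> <distinct CHROM/POS pairs>" for a VCF.
count_records_and_positions() {
  awk -F'\t' '
    /^#/ { next }                                   # skip header lines
    { records++ }                                   # every data line is one record
    !(($1, $2) in seen) { seen[$1, $2] = 1; positions++ }  # first time this position appears
    END { print records + 0, positions + 0 }
  ' "$1"
}

# Made-up demo file: two records share position 1:200.
tmp=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\t.\tA\tG\n1\t200\t.\tC\tT\n1\t200\t.\tC\tG\n' > "$tmp/demo.vcf"

counts=$(count_records_and_positions "$tmp/demo.vcf")
echo "records and distinct positions: $counts"
rm -rf "$tmp"
```

If the two numbers differ for one of the real input files, the gap gives an upper bound on how many records a position-based merge could remove from that file.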

Could you please help me use the command line correctly to solve my problem?

Thanks
