
The genotypes in the combined VCF generated by CombineVariants are different from the original VCFs

Member
edited October 2014

Hi GATK team,

I am generating a combined VCF from 150+ VCFs to build a cohort, so that I can calculate the cohort frequency of each variant. However, I found that the genotypes are wrong in the combined VCF. Here is my command line:

java -jar /GATK/GenomeAnalysisTK-2.7-4/GenomeAnalysisTK.jar -R refernce.fasta \
-T CombineVariants \
--variant sample1.vcf \
--variant sample2.vcf \
-o combined.vcf

Here is one example of a variant position in the combined VCF. The record is very long, so I have included only the relevant columns here.

1 22082967 rs35545280 CAAA CAA,C,CA,CAAAA 76.73 PASS AC=109,6,104,6;AF=0.368,0.020,0.351,0.020;AN=296;DB;DP=9869;GC=48.13;MQ0=0;PercentNBaseSolid=0.0000;RU=A;STR;set=filterInvariant-filterInvariant2-filterInvariant3… GT:DP:GQ

In this record, sample 1 has this variant and it shows as "0/3:46:99",

but in the sample1.vcf, it is listed as

As you can see, the genotype for sample 1 in the combined VCF is 0/3, but in its original VCF it is 1/1, which is homozygous. So when I calculate the cohort frequency, I am unsure which genotype to use for this variant in sample 1.

To give you a better idea, here is another sample with the same variant, as it appears in combined.vcf and in sample2.vcf.

In combined.vcf, sample 2 shows as "0/3:91:99".

In sample2.vcf, the record is:

Here the genotype is 1/2, but in the combined VCF it shows as "0/3".
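Because GT values are indices into the allele list of their own record (0 is REF, 1 and up are the ALT alleles in order), one way to compare calls between files is to translate each index back into its allele string first. The following is a minimal Python sketch of that idea, not part of GATK; the helper name decode_genotype is hypothetical, and it uses the REF and ALT columns from the combined record quoted above.

```python
def decode_genotype(ref, alt_field, sample_field):
    """Translate a VCF GT such as '0/3' into the allele strings it points to."""
    # Index 0 is REF; indices 1..n are the ALT alleles in the order listed.
    alleles = [ref] + alt_field.split(",")
    # GT is the first colon-separated sub-field of a sample column.
    gt = sample_field.split(":")[0]
    sep = "|" if "|" in gt else "/"
    decoded = []
    for idx in gt.split(sep):
        # A '.' means the allele was not called.
        decoded.append("." if idx == "." else alleles[int(idx)])
    return sep.join(decoded)


# Sample 1 in the combined record (REF=CAAA, ALT=CAA,C,CA,CAAAA)
print(decode_genotype("CAAA", "CAA,C,CA,CAAAA", "0/3:46:99"))  # CAAA/CA
```

Running the same function on the REF, ALT and sample columns of the original per-sample record gives allele strings for that file. The two results are only directly comparable when both records use the same REF, since an indel can be written with a different REF length in different files.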

Please advise whether there is a parameter I should add to the command line to solve this problem.

Thank you.
Linda

• Member

Hi Sheila,

There is no error message, but I can give you the full record from combined.vcf. Let me know if this works for you.

1 22082967 rs35545280 CAAA CAA,C,CA,CAAAA 76.73 PASS AC=109,6,104,6;AF=0.368,0.020,0.351,0.020;AN=296;DB;DP=9869;GC=48.13;MQ0=0;PercentNBaseSolid=0.0000;RU=A;STR;set=filterInvariant-filterInvariant2-filterInvariant3-filterInvariant4-variant5-variant6-filterInvariant8-variant9-variant10-filterInvariant11-variant12-filterInvariant14-filterInvariant15-filterInvariant16-variant17-filterInvariant18-variant19-filterInvariant20-filterInvariant21-variant22-filterInvariant23-variant24-filterInvariant25-variant27-variant28-variant29-filterInvariant30-variant31-variant32-variant33-variant34-variant35-variant36-variant37-variant38-variant39-variant40-variant41-filterInvariant42-variant44-variant45-variant46-variant47-variant48-variant49-variant50-variant51-variant52-variant53-variant54-variant55-variant56-variant57-variant58-variant60-variant61-variant62-variant63-variant64-variant65-variant66-variant67-variant68-variant69-filterInvariant70-variant71-variant72-variant73-variant74-variant75-variant76-variant77-variant78-variant79-variant80-variant81-variant82-variant83-variant84-variant85-variant87-variant88-variant89-variant90-variant91-variant92-variant93-variant95-variant96-variant97-variant98-variant99-variant100-variant101-variant102-variant103-variant104-variant105-variant106-variant107-variant108-variant109-variant110-variant111-variant112-variant113-filterInvariant114-variant115-variant116-filterInvariant118-variant119-variant120-variant121-filterInvariant122-filterInvariant123-variant124-variant125-filterInvariant126-variant127-variant128-filterInvariant129-variant130-filterInvariant131-variant132-variant134-variant135-variant136-filterInvariant137-variant138-filterInvariant139-variant140-variant141-filterInvariant142-variant143-variant144-variant145-variant146-variant147-variant148-variant149-variant150-variant151-variant152-variant153-variant154-variant155-variant156-variant157 GT:DP:GQ 3/1:61:99 0/3:57:99 0/3:78:99 0/3:88:99 3/1:106:99 3/1:115:99 3/1:105:99 
3/1:100:99 3/1:102:99 3/1:98:99 3/1:111:99 0/1:79:99 3/1:110:99 0/1:90:60 0/3:31:54 0/1:28:26 3/1:101:99 3/1:91:99 4/1:63:76 0/1:34:33 3/1:32:89 3/1:62:99 0/3:75:99 1/1:57:5 0/1:51:99 3/1:71:99 3/1:64:99 3/1:58:99 3/1:68:99 3/1:74:99 3/1:47:99 0/3:46:99 3/1:39:99 3/1:46:99 ./. 0/3:53:75 1/1:81:40 3/1:86:99 0/1:60:99 ./. 0/3:66:99 0/3:65:99 3/1:58:99 0/1:80:1 0/1:89:8 0/3:91:99 3/1:112:99 1/1:76:19 0/3:88:99 0/3:86:99 3/1:86:99 3/1:100:99 0/1:51:99 3/1:60:99 3/1:56:99 0/3:58:99 0/1:65:99 3/1:70:99 3/1:54:86 3/1:56:99 0/3:45:99 3/1:60:99 3/1:55:99 0/1:50:99 0/1:71:99 4/1:69:86 0/3:71:99 0/3:76:99 0/3:56:99 0/3:72:99 3/1:59:99 0/3:79:99 0/4:66:99 3/1:71:99 3/1:58:99 0/3:64:99 2/1:70:99 3/1:53:79 3/1:69:99 2/1:70:99 ./. 1/1:62:74 ./. 0/3:66:99 0/3:53:99 0/3:86:99 3/1:65:99 3/1:70:99 0/1:75:99 0/3:40:99 0/1:79:99 3/1:66:83 3/1:73:99 3/1:72:99 4/1:66:99 3/1:59:96 0/3:70:99 2/1:63:90 0/3:66:99 4/3:38:99 3/1:73:99 3/1:55:99 3/1:71:99 3/1:52:99 ./. 2/1:53:99 3/1:56:99 0/3:52:99 0/3:74:99 0/3:80:99 0/1:77:99 0/1:59:21 0/3:67:99 0/3:60:99 3/1:80:99 0/3:75:99 1/1:53:69 3/1:74:87 0/1:57:99 3/1:57:99 0/1:78:44 0/3:52:99 0/1:54:99 ./. 0/1:61:99 0/1:38:99 ./. ./. 0/1:49:99 3/1:60:99 0/1:51:99 0/3:68:99 3/1:53:81 0/3:68:99 3/1:50:99 0/3:55:99 0/1:63:99 0/1:43:90 3/1:69:99 3/1:63:99 0/3:77:99 3/3:63:38 ./. 3/1:57:99 0/4:76:99 3/1:78:99 3/1:69:99 3/1:85:99 0/1:61:59 0/2:46:99 0/3:61:99 0/1:60:99 0/1:49:99 0/1:53:99 0/2:47:99 3/1:49:99 3/1:90:99
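For the cohort frequency itself, the AC, AN and AF values in the INFO column can be cross-checked by tallying the allele indices in the sample columns. The Python sketch below assumes the genotypes are taken at face value; the function name allele_counts is hypothetical, and the usage example covers only the first seven sample columns of the record above, not the whole cohort.

```python
from collections import Counter


def allele_counts(sample_fields, n_alt):
    """Return (AC per ALT allele, AN, AF per ALT allele) from sample columns."""
    counts = Counter()
    for field in sample_fields:
        gt = field.split(":")[0]  # GT is the first sub-field
        for idx in gt.replace("|", "/").split("/"):
            if idx != ".":        # skip no-calls such as ./.
                counts[int(idx)] += 1
    an = sum(counts.values())     # total number of called alleles
    ac = [counts[i] for i in range(1, n_alt + 1)]
    af = [round(c / an, 3) if an else 0.0 for c in ac]
    return ac, an, af


# First seven sample columns of the combined record; it has 4 ALT alleles.
samples = ["3/1:61:99", "0/3:57:99", "0/3:78:99", "0/3:88:99",
           "3/1:106:99", "3/1:115:99", "3/1:105:99"]
print(allele_counts(samples, n_alt=4))
# ([4, 0, 7, 0], 14, [0.286, 0.0, 0.5, 0.0])
```

The same arithmetic applies to the INFO values in the record: with AN=296, AC=109 gives 109/296 = 0.368 and AC=104 gives 104/296 = 0.351, which matches the reported AF.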

@lindakjcao

Hi Linda,

We want to be able to replicate the error here so we can try to figure out what is going on.

I see you have one particular record where an error occurs, and the error is in sample 1.

If you can submit a snippet of sample 1 vcf around that record plus a few other sample vcfs around that record, we can replicate it here.

I hope this makes sense.

Thanks,
Sheila

• Member

Hi Sheila,

The errors occur in many more than two samples; I only included two here. However, the records for this variant from the sample 1 and sample 2 VCFs are listed in the original post. Please take a look.

Thanks,
Linda