CombineVariants: GT field is not updated when merging variants with different ALT alleles

sbaheti Member
edited September 2013 in Ask the GATK team


My two VCF files have different alternate alleles at the same position because I called the variants with two different callers. When I run CombineVariants on both files, the GT field is not updated properly. The AD field is also not updated properly, but running VariantAnnotator fixes that issue.

VCF 1:

chr1    87708015    rs58006838  C   T   145.77  .   AC=1;AF=0.500;AN=2;BaseQRankSum=-1.479;DB;DP=14;Dels=0.00;FS=0.000;HaplotypeScore=3.9299;MLEAC=1;MLEAF=0.500;MQ=68.54;MQ0=0;MQRankSum=-1.109;QD=10.41;ReadPosRankSum=0.925  GT:AD:DP:GQ:PL  0/1:3,9:14:37:174,0,37

VCF 2:

chr1    87708015    .   C   A   .   PASS    AN=2;DP=4;NS=1  GT:AD:DP:GQ 0/1:2,2:4:8.42

Merged VCF file:

chr1    87708015    rs58006838  C   T,A 145.77  PASS    AC=1,0;AF=0.500,0.00;AN=2;BaseQRankSum=-1.479;DB;DP=18;Dels=0.00;FS=0.000;HaplotypeScore=3.9299;MLEAC=1;MLEAF=0.500;MQ=68.54;MQ0=0;MQRankSum=-1.109;NS=1;QD=10.41;ReadPosRankSum=0.925;set=Intersection GT:AD:DP:GQ **0/1**:3,9,2:14:37

Command used:

java -jar $gatk/2.7-1-g42d771f/GenomeAnalysisTK.jar -T CombineVariants -V one.vcf.gz -V two.vcf.gz -o test.vcf -R $ref

Is this a known limitation or a bug?


  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    It looks to me like this is working properly. It's a heterozygous site in both cases. What do you think should be different?

  • sbaheti Member

    The output VCF does not follow the VCF 4.1 format and is not a valid variant according to the GATK ValidateVariants walker. The possible genotype values for this multi-allelic variant are 0/2, 1/2, and 2/2, right?
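For reference, GT values are indices into the allele list (0 for REF, 1 and up for the ALT alleles in order), so a site with two ALT alleles admits more genotypes than the three listed in the question. A minimal Python sketch that enumerates the unphased diploid genotypes; the function name `diploid_genotypes` is made up for this illustration and is not part of GATK:

```python
from itertools import combinations_with_replacement


def diploid_genotypes(n_alt):
    """Return all unphased diploid GT strings for one REF plus n_alt ALT alleles."""
    allele_indices = range(n_alt + 1)  # 0 = REF, 1..n_alt = ALT alleles in order
    return [f"{a}/{b}" for a, b in combinations_with_replacement(allele_indices, 2)]


genotypes = diploid_genotypes(2)
print(genotypes)  # ['0/0', '0/1', '0/2', '1/1', '1/2', '2/2']
```

So 0/1 is a well-formed genotype at a site with ALT alleles T,A; it simply says the sample carries REF and the first ALT allele.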

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Why would 0/1 not be correct? The sample is heterozygous, and the first of the ALT alleles is chosen (you can change that by specifying a different merge option if you want).

    ValidateVariants is saying that it is not valid? Can you post the output?

  • sbaheti Member
    edited September 2013

    Here is the command and the error:
    java -jar 2.7-1-g42d771f/GenomeAnalysisTK.jar -T ValidateVariants -V variants.vcf.gz -R $ref

    ERROR ------------------------------------------------------------------------------------------
    ERROR A USER ERROR has occurred (version 2.7-1-g42d771f):
    ERROR This means that one or more arguments or inputs in your command are incorrect.
    ERROR The error message below tells you what is the problem.
    ERROR If the problem is an invalid argument, please check the online documentation guide
    ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
    ERROR MESSAGE: File /data2/bsi/secondary/Kocher_Jean-Pierre_m026645/whole_genome/simulated_normal_8lanes/.tmp/s_normal_8l/variant/chr1_old/variants.vcf.gz fails strict validation: one or more of the ALT allele(s) for the record at position chr1:87708015 are not observed at all in the sample genotypes

    The VCF line for that position is here:

    zcat variants.vcf.gz | grep 87708015
    chr1 87708015 rs58006838 C T,A 145.77 PASS AC=1,0;AF=0.500,0.00;BaseQRankSum=-1.664;DB;DP=18;Dels=0.00;FS=0.000;HaplotypeScore=3.9299;MLEAC=1;MLEAF=0.500;MQ=68.54;MQ0=0;MQRankSum=-0.555;NS=1;QD=10.41;ReadPosRankSum=0.925;set=Intersection;ED=11 GT:AD:DP:GQ:SET 0/1:3,9,2:14:37:Intersection
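The error message says that at least one ALT allele is never referenced by any sample genotype. A rough Python sketch of that kind of check, applied to the ALT and GT values from the record above. This is an illustration of the idea only, not GATK's actual implementation, and `unobserved_alts` is a hypothetical helper name:

```python
import re


def unobserved_alts(alt_field, sample_gts):
    """Return the ALT alleles that no sample genotype refers to.

    alt_field  -- the ALT column, e.g. "T,A"
    sample_gts -- one GT string per sample, e.g. ["0/1"]
    """
    alts = alt_field.split(",")
    seen = set()
    for gt in sample_gts:
        # GT alleles are separated by "/" (unphased) or "|" (phased)
        for idx in re.split(r"[/|]", gt):
            if idx != ".":  # "." marks a missing allele call
                seen.add(int(idx))
    # ALT alleles are numbered from 1; index 0 is REF
    return [alt for i, alt in enumerate(alts, start=1) if i not in seen]


print(unobserved_alts("T,A", ["0/1"]))  # ['A']
```

With ALT=T,A and a single sample genotype of 0/1, allele A (index 2) is never observed, which matches the complaint in the error message.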



  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Ah, I see. In the strictest sense it's true that it's an issue because you have two ALT alleles but only one sample. But the genotype itself is properly expressed. Either you use a different merge option so that the second ALT allele is discarded, or you ignore the validation error (because it's not a big problem).

  • sbaheti Member

    Thanks for your input. I went through the GATK documentation but couldn't find the parameter that would let me get rid of the second allele. Could you let me know how I can do that?
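The thread does not name the merge option, so it is not filled in here. As a post-processing alternative, the unobserved ALT allele could be dropped from a single-sample record by hand. The Python sketch below assumes one sample, an AD list ordered REF first and then one value per ALT allele, and a hypothetical helper name `trim_unused_alts`; it is not a GATK feature, and INFO fields such as AC and AF would still need to be recomputed separately:

```python
import re


def trim_unused_alts(alts, gt, ad):
    """Drop ALT alleles not referenced by a single sample's GT and renumber.

    alts -- list of ALT alleles, e.g. ["T", "A"]
    gt   -- the sample's GT string, e.g. "0/1"
    ad   -- allele depths, REF first then one per ALT, e.g. [3, 9, 2]
    """
    sep = "|" if "|" in gt else "/"
    called = [None if x == "." else int(x) for x in re.split(r"[/|]", gt)]
    # ALT indices (1-based) that the genotype actually uses, in original order
    used = sorted({i for i in called if i is not None and i > 0})
    # Map old allele indices to new ones; REF always stays 0
    remap = {0: 0}
    for new_idx, old_idx in enumerate(used, start=1):
        remap[old_idx] = new_idx
    new_alts = [alts[i - 1] for i in used]
    new_gt = sep.join("." if i is None else str(remap[i]) for i in called)
    new_ad = [ad[0]] + [ad[i] for i in used]
    return new_alts, new_gt, new_ad


# Values taken from the merged record in the question: ALT=T,A GT=0/1 AD=3,9,2
print(trim_unused_alts(["T", "A"], "0/1", [3, 9, 2]))  # (['T'], '0/1', [3, 9])
```

Note that trimming this way keeps only the first caller's allele, so the A call from the second VCF is lost.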
