
CombineVariants: GT field is not updated when merging variants with different ALT alleles

sbaheti Member
edited September 2013 in Ask the GATK team

Hi,

My two VCF files have different alternate alleles at the same position because I called the variants with two different callers. When I run CombineVariants on both files, the GT field is not updated properly. The AD field is not updated properly either, but running VariantAnnotator fixes that.

VCF 1:

chr1    87708015    rs58006838  C   T   145.77  .   AC=1;AF=0.500;AN=2;BaseQRankSum=-1.479;DB;DP=14;Dels=0.00;FS=0.000;HaplotypeScore=3.9299;MLEAC=1;MLEAF=0.500;MQ=68.54;MQ0=0;MQRankSum=-1.109;QD=10.41;ReadPosRankSum=0.925  GT:AD:DP:GQ:PL  0/1:3,9:14:37:174,0,37

VCF 2:

chr1    87708015    .   C   A   .   PASS    AN=2;DP=4;NS=1  GT:AD:DP:GQ 0/1:2,2:4:8.42

Merged VCF file:

chr1    87708015    rs58006838  C   T,A 145.77  PASS    AC=1,0;AF=0.500,0.00;AN=2;BaseQRankSum=-1.479;DB;DP=18;Dels=0.00;FS=0.000;HaplotypeScore=3.9299;MLEAC=1;MLEAF=0.500;MQ=68.54;MQ0=0;MQRankSum=-1.109;NS=1;QD=10.41;ReadPosRankSum=0.925;set=Intersection GT:AD:DP:GQ **0/1**:3,9,2:14:37

Command used:

java -jar $gatk/2.7-1-g42d771f/GenomeAnalysisTK.jar -T CombineVariants -V one.vcf.gz -V two.vcf.gz -o test.vcf -R $ref

Is this a known limitation or a bug?
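
To make the question concrete: a GT value is a pair of indexes into that record's own allele list (0 = REF, 1 = first ALT, 2 = second ALT). The sketch below re-expresses each caller's genotype against the merged ALT list `T,A`. It is only an illustration of the index arithmetic, not GATK's implementation, and `remap_genotype` is a hypothetical helper name. Because the merged record above carries a single genotype column, only one of the two remapped values can end up in it.

```python
def remap_genotype(gt, ref, alts, merged_alts):
    """Re-express a GT string against a merged ALT list (illustrative helper)."""
    sep = "|" if "|" in gt else "/"
    old_alleles = [ref] + alts          # index -> base in the input record
    new_alleles = [ref] + merged_alts   # index -> base in the merged record
    out = []
    for idx in gt.split(sep):
        if idx == ".":
            out.append(".")             # keep no-calls as they are
        else:
            out.append(str(new_alleles.index(old_alleles[int(idx)])))
    return sep.join(out)


merged_alts = ["T", "A"]
print(remap_genotype("0/1", "C", ["T"], merged_alts))  # VCF 1 -> 0/1
print(remap_genotype("0/1", "C", ["A"], merged_alts))  # VCF 2 -> 0/2
```

In other words, the `0/1` in the merged record matches the genotype from VCF 1, while the genotype from VCF 2 would read `0/2` against the same ALT list.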


Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    It looks to me like this is working properly. It's a heterozygous site in both cases. What do you think should be different?

  • The output VCF does not follow the VCF 4.1 format and is not a valid variant according to the GATK ValidateVariants walker. The possible genotype values for this multi-allelic variant are 0/2, 1/2 or 2/2, right?

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    Why would 0/1 not be correct? The sample is heterozygous, and the first of the ALT alleles is chosen (you can change that by specifying a different merge option if you want).

    ValidateVariants is saying that it is not valid? Can you post the output?

  • sbaheti Member
    edited September 2013

    Here are the command and the error:
    java -jar 2.7-1-g42d771f/GenomeAnalysisTK.jar -T ValidateVariants -V variants.vcf.gz -R $ref

    ERROR ------------------------------------------------------------------------------------------
    ERROR A USER ERROR has occurred (version 2.7-1-g42d771f):
    ERROR
    ERROR This means that one or more arguments or inputs in your command are incorrect.
    ERROR The error message below tells you what is the problem.
    ERROR
    ERROR If the problem is an invalid argument, please check the online documentation guide
    ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
    ERROR
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
    ERROR
    ERROR MESSAGE: File /data2/bsi/secondary/Kocher_Jean-Pierre_m026645/whole_genome/simulated_normal_8lanes/.tmp/s_normal_8l/variant/chr1_old/variants.vcf.gz fails strict validation: one or more of the ALT allele(s) for the record at position chr1:87708015 are not observed at all in the sample genotypes

    The VCF line for that position is:

    zcat variants.vcf.gz | grep 87708015
    chr1 87708015 rs58006838 C T,A 145.77 PASS AC=1,0;AF=0.500,0.00;BaseQRankSum=-1.664;DB;DP=18;Dels=0.00;FS=0.000;HaplotypeScore=3.9299;MLEAC=1;MLEAF=0.500;MQ=68.54;MQ0=0;MQRankSum=-0.555;NS=1;QD=10.41;ReadPosRankSum=0.925;set=Intersection;ED=11 GT:AD:DP:GQ:SET 0/1:3,9,2:14:37:Intersection

    Thanks

    Saurabh

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie

    Ah, I see. In the strictest sense it's true that it's an issue because you have two ALT alleles but only one sample. But the genotype itself is properly expressed. Either you use a different merge option so that the second ALT allele is discarded, or you ignore the validation error (because it's not a big problem).

  • Thanks for your input. I went through the GATK documentation but didn't find the parameter that would let me get rid of the second allele. Could you let me know how I can do that?
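
For readers following the thread, here is a minimal Python sketch of the two ideas discussed above: the check behind the "ALT allele(s) ... not observed at all in the sample genotypes" message, and what discarding the unobserved ALT allele would mean for the record. It does not name a GATK parameter and is not GATK code; `unobserved_alts` and `trim_alts` are hypothetical helpers, and the sketch assumes a single-sample record with only the GT and AD fields (per-allele INFO values such as `AC=1,0` would need the same trimming).

```python
def unobserved_alts(alts, genotypes):
    """Return the ALT alleles that no sample genotype refers to."""
    seen = set()
    for gt in genotypes:
        for idx in gt.replace("|", "/").split("/"):
            if idx != ".":
                seen.add(int(idx))
    # ALT alleles are numbered from 1; index 0 is REF.
    return [alt for i, alt in enumerate(alts, start=1) if i not in seen]


def trim_alts(alts, gt, ad):
    """Drop unobserved ALT alleles from a single-sample record.

    Returns (new_alts, new_gt, new_ad). AD keeps the REF depth plus the
    depths of the retained ALT alleles only.
    """
    sep = "|" if "|" in gt else "/"
    called = {int(i) for i in gt.split(sep) if i != "."}
    keep = [i for i in range(1, len(alts) + 1) if i in called]
    renumber = {0: 0}
    for new, old in enumerate(keep, start=1):
        renumber[old] = new             # old allele index -> new allele index
    new_gt = sep.join(
        "." if i == "." else str(renumber[int(i)]) for i in gt.split(sep)
    )
    new_alts = [alts[i - 1] for i in keep]
    new_ad = [ad[0]] + [ad[i] for i in keep]
    return new_alts, new_gt, new_ad


# Values taken from the merged record at chr1:87708015.
print(unobserved_alts(["T", "A"], ["0/1"]))        # ['A']
print(trim_alts(["T", "A"], "0/1", [3, 9, 2]))     # (['T'], '0/1', [3, 9])
```

With the merged record from this thread, `A` is the allele that strict validation flags, and trimming it leaves `ALT=T`, `GT=0/1` and `AD=3,9`.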
