
CombineVariant Codec not working with VCFv4.1?

aminzia (Member)
edited June 2015 in Ask the GATK team

Hello there,

I am using CombineVariants to combine variants called by HaplotypeCaller on separate chromosomes. The VCFs come directly from HC without any changes, but CombineVariants does not seem to like the header. Please see the errors below, which are unexpected given that everything worked fine before GATK v3.4 (except that the merging options have been modified and no longer work as they did before v3.3).

Thank you
Amin Zia

java -Xmx8g -Xms8g -jar ~/gatk-3.4.0/GenomeAnalysisTK.jar -R ucsc.hg19.fasta -T CombineVariants --variant:VCF1 chrM.gatk.vcf --variant:VCF2 chr1.gatk.vcf --variant:VCF3 chr2.gatk.vcf --assumeIdenticalSamples -genotypeMergeOptions PRIORITIZE -priority VCF1,VCF2,VCF3 -o genome.gatk.vcf

INFO 17:25:36,611 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:25:36,613 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-0-g7e26428, Compiled 2015/05/15 03:25:41
INFO 17:25:36,613 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 17:25:36,613 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 17:25:36,619 HelpFormatter - Program Args: -R /srv/gs1/projects/scg/Resources/GATK/hg19-3.0//ucsc.hg19.fasta -T CombineVariants --variant:VCF1 chrM.gatk.vcf --variant:VCF2 chr1.gatk.vcf --variant:VCF3 chr2.gatk.vcf --assumeIdenticalSamples -genotypeMergeOptions PRIORITIZE -priority VCF1,VCF2,VCF3 -o genome.gatk.vcf
INFO 17:25:36,622 HelpFormatter - Executing as [email protected] on Linux 2.6.32-504.16.2.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.7.0_03-b04.
INFO 17:25:36,622 HelpFormatter - Date/Time: 2015/06/29 17:25:36
INFO 17:25:36,622 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:25:36,622 HelpFormatter - --------------------------------------------------------------------------------
INFO 17:25:37,369 GenomeAnalysisEngine - Strictness is SILENT
INFO 17:25:37,518 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 17:25:38,886 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.4-0-g7e26428):
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR MESSAGE: Unable to parse header with error: Your input file has a malformed header: This codec is strictly for VCFv3 and does not support VCFv4.1, for input source: chr2.gatk.vcf
ERROR ------------------------------------------------------------------------------------------

Best Answers


  • aminzia (Member)

    Thank you for your answer. It actually worked. I'm still surprised that GATK interprets that label as the name of a codec.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)
    edited July 2015

    @aminzia This is because the GATK engine will parse tags according to a position-sensitive logic:

    --variant:<FORMAT>,<NAME> file.vcf

    So if you only provide one tag, it will assume it is the format codec. It's not possible to specify a name tag without also specifying a format tag.

  • thibault (Broad Institute; Member, Broadie, Dev)

    Because the standard VCF 4 codec is named VCF, an alternate solution would be to specify the arguments like this:

    --variant:VCF,VCF1 chrM.gatk.vcf --variant:VCF,VCF2 chr1.gatk.vcf --variant:VCF,VCF3 chr2.gatk.vcf

  • aminzia (Member)

    Thank you all for your answers. I think this was not clear from the API pages, especially because the CombineVariants page explicitly mentions "-V:name,vcf", which seems to be the other way around, with no mention of how this information is decoded.

    But your explanations clear this up. Thank you.
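
For readers who want to see the position-sensitive rule from the answers above spelled out, here is a minimal Python sketch. It is not GATK's actual implementation: `parse_variant_arg` and `VariantBinding` are hypothetical names made up for illustration, and the only behavior modeled is what the answers state, namely that the first tag is read as the format (codec) and the second as the name.

```python
from typing import NamedTuple, Optional


class VariantBinding(NamedTuple):
    """Hypothetical container for one --variant argument (illustration only)."""
    codec: Optional[str]  # format tag, e.g. "VCF"; None if no tag was given
    name: Optional[str]   # name tag, e.g. "VCF1"; None if no second tag was given
    path: str


def parse_variant_arg(flag: str, path: str) -> VariantBinding:
    """Split '--variant:<FORMAT>,<NAME>' purely by tag position.

    Assumption: this mirrors the rule described in the thread, not GATK source code.
    """
    # Everything after the first ':' is the comma-separated tag list.
    _, _, tag_text = flag.partition(":")
    tags = [t for t in tag_text.split(",") if t]

    if not tags:
        return VariantBinding(None, None, path)
    if len(tags) == 1:
        # A single tag is taken as the format, never as the name.
        return VariantBinding(tags[0], None, path)
    if len(tags) == 2:
        return VariantBinding(tags[0], tags[1], path)
    raise ValueError(f"expected at most two tags, got {len(tags)}: {flag!r}")


# Original command: the lone tag "VCF3" lands in the codec slot.
print(parse_variant_arg("--variant:VCF3", "chr2.gatk.vcf"))

# Suggested fix: the codec is given explicitly, so "VCF3" becomes the name.
print(parse_variant_arg("--variant:VCF,VCF3", "chr2.gatk.vcf"))
```

Under this reading, the label `VCF3` in the original command ends up in the format slot, which is consistent with the error about a VCFv3 codec being reported for chr2.gatk.vcf.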

