GATK CombineVariants complains about the contig order in the VCF files

cr517 (Cambridge, Member)

I have called variants on two strains of C. elegans separately. I now want to merge the two VCF files into one using the following steps:

  • Create a sequence dictionary of the reference sequence
  • Sort the VCF files with Picard
  • Merge the sorted VCF files using GATK

 

picard CreateSequenceDictionary \
    REFERENCE=c_elegans.PRJNA13758.WS263.genomic.fa \
    OUTPUT=c_elegans.PRJNA13758.WS263.genomic.dict

picard SortVcf INPUT=strain1.vcf \
    OUTPUT=strain1sorted.vcf \
    SEQUENCE_DICTIONARY=c_elegans.PRJNA13758.WS263.genomic.dict

picard SortVcf INPUT=strain2.vcf \
    OUTPUT=strain2sorted.vcf \
    SEQUENCE_DICTIONARY=c_elegans.PRJNA13758.WS263.genomic.dict

GATK --analysis_type CombineVariants \
    -R c_elegans.PRJNA13758.WS263.genomic.fa \
    --variant strain1sorted.vcf \
    --variant strain2sorted.vcf \
    -o all.vcf \
    -genotypeMergeOptions UNIQUIFY

The last command gives me the following error message:

ERROR MESSAGE: Input files variant and reference have incompatible contigs. Please see https://www.broadinstitute.org/gatk/guide/article?id=63 for more information. Error details: The contig order in variant and reference is not the same; to fix this please see: (https://www.broadinstitute.org/gatk/guide/article?id=1328),  which describes reordering contigs in BAM and VCF files..
##### ERROR   variant contigs = [I, II, III, IV, MtDNA, V, X]
##### ERROR   reference contigs = [I, II, III, IV, V, X, MtDNA]

But I have sorted the VCF files using Picard, so I don't know what else to do.

Your help is appreciated.
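
One way to narrow this down might be to print the contig order that each file actually declares and compare it with the order in the error message. The sketch below is only a diagnostic, not a fix. It assumes uncompressed VCF files whose headers contain `##contig=<ID=...>` lines and a Picard-style `.dict` file with `@SQ` lines; the function names are hypothetical helpers, not part of GATK or Picard.

```shell
# Contig names in the order declared by the VCF header (##contig=<ID=...> lines).
vcf_contig_order() {
    grep '^##contig=' "$1" | sed -E 's/^##contig=<ID=([^,>]+).*/\1/'
}

# Contig names in the order listed in a sequence dictionary (@SQ ... SN:<name> lines).
dict_contig_order() {
    grep '^@SQ' "$1" | tr '\t' '\n' | grep '^SN:' | cut -c4-
}

# Contig names in order of first appearance among the variant records themselves.
vcf_record_order() {
    grep -v '^#' "$1" | cut -f1 | awk '!seen[$0]++'
}
```

For example, `vcf_contig_order strain1sorted.vcf` and `dict_contig_order c_elegans.PRJNA13758.WS263.genomic.dict` would be expected to print the same list if SortVcf applied the dictionary. If the VCF still lists MtDNA before V, the sorted file was probably not rewritten with the dictionary's order, which would point at the SortVcf step rather than at CombineVariants.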
