GATK CombineVariants complains about the contig order in the VCF files
I have called variants on two strains of C. elegans separately. I now want to merge the two VCF files into one, using the following steps and commands:
- Create a sequence dictionary of the reference sequence
- Sort the VCF files with Picard
- Merge the sorted VCF files using GATK
picard CreateSequenceDictionary \
    REFERENCE=c_elegans.PRJNA13758.WS263.genomic.fa \
    OUTPUT=c_elegans.PRJNA13758.WS263.genomic.dict

picard SortVcf INPUT=strain1.vcf \
    OUTPUT=strain1sorted.vcf \
    SEQUENCE_DICTIONARY=c_elegans.PRJNA13758.WS263.genomic.dict

picard SortVcf INPUT=strain2.vcf \
    OUTPUT=strain2sorted.vcf \
    SEQUENCE_DICTIONARY=c_elegans.PRJNA13758.WS263.genomic.dict

GATK --analysis_type CombineVariants \
    -R c_elegans.PRJNA13758.WS263.genomic.fa \
    --variant strain1sorted.vcf \
    --variant strain2sorted.vcf \
    -o all.vcf \
    -genotypeMergeOptions UNIQUIFY
The last command gives me the following error message:
##### ERROR MESSAGE: Input files variant and reference have incompatible contigs. Please see https://www.broadinstitute.org/gatk/guide/article?id=63 for more information. Error details: The contig order in variant and reference is not the same; to fix this please see: (https://www.broadinstitute.org/gatk/guide/article?id=1328), which describes reordering contigs in BAM and VCF files..
##### ERROR variant contigs = [I, II, III, IV, MtDNA, V, X]
##### ERROR reference contigs = [I, II, III, IV, V, X, MtDNA]
But I have sorted the VCF files using Picard, so I don't know what else to do.
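To narrow this down, the contig order declared by the sequence dictionary can be compared with the order declared in each sorted VCF header. The following is only a minimal sketch: `dict_contigs` and `vcf_contigs` are hypothetical helper names, not Picard or GATK commands, and the VCF helper assumes that `ID=` is the first attribute of each `##contig` header line.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Picard or GATK) that list contig names
# in the order each file declares them.

# Print the SN (sequence name) field of every @SQ line in a .dict file.
# Assumes the usual tab-separated SAM header layout.
dict_contigs() {
    awk -F '\t' '/^@SQ/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1"
}

# Print the ID of every ##contig line in a VCF header.
# Assumes ID= is the first attribute, e.g. ##contig=<ID=I,length=...>
vcf_contigs() {
    sed -n 's/^##contig=<ID=\([^,>]*\).*/\1/p' "$1"
}
```

Running `dict_contigs c_elegans.PRJNA13758.WS263.genomic.dict` and `vcf_contigs strain1sorted.vcf`, then comparing the two lists (for example with `diff`), would show whether the header of the sorted VCF still lists the contigs in a different order from the dictionary.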
Your help is appreciated.