CatVariants does not combine header

rcholic Denver Member

Below is the command:

java -cp $CLASSPATH/GenomeAnalysisTK.jar \
-R GATK_ref/hg19.fasta \
-V ../GATK/VQSR/parallel_batch/raw.snps_indels-1.vcf \
-V ../GATK/VQSR/parallel_batch/raw.snps_indels-2.vcf \
-V ../GATK/VQSR/parallel_batch/raw.snps_indels-3.vcf \
-out ../GATK/VQSR/parallel_batch/combined_raw.snps_indels.vcf \
-log ../GATK/VQSR/parallel_batch/log/combined.log \

After this, the combined_raw.snps_indels.vcf file contains only the header from raw.snps_indels-1.vcf. What might be wrong?

Best Answer


  • erikfas Member

    A related issue I had: when I tried to concatenate two VCFs from the same sample, one containing the (filtered) variants and one containing non-variants, I got an error saying that the FS filter field wasn't in the header. This was because I had set the non-filtered VCF as the first input, so the script took the header from that file, which of course didn't have an FS filter field (because no VariantFiltration had been run on it). It was easily solved by reversing the input order, so that the script took the most complete header available. Just an FYI if somebody else runs into a similar problem!

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    FYI, the behavior of CatVariants regarding headers is documented in the tool doc.

    More importantly, this tool is not appropriate for the use you're making of it. As noted above, the tool expects that the input VCFs represent non-overlapping intervals. The way you're using it, that expectation is not satisfied and the output VCF will most probably not be sorted correctly. You should be using CombineVariants for this.
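
For anyone landing here later, here is a rough shell sketch of the two points made in the replies above. The first helper lists header IDs (FILTER/INFO/FORMAT) that a later input declares but the first input does not, which is the situation erikfas describes. The second only prints a GATK 3-style CombineVariants command for review; it does not run anything. The function names, the jar location, and the file names are assumptions for illustration. Further options (for example -genotypeMergeOptions) are left out and should be checked against the CombineVariants tool doc for your GATK version.

```shell
#!/bin/sh
# Sketch only: helper names and paths are hypothetical, not part of GATK.

# Print the IDs declared in ##FILTER/##INFO/##FORMAT header lines, sorted.
header_ids() {
    grep -E '^##(FILTER|INFO|FORMAT)=<ID=' "$1" \
        | sed -E 's/^##([A-Z]+)=<ID=([^,>]+).*/\1:\2/' \
        | sort -u
}

# List IDs declared in the second VCF but missing from the first one.
# Non-empty output suggests the first input has the less complete header.
missing_from_first() {
    first_ids=$(mktemp)
    other_ids=$(mktemp)
    header_ids "$1" > "$first_ids"
    header_ids "$2" > "$other_ids"
    comm -13 "$first_ids" "$other_ids"
    rm -f "$first_ids" "$other_ids"
}

# Print (do not execute) a CombineVariants command line.
# Arguments: output VCF, reference FASTA, then one or more input VCFs.
print_combine_cmd() {
    out=$1
    ref=$2
    shift 2
    printf 'java -jar GenomeAnalysisTK.jar -T CombineVariants -R %s' "$ref"
    for vcf in "$@"; do
        printf ' --variant %s' "$vcf"
    done
    printf ' -o %s\n' "$out"
}

# Example calls (file names taken from the question, adjust as needed):
#   missing_from_first raw.snps_indels-1.vcf raw.snps_indels-2.vcf
#   print_combine_cmd combined_raw.snps_indels.vcf GATK_ref/hg19.fasta \
#       raw.snps_indels-1.vcf raw.snps_indels-2.vcf raw.snps_indels-3.vcf
```

If missing_from_first prints something like FILTER:FS, the first input lacks a header line that a later input relies on, so reordering the inputs or switching tools is worth considering.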
