Possible issue with VCF files made by GenotypeGVCFs in 3.2-2?

estif74 (Saint Paul, MN, USA), Member

Hi there,

I'm trying to follow the Best Practices and also use the latest version of GATK/Queue. I've created 2 GVCF files, and I've run GenotypeGVCFs on them to create a VCF file. This is the command:

java -jar -Xmx8g /home/sgfriede/gene_apps/gatk-3.2-2/GenomeAnalysisTK.jar -T GenotypeGVCFs \
-R /gpfs_share/sgfriede/ref_data/canFam3.fa \
--variant /gpfs_share/sgfriede/poodles_addisons/cocoa_wilt/gvcf/cocoa_wilt.sorted.dedup.haplotypecaller.g.vcf \
--variant /gpfs_share/sgfriede/poodles_addisons/ellie_traska/gvcf/ellie_traska.sorted.dedup.haplotypecaller.g.vcf \
-o /gpfs_share/sgfriede/poodles_addisons/affected_poodles/20140722/vcf/affected_poodles.vcf

From this, I get the output file with no problem.

However, when I try to run VQSR on this VCF file (affected_poodles.vcf), I get the following error message:

ERROR MESSAGE: Line 5082: there aren't enough columns for line 0,3:3:9:71,9,0 0/0:2,0:2:6:0,6,61 (we expected 9 tokens, and saw 2 ), for input source: /gpfs_share/sgfriede/poodles_addisons/affected_poodles/20140722/vcf/affected_poodles.vcf
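
In case it helps anyone look at the offending line directly, here is a minimal sketch of a column-count check in Python. It assumes the file is a plain-text, tab-delimited VCF with a `#CHROM` header line; the function name `find_short_lines` is just something made up for illustration, not part of GATK.

```python
def find_short_lines(lines):
    """Yield (line_number, token_count, expected_count) for every data line
    that has fewer tab-separated columns than the #CHROM header line.

    Assumption: `lines` is an iterable of text lines from a tab-delimited VCF.
    """
    expected = None
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#CHROM"):
            expected = len(line.split("\t"))  # column count set by the header
            continue
        if expected is None:
            continue  # no header seen yet, nothing to compare against
        count = len(line.split("\t"))
        if count < expected:
            yield number, count, expected


# Hypothetical usage on the file from this post:
# with open("affected_poodles.vcf") as handle:
#     for number, count, expected in find_short_lines(handle):
#         print(f"line {number}: {count} columns, expected {expected}")
```

A record that has been split across two lines would show up here as a short line, which may be easier to inspect than the parser error alone.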

I think (but I could be wrong!!) I've narrowed this down to something with the output format from GenotypeGVCFs in GATK 3.2-2, because when I run VQSR on the same file but created using GenotypeGVCFs with GATK 3.1-1, VQSR runs just fine (and I'm calling the VQSR walker using 3.2-2).

It is also possible I'm doing something wrong here, but I thought I'd see if any of the experts might have an idea.

Appreciate your help as always!


