Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
CombineVariants incorrectly(?) complains about badly formed variant when merging multiallelic site
Hi GATK team,
I am attempting to combine a HaplotypeCaller generated VCF with some indels called using pindel using the following arguments (GATK v3.3-0-g37228af):
-R /data/shared/ref/b37/human_g1k_v37.fasta -T CombineVariants --variant:GATK var.HiSeqDecember.raw.vcf --variant:pindel pindel_combined.vcf -o var.HiSeqDecember.pindel.raw.vcf -genotypeMergeOptions PRIORITIZE -priority GATK,pindel
However I get the following error:
ERROR MESSAGE: Badly formed variant context at location 1:157718231; getEnd() was 157718235 but this VariantContext contains an END key with value 157718231
The variants in question are (from GATK):
1 157718231 . CAAAT C,CAAATAAAT 2533.56 PASS AC=3,13;AF=0.125,0.542;AN=24;BaseQRankSum=1.762;ClippingRankSum=-0.327;DP=126;FS=0.000;HOMLEN=39;HOMSEQ=AAATAAATAAATAAATAAATAAATAAATAAATAAATAAA;InbreedingCoeff=-0.1260;MLEAC=3,13;MLEAF=0.125,0.542;MQ=70.00;MQ0=0;MQRankSum=0.920;QD=22.22;ReadPosRankSum=-0.893;SOR=0.976;SVLEN=4;SVTYPE=INS;set=Intersection GT:DP:GQ 0/0:10:30 0/2:9:18 2/2:6:18 2/2:5:15 0/1:10:99 0/2:14:99 2/2:8:24 2/2:6:18 2/2:7:21 0/2:17:99 0/1:5:75 0/1:6:27
and (from pindel):
1 157718231 . C CAAAT . PASS AC=2;AF=0.143;AN=14;END=157718231;HOMLEN=39;HOMSEQ=AAATAAATAAATAAATAAATAAATAAATAAATAAATAAA;SVLEN=4;SVTYPE=INS;set=variant3-variant4-variant6-variant7-variant8-variant9-variant10 GT:AD ./. ./. 0/0:0,7 0/0:0,6 ./. 0/0:0,9 0/0:0,8 0/0:0,7 0/0:0,8 1/1:0,12 ./. ./.
It is worth noting that the pindel VCF here was merged together from several pindel-generated VCFs using CombineVariants without any complaint from the GATK. It looks to me that the END key is correct for the pindel variant (a simple insertion), but the GATK is confused due to the mixed deletion/insertion variant generated by the HaplotypeCaller at the same position (without an END key).
I can rerun the command after stripping all END tags from the pindel VCF and the command completes successfully, so this is not a showstopper for me but I assume this is a bug(?) and if so, it would be great if there were a fix.