
VariantEval Structural Variation

GrantMarshall


I just finished running a fairly large number of WGS samples through HaplotypeCaller, and I've been using VariantEval to look at summary statistics for them. Under '#:GATKTable:VariantSummary:1000 Genomes Phase I summary of variants table' there is a set of columns for structural variations, and it reports about 3,300 of them (nSVs = 3282) in one of my samples. Here is the relevant section of the table:

#:GATKTable:VariantSummary:1000 Genomes Phase I summary of variants table
VariantSummary CompRod EvalRod JexlExpression Novelty nSamples nProcessedLoci nSNPs TiTvRatio SNPNoveltyRate nSNPsPerSample TiTvRatioPerSample SNPDPPerSample nIndels IndelNoveltyRate nIndelsPerSample IndelDPPerSample nSVs SVNoveltyRate nSVsPerSample
VariantSummary dbsnp vcf1 none all 1 3095693981 3446166 2.08 1.34 3446166 2.08 0.0 962028 15.33 962028 0.0 3282 73.58 3282
VariantSummary dbsnp vcf1 none known 1 3095693981 3399907 2.08 0.00 3399907 2.08 0.0 814506 0.00 814506 0.0 867 0.00 867
VariantSummary dbsnp vcf1 none novel 1 3095693981 46259 1.71 100.00 46259 1.71 0.0 147522 100.00 147522 0.0 2415 100.00 2415

I didn't think HaplotypeCaller even looked for structural variations, so I searched the VCF for them, hoping they were encoded as described here, but I couldn't find anything. Could someone tell me why VariantEval reports structural variations when the VCF itself doesn't appear to contain any? Does VariantEval simply count a sufficiently large indel as an SV? If so, I can see why it would report some, since this sample contains indels longer than 1 kb.
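One way to test the large-indel hypothesis would be to count the records in the VCF whose ALT allele differs in length from REF by more than some cutoff, and compare that count with nSVs. Below is a minimal sketch in Python; the function name `count_large_indels` and the 50 bp default are assumptions made for illustration, not a confirmed VariantEval threshold, so the cutoff would need to be varied until the count matches (if it ever does).

```python
from typing import Iterable


def count_large_indels(vcf_lines: Iterable[str], min_length: int = 50) -> int:
    """Count VCF records with at least one explicit ALT allele whose length
    differs from REF by more than min_length bases.

    min_length=50 is a placeholder assumption, not a documented cutoff.
    """
    count = 0
    for line in vcf_lines:
        # Skip blank lines and header lines (## metadata and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a well-formed record
        ref = fields[3]
        for alt in fields[4].split(","):
            # Ignore symbolic alleles (<DEL>), breakends, and the
            # missing (.) or spanning-deletion (*) placeholders.
            if alt.startswith("<") or "[" in alt or "]" in alt or alt in (".", "*"):
                continue
            if abs(len(alt) - len(ref)) > min_length:
                count += 1
                break  # count each record at most once
    return count
```

For an uncompressed VCF this could be called as `count_large_indels(open("sample.vcf"))`, where `sample.vcf` is a placeholder file name. If the result for the sample above comes out close to 3282 at some cutoff, that would suggest the SV columns are simply long indels being reclassified.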


