The current GATK version is 3.7-0
Examples: Monday, today, last week, Mar 26, 3/26/04

#### Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

GATK 3.7 is here! Be sure to read the Version Highlights and optionally the full Release Notes.
Register now for the upcoming GATK Best Practices workshop, Feb 20-22 in Leuven, Belgium. Open to all comers! More info and signup at http://bit.ly/2i4mGxz

# VariantEval Structural Variation

Member Posts: 6

Hi,

I just finished running a fairly large number of WGS samples through HaplotypeCaller and I've been using VariantEval to look at some summary stats on these samples. I've noticed that under '#:GATKTable:VariantSummary:1000 Genomes Phase I summary of variants table' there's a section on structural variations and that apparently I'm getting about 3500 in one of my samples. Here's the actual section of the table in question:

#:GATKTable:20:3:%s:%s:%s:%s:%s:%d:%d:%d:%.2f:%s:%d:%.2f:%.1f:%d:%s:%d:%.1f:%d:%s:%d:;
#:GATKTable:VariantSummary:1000 Genomes Phase I summary of variants table
VariantSummary  CompRod  EvalRod  JexlExpression  Novelty  nSamples  nProcessedLoci  nSNPs    TiTvRatio  SNPNoveltyRate  nSNPsPerSample  TiTvRatioPerSample  SNPDPPerSample  nIndels  IndelNoveltyRate  nIndelsPerSample  IndelDPPerSample  nSVs  SVNoveltyRate  nSVsPerSample
VariantSummary  dbsnp    vcf1     none            all             1      3095693981  3446166       2.08            1.34         3446166                2.08             0.0   962028             15.33            962028               0.0  3282          73.58           3282
VariantSummary  dbsnp    vcf1     none            known           1      3095693981  3399907       2.08            0.00         3399907                2.08             0.0   814506              0.00            814506               0.0   867           0.00            867
VariantSummary  dbsnp    vcf1     none            novel           1      3095693981    46259       1.71          100.00           46259                1.71             0.0   147522            100.00            147522               0.0  2415         100.00           2415


I didn't think that HaplotypeCaller even looked for structural variations, so I tried to find these structural variations in the VCF, hoping they were encoded as described here and I couldn't find anything. Could someone tell me why VariantEval is showing a number of structural variations but the actual VCF isn't finding any? Does VariantEval just interpret a sufficiently large indel as a SV? If so, I can understand why it may call some structural variations considering there are indels longer than 1k bp in the indels of the sample.

Thanks,

Grant

Tagged:

• Member Posts: 6

Thanks, that's exactly what I needed to know!

• Member Posts: 15

Hi Geraldine,

Since anything above 50 bp is a structural variation, I wonder if there is a tool that can sort such events into various SV classes i.e inversions, translocations, etc.

Do you know of such a tool ?

Thank you

@Geraldine_VdAuwera said:
Hi Grant,

The convention we use is that events that are 50bp or larger are called SVs.