The current GATK version is 3.6-0

VariantEval Structural Variation

GrantMarshall Member Posts: 6


I just finished running a fairly large number of WGS samples through HaplotypeCaller, and I've been using VariantEval to look at some summary stats on these samples. I've noticed that under '#:GATKTable:VariantSummary:1000 Genomes Phase I summary of variants table' there's a section on structural variations, and that I'm apparently getting about 3,300 in one of my samples. Here's the actual section of the table in question:

#:GATKTable:VariantSummary:1000 Genomes Phase I summary of variants table
VariantSummary  CompRod  EvalRod  JexlExpression  Novelty  nSamples  nProcessedLoci  nSNPs    TiTvRatio  SNPNoveltyRate  nSNPsPerSample  TiTvRatioPerSample  SNPDPPerSample  nIndels  IndelNoveltyRate  nIndelsPerSample  IndelDPPerSample  nSVs  SVNoveltyRate  nSVsPerSample
VariantSummary  dbsnp    vcf1     none            all             1      3095693981  3446166       2.08            1.34         3446166                2.08             0.0   962028             15.33            962028               0.0  3282          73.58           3282
VariantSummary  dbsnp    vcf1     none            known           1      3095693981  3399907       2.08            0.00         3399907                2.08             0.0   814506              0.00            814506               0.0   867           0.00            867
VariantSummary  dbsnp    vcf1     none            novel           1      3095693981    46259       1.71          100.00           46259                1.71             0.0   147522            100.00            147522               0.0  2415         100.00           2415

I didn't think that HaplotypeCaller even looked for structural variations, so I tried to find these structural variations in the VCF, hoping they were encoded as described here, but I couldn't find anything. Could someone tell me why VariantEval reports a number of structural variations when the VCF itself doesn't appear to contain any? Does VariantEval just interpret a sufficiently large indel as an SV? If so, I can understand why it would count some structural variations, since this sample has indels longer than 1 kbp.



Best Answer


  • GrantMarshall Member Posts: 6

    Thanks, that's exactly what I needed to know!

  • modi2020 Member Posts: 15

    Hi Geraldine,

    Since events of 50 bp or larger are counted as structural variations, I wonder if there is a tool that can sort such events into SV classes, e.g. inversions, translocations, etc.

    Do you know of such a tool?

    Thank you

    @Geraldine_VdAuwera said:
    Hi Grant,

    The convention we use is that events that are 50bp or larger are called SVs.

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,684

    The only tool I'm aware of for working with SVs is GenomeSTRiP, but I don't know whether it has a function for classifying them as you ask. There may be others, but I'm not up to date on what's going on in that space, to be honest.

    Geraldine Van der Auwera, PhD

  • modi2020 Member Posts: 15

    Thank you so much Geraldine! :-)
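
For anyone who wants to check the 50 bp convention from this thread against their own VCF, here is a rough Python sketch. It is not VariantEval's actual code: the function names are made up for illustration, measuring event size as the absolute difference between REF and ALT lengths is an assumption, and it counts ALT alleles rather than sites, so the totals may not match the nSVs column exactly. Alleles written in symbolic form (such as `<DEL>`) or breakend notation are tallied separately, since that is the encoding the original question was looking for.

```python
SV_MIN_LENGTH = 50  # threshold quoted in the thread; treated here as inclusive


def event_length(ref, alt):
    """Event size, assumed to be the absolute REF/ALT length difference."""
    return abs(len(ref) - len(alt))


def classify_allele(ref, alt, sv_min_length=SV_MIN_LENGTH):
    """Label one REF/ALT pair (hypothetical helper, not a GATK function)."""
    # Symbolic alleles (<DEL>, <INS>, ...) and breakend notation (brackets).
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "symbolic"
    # Missing ALT or spanning-deletion placeholder.
    if alt in (".", "*"):
        return "other"
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    if event_length(ref, alt) >= sv_min_length:
        return "SV"
    return "indel"


def count_classes(vcf_lines, sv_min_length=SV_MIN_LENGTH):
    """Count ALT alleles per label over an iterable of VCF text lines."""
    counts = {}
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        ref = fields[3]              # REF is the 4th VCF column
        alts = fields[4].split(",")  # ALT is the 5th, comma-separated
        for alt in alts:
            label = classify_allele(ref, alt, sv_min_length)
            counts[label] = counts.get(label, 0) + 1
    return counts
```

To try it on an uncompressed VCF, pass an open file handle, for example `count_classes(open("sample.vcf"))`, where `sample.vcf` is a placeholder path. A nonzero "SV" count alongside a zero "symbolic" count would be consistent with the explanation above: long indels being counted as SVs by size alone.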
