
Getting insertion counts

brannona (Member; Posts: 4)


For matched tumor and normal pairs, we can easily get insertion and deletion counts from the output of SomaticIndelDetector in GATK. However, when we run multiple samples from the same patient, calls are sometimes made in one sample but not in another, so we might not have counts for every indel event in every sample. We can get the deletion counts from DepthOfCoverage in GATK, but retrieving insertion counts is trickier.

Do you have a suggestion for how to solve this problem in an automated (i.e., non-IGV) fashion?

Additionally, as DepthOfCoverage is being retired, what do you recommend that we use for getting SNP and deletion counts?

Thank you

Best Answer


  • Geraldine_VdAuwera (Administrator, GSA Member; Posts: 5,285)

    Hi there,

    DepthOfCoverage is actually getting a reprieve -- we won't retire it until DiagnoseTargets is able to completely take over the DoC functionality.

    Unfortunately we don't have experience with cancer / somatic mutations, so we can't really advise you on this topic. Perhaps someone in the user community can give you some pointers.

    Geraldine Van der Auwera, PhD

  • brannona (Member; Posts: 4)

    I'm glad to hear that DoC will remain active for a while.
    My other question does not require any knowledge of cancer or somatic mutations, so I apologize for not being concise. Reworded: Is there a GATK tool that I can use to get counts of specific indels? (Something like BaseCounts or DoC for indels.) Thank you.

  • Geraldine_VdAuwera (Administrator, GSA Member; Posts: 5,285)

    Do you mean counting in how many of the patient's samples a specific indel occurs? If so I don't think we have a specific tool to do that, but you could just call indels on the interval where the indel occurs, then use the variant manipulation tools to find out the counts. Does that make sense?

    Geraldine Van der Auwera, PhD

  • brannona (Member; Posts: 4)

    Hi Geraldine, I mean counting how many of the reads in a BAM file or sample contain a specific indel. The issue is that the indel may be present in that sample but fall below the threshold of what UnifiedGenotyper would call. For example, if only 2 of 634 reads carry the indel, UnifiedGenotyper would likely not call it, but we still need to retrieve that data.
    Thank you.

  • brannona (Member; Posts: 4)

    Thank you! I'll need to play around with the read filters a bit, but I think this will work.
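
For readers with the same need (counting reads that carry a specific indel even when it falls below a caller's threshold), here is a minimal sketch in plain Python. It is not a GATK tool and not necessarily the approach discussed in this thread. It assumes the alignments are available as SAM text lines (for example, the output of `samtools view` over the region of interest), and the function names `read_has_indel` and `count_indel_support` are hypothetical. The indel is identified VCF-style, by the reference base just before the event, and must match the CIGAR exactly.

```python
import re

# CIGAR operations that advance along the reference (per the SAM specification).
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def read_has_indel(pos, cigar, indel_pos, indel_type, indel_len):
    """Return True if an alignment carries the given indel.

    pos        -- 1-based leftmost mapped position (SAM POS field)
    indel_pos  -- 1-based reference base just before the event (VCF-style)
    indel_type -- "I" for insertion, "D" for deletion
    indel_len  -- length of the inserted or deleted sequence
    """
    ref = pos  # reference coordinate of the next base to be aligned
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == indel_type and length == indel_len and ref - 1 == indel_pos:
            return True
        if op in REF_CONSUMING:
            ref += length
    return False


def count_indel_support(sam_lines, chrom, indel_pos, indel_type, indel_len):
    """Return (supporting, spanning) read counts for one indel."""
    supporting = spanning = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or rname != chrom or cigar == "*":
            continue  # unmapped, wrong contig, or no alignment information
        ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar)
                      if op in REF_CONSUMING)
        end = pos + ref_len - 1
        if not (pos <= indel_pos < end):
            continue  # read does not cover both bases flanking the site
        spanning += 1
        if read_has_indel(pos, cigar, indel_pos, indel_type, indel_len):
            supporting += 1
    return supporting, spanning


# Made-up records for illustration only (SEQ and QUAL given as "*").
example = ["\t".join(f) for f in [
    ("r1", "0", "chr1", "100", "60", "10M2I10M", "*", "0", "0", "*", "*"),
    ("r2", "0", "chr1", "100", "60", "20M", "*", "0", "0", "*", "*"),
    ("r3", "0", "chr1", "105", "60", "5M2I15M", "*", "0", "0", "*", "*"),
    ("r4", "0", "chr1", "100", "60", "10M3I10M", "*", "0", "0", "*", "*"),
    ("r5", "0", "chr1", "200", "60", "20M", "*", "0", "0", "*", "*"),
    ("r6", "4", "*", "0", "0", "*", "*", "0", "0", "*", "*"),
]]
print(count_indel_support(example, "chr1", 109, "I", 2))  # (2, 4)
```

The sketch applies no mapping-quality or duplicate filtering and does not compare the inserted bases themselves. Because aligners can place the same indel at different positions within repeats, exact position matching may undercount unless the reads are left-aligned first.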
