The current GATK version is 3.7-0
Examples: Monday, today, last week, Mar 26, 3/26/04

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Get notifications!

You can opt in to receive email notifications, for example when your questions get answered or when there are new announcements, by following the instructions given here.

Formatting tip!

Wrap blocks of code, error messages and BAM/VCF snippets--especially content with hashes (#)--with lines with three backticks ( ``` ) each to make a code block as demonstrated here.

Jump to another community
Picard 2.9.4 is now available. Download and read release notes here.
GATK 3.7 is here! Be sure to read the Version Highlights and optionally the full Release Notes.

Seeing deletion spanning reads in LocusWalkers

Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie
edited October 2012 in Developer Zone

1. Introduction

The LocusTraversal now supports passing walkers reads that have deletions spanning the current locus. This is useful in many situation where you want to calculate coverage, call variants and need to avoid calling variants where there are a lot of deletions, etc.

Currently, the system by default will not pass you deletion-spanning reads. In order to see them, you need to overload the function:

 * (conceptual static) method that states whether you want to see reads piling up at a locus
 * that contain a deletion at the locus.
 * ref:   ATCTGA
 * read1: ATCTGA
 * read2: AT--GA
 * Normally, the locus iterator only returns a list of read1 at this locus at position 3, but
 * if this function returns true, then the system will return (read1, read2) with offsets
 * of (3, -1).  The -1 offset indicates a deletion in the read.
 * @return false if you don't want to see deletions, or true if you do
public boolean includeReadsWithDeletionAtLoci() { return true; }

in your walker. Now you will start seeing deletion-spanning reads in your walker. These reads are flagged with offsets of -1, so that you can:

    for ( int i = 0; i < context.getReads().size(); i++ ) {
        SAMRecord read = context.getReads().get(i);
        int offset = context.getOffsets().get(i);

       if ( offset == -1 ) 

There are also two convenience functions in AlignmentContext to extract subsets of the reads with and without spanning deletions:

 * Returns only the reads in ac that do not contain spanning deletions of this locus
 * @param ac
 * @return
public static AlignmentContext withoutSpanningDeletions( AlignmentContext ac );

 * Returns only the reads in ac that do contain spanning deletions of this locus
 * @param ac
 * @return
public static AlignmentContext withSpanningDeletions( AlignmentContext ac );
Post edited by Geraldine_VdAuwera on


  • siantornosiantorno Cambridge, UKMember

    Hi Geraldine

    I have a question about the meaning of SpanningDeletions. In my understanding entries in VCF files will have a score of Dels > 0.0 if some of the reads have a deletion in them compared to the reference. These are not excluded from variant calling when using UnifiedGenotyper, but the resulting variant will be annotated with the appropriate Dels score. I'm trying to decide whether to throw out variants that have a Dels score larger than 0. Will that artificially bias the resulting variant set against indels? Please let me know if I am misunderstanding the meaning of reads containing spanning deletions.

  • SheilaSheila Broad InstituteMember, Broadie, Moderator



    The problem with throwing out all dels > 0 is that you could end up missing valid deletions. You should experiment with different settings and see what works best for you.

    An even better suggestion is to switch from Unified Genotyper to Haplotype Caller. Haplotype Caller is smarter about handling these types of cases.


  • siantornosiantorno Cambridge, UKMember

    It seems like the Dels value is simply not reported for Indels. True SNPs ideally will have dels = 0.0, so if I throw out the SNPs with any dels value > 0 after doing indel realignment I should be okay. Unfortunately I need to use UnifiedGenotyper as I am working with a polyploid genome.

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie

    @siantorno The HaplotypeCaller is now able to work with polyploid genomes just as well as diploids; you just need to specify the -ploidy argument.

Sign In or Register to comment.