Seeing deletion spanning reads in LocusWalkers

Geraldine_VdAuweraGeraldine_VdAuwera Posts: 9,938Administrator, Dev admin
edited October 2012 in Developer Zone

1. Introduction

The LocusTraversal now supports passing walkers reads that have deletions spanning the current locus. This is useful in many situation where you want to calculate coverage, call variants and need to avoid calling variants where there are a lot of deletions, etc.

Currently, the system by default will not pass you deletion-spanning reads. In order to see them, you need to overload the function:

 * (conceptual static) method that states whether you want to see reads piling up at a locus
 * that contain a deletion at the locus.
 * ref:   ATCTGA
 * read1: ATCTGA
 * read2: AT--GA
 * Normally, the locus iterator only returns a list of read1 at this locus at position 3, but
 * if this function returns true, then the system will return (read1, read2) with offsets
 * of (3, -1).  The -1 offset indicates a deletion in the read.
 * @return false if you don't want to see deletions, or true if you do
public boolean includeReadsWithDeletionAtLoci() { return true; }

in your walker. Now you will start seeing deletion-spanning reads in your walker. These reads are flagged with offsets of -1, so that you can:

    for ( int i = 0; i < context.getReads().size(); i++ ) {
        SAMRecord read = context.getReads().get(i);
        int offset = context.getOffsets().get(i);

       if ( offset == -1 ) 

There are also two convenience functions in AlignmentContext to extract subsets of the reads with and without spanning deletions:

 * Returns only the reads in ac that do not contain spanning deletions of this locus
 * @param ac
 * @return
public static AlignmentContext withoutSpanningDeletions( AlignmentContext ac );

 * Returns only the reads in ac that do contain spanning deletions of this locus
 * @param ac
 * @return
public static AlignmentContext withSpanningDeletions( AlignmentContext ac );
Post edited by Geraldine_VdAuwera on

Geraldine Van der Auwera, PhD



  • siantornosiantorno Cambridge, UKPosts: 7Member

    Hi Geraldine

    I have a question about the meaning of SpanningDeletions. In my understanding entries in VCF files will have a score of Dels > 0.0 if some of the reads have a deletion in them compared to the reference. These are not excluded from variant calling when using UnifiedGenotyper, but the resulting variant will be annotated with the appropriate Dels score. I'm trying to decide whether to throw out variants that have a Dels score larger than 0. Will that artificially bias the resulting variant set against indels? Please let me know if I am misunderstanding the meaning of reads containing spanning deletions.

  • SheilaSheila Broad InstitutePosts: 3,195Member, Broadie, Moderator, Dev admin



    The problem with throwing out all dels > 0 is that you could end up missing valid deletions. You should experiment with different settings and see what works best for you.

    An even better suggestion is to switch from Unified Genotyper to Haplotype Caller. Haplotype Caller is smarter about handling these types of cases.


  • siantornosiantorno Cambridge, UKPosts: 7Member

    It seems like the Dels value is simply not reported for Indels. True SNPs ideally will have dels = 0.0, so if I throw out the SNPs with any dels value > 0 after doing indel realignment I should be okay. Unfortunately I need to use UnifiedGenotyper as I am working with a polyploid genome.

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 9,938Administrator, Dev admin

    @siantorno The HaplotypeCaller is now able to work with polyploid genomes just as well as diploids; you just need to specify the -ploidy argument.

    Geraldine Van der Auwera, PhD

Sign In or Register to comment.