Seeing deletion spanning reads in LocusWalkers

Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,462Administrator, GATK Developer admin
edited October 2012 in Developer Zone

1. Introduction

The LocusTraversal now supports passing walkers reads that have deletions spanning the current locus. This is useful in many situation where you want to calculate coverage, call variants and need to avoid calling variants where there are a lot of deletions, etc.

Currently, the system by default will not pass you deletion-spanning reads. In order to see them, you need to overload the function:

/**
 * (conceptual static) method that states whether you want to see reads piling up at a locus
 * that contain a deletion at the locus.
 *
 * ref:   ATCTGA
 * read1: ATCTGA
 * read2: AT--GA
 *
 * Normally, the locus iterator only returns a list of read1 at this locus at position 3, but
 * if this function returns true, then the system will return (read1, read2) with offsets
 * of (3, -1).  The -1 offset indicates a deletion in the read.
 *
 * @return false if you don't want to see deletions, or true if you do
 */
public boolean includeReadsWithDeletionAtLoci() { return true; }

in your walker. Now you will start seeing deletion-spanning reads in your walker. These reads are flagged with offsets of -1, so that you can:

    for ( int i = 0; i < context.getReads().size(); i++ ) {
        SAMRecord read = context.getReads().get(i);
        int offset = context.getOffsets().get(i);

       if ( offset == -1 ) 
               nDeletionReads++;
        else 
               nCleanReads++;
    }

There are also two convenience functions in AlignmentContext to extract subsets of the reads with and without spanning deletions:

/**
 * Returns only the reads in ac that do not contain spanning deletions of this locus
 * 
 * @param ac
 * @return
 */
public static AlignmentContext withoutSpanningDeletions( AlignmentContext ac );

/**
 * Returns only the reads in ac that do contain spanning deletions of this locus
 * 
 * @param ac
 * @return
 */
public static AlignmentContext withSpanningDeletions( AlignmentContext ac );
Post edited by Geraldine_VdAuwera on

Geraldine Van der Auwera, PhD

Comments

  • siantornosiantorno Cambridge, UKPosts: 1Member

    Hi Geraldine

    I have a question about the meaning of SpanningDeletions. In my understanding entries in VCF files will have a score of Dels > 0.0 if some of the reads have a deletion in them compared to the reference. These are not excluded from variant calling when using UnifiedGenotyper, but the resulting variant will be annotated with the appropriate Dels score. I'm trying to decide whether to throw out variants that have a Dels score larger than 0. Will that artificially bias the resulting variant set against indels? Please let me know if I am misunderstanding the meaning of reads containing spanning deletions.

  • SheilaSheila Broad InstitutePosts: 561Member, GATK Developer, Broadie, Moderator admin

    @siantorno

    Hi,

    The problem with throwing out all dels > 0 is that you could end up missing valid deletions. You should experiment with different settings and see what works best for you.

    An even better suggestion is to switch from Unified Genotyper to Haplotype Caller. Haplotype Caller is smarter about handling these types of cases.

    -Sheila

Sign In or Register to comment.