
Realignment intervals distribution

I was just wondering what you guys thought of the length distribution of my realignment intervals.
This is 30Mb from a single diploid sample without prior indel position information. Approximately 60,000 events, i.e. one every fifty bases, seems like a lot.
How indicative of true indels is the output of RealignerTargetCreator and IndelRealigner? I guess I'll have to check against the UnifiedGenotyper (UG) VCF calls...
Across the genome, the distribution of 'all' events is uniform.
Does multi-sample realignment improve the accuracy or efficiency of the realignment process?

Best Answer


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Not indicative at all. You shouldn't be using the intervals from RealignerTargetCreator (RTC) to assess indel distribution in your data. These just indicate regions that are a little messy in the original alignments.

  • Blue (Member)

    Thanks for the response. I wasn't strictly intending to use the realignment intervals for indel/CNV detection; I was just wondering what they meant, and whether they could be useful for something. Would comparing the intervals between samples be informative?

    The main thing I was wondering was whether the distribution of realignment interval sizes looked reasonable to you. It's a Drosophila genome, so maybe it's different from the ones you're used to.
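
    For anyone who wants to tabulate an interval size distribution like the one discussed here, the following is a minimal sketch, not the method used in this thread. It assumes the target intervals are available as one interval per line in the form `contig:start-stop`, with a bare `contig:pos` for a single-base interval. The function names and the example contig names and coordinates are hypothetical, made up for illustration.

```python
from collections import Counter


def interval_length(line):
    """Length in bases of one interval written as 'contig:start-stop' or 'contig:pos'.

    Assumes inclusive 1-based coordinates, so 100-115 spans 16 bases.
    """
    # Take everything after the last colon, so contig names containing ':' still parse.
    _, _, span = line.strip().rpartition(":")
    if "-" in span:
        start, stop = span.split("-")
        return int(stop) - int(start) + 1
    return 1  # a bare position is a single-base interval


def length_distribution(lines):
    """Count how many intervals have each length, skipping blank lines."""
    return Counter(interval_length(line) for line in lines if line.strip())


# Hypothetical example intervals (not real data).
example = ["2L:100-115", "2L:400", "2R:2000-2015", "X:50-59"]
dist = length_distribution(example)
for length in sorted(dist):
    print(length, dist[length])
```

    To run this on a real file, pass the open file handle to `length_distribution` in place of the example list. Bear in mind the answer above: the resulting histogram describes where the original alignments are messy, not where true indels are.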
