Using IntervalListTools to create scatter intervals for HaplotypeCaller
I was looking at the GATK4 $5 WDL file and see that it uses IntervalListTools to create the interval list for scattering over HaplotypeCaller. The call to IntervalListTools sets the parameter BREAK_BANDS_AT_MULTIPLES_OF to a non-zero value (100,000 in the hg38 JSON file). The ScatterIntervalsByNs call, which generates the interval list used as input to this step, is very careful to split only at Ns, but this call may then split in the middle of actual sequence. Won't we potentially be introducing edge effects by doing this? If we split every 100,000 bases, won't we have issues calling an indel that spans one of these break points?
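To make the concern concrete, here is a minimal Python sketch of the kind of splitting I mean. It is not Picard's code: the function names are made up for illustration, and the boundary convention (1-based inclusive coordinates, with a new piece starting at each exact multiple of the band size) is my assumption about how the breaks fall. It only shows where the cut points land and which pieces a variant spanning a cut point would touch. It says nothing about how HaplotypeCaller actually handles such a variant.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end) on one contig


def break_at_multiples(start: int, end: int, band: int) -> List[Interval]:
    """Split [start, end] so that no piece crosses a multiple of `band`.

    Hypothetical helper. Assumes band k covers [k*band, (k+1)*band - 1];
    a non-positive band means "do not break", mirroring the idea that
    a value of 0 disables the option.
    """
    if band <= 0:
        return [(start, end)]
    pieces: List[Interval] = []
    cur = start
    while cur <= end:
        # Last position of the band that contains `cur`.
        band_end = (cur // band + 1) * band - 1
        piece_end = min(band_end, end)
        pieces.append((cur, piece_end))
        cur = piece_end + 1
    return pieces


def pieces_touched(pieces: List[Interval], ev_start: int, ev_end: int) -> List[Interval]:
    """Return the pieces that overlap an event spanning [ev_start, ev_end]."""
    return [(s, e) for (s, e) in pieces if s <= ev_end and ev_start <= e]


# One interval with no Ns in it, broken every 100,000 bases.
pieces = break_at_multiples(50_000, 250_000, 100_000)
print(pieces)

# A 10 bp deletion straddling the first cut point falls in two pieces.
print(pieces_touched(pieces, 99_995, 100_004))
```

With these inputs the interval 50,000-250,000 becomes three pieces, and the deletion at 99,995-100,004 overlaps the first two of them. That is exactly the situation I am asking about.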