Problem using the -L and -XL options in PrintReads
Dear GATK team,
I am using the -L and -XL options together in my PrintReads command to print only reads in my regions of interest and to exclude reads in blacklisted regions.
Below is an example command:
java -Xmx4g -jar GenomeAnalysisTK.jar -T PrintReads -R GRCh37-lite.fa -L 1:1-61913548 -XL all_Enhancers.intervals -XL Blacklist_merged.intervals -I input.bam -nct 8 -BQSR recal_data.wg.table -o output.bam
However, I've noticed that output.bam contains reads that fall within intervals listed in my Blacklist file. For example,
HWI-ST1133:217:D1D4WACXX:8:1206:15633:92032 163 1 19022 10 101M = 19168 247 TCCCCAGACATCCCTGTGGCTGGCTCCTGATGCCCGAGGCCCAAGTGTCTGATGCTTTAAGGCACATCACCCCACTCATGCTTTTCCATGTTCTTTGGCCC 249<ADCDEEEDAADEDABDCEBDDEBEBDFBEBB4B:@CDBCA?FFHFE>DGGDGBBCCDCEFDEE?GEFFGGCHB=EAHIDJJ?M=GAHEGHKD<=@D# X0:i:7 X1:i:0 MC:Z:101M BD:Z:IIMJKNOLLHLLLILMHGMMMMMMMJLLMMKNNNJKLMNNNJNJLNHINKNNLNNNJCKKLNNNHIMMLHNKKOINLMNOOOKDDMOPOPKNONNGRNNNI MD:Z:100G0 RG:Z:0.2 XG:i:0 BI:Z:LLOLLQQMOKOONJNOJINNPPOOPKNOPPOPPOKMONOPPLQMOQKLPNQQPQQQMFOMOOPPKMQQPLPMMRLQMPQRRRNGHQQSRSOPRQPJTPPPL AM:i:0 NM:i:1 SM:i:0 XM:i:1 XO:i:0 MQ:i:18 XT:A:R
My Blacklist file includes the interval 1:18906-19049, so I expected the read above to be absent from output.bam.
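To make the overlap concrete, here is a minimal Python sketch (not part of GATK; the helper names `reference_end` and `overlap` are made up for illustration) that derives the reference span of the read above from its POS (19022) and CIGAR (101M) fields and intersects it with the blacklist interval. It assumes 1-based, inclusive coordinates, as used in SAM records and in the interval notation above.

```python
import re

def reference_end(pos, cigar):
    """Return the 1-based inclusive end coordinate of an alignment."""
    # Only the CIGAR operations M, D, N, = and X consume reference bases.
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    ref_len = sum(int(n) for n, op in ops if op in "MDN=X")
    return pos + ref_len - 1

def overlap(start_a, end_a, start_b, end_b):
    """Return the shared 1-based inclusive range of two intervals, or None."""
    start, end = max(start_a, start_b), min(end_a, end_b)
    return (start, end) if start <= end else None

# Values taken from the SAM record and the Blacklist interval quoted above.
read_start, cigar = 19022, "101M"
read_end = reference_end(read_start, cigar)
blacklist = (18906, 19049)
shared = overlap(read_start, read_end, *blacklist)

print(f"read spans 1:{read_start}-{read_end}")
print(f"overlap with blacklist interval: {shared}")
```

By this calculation the read covers 1:19022-19122, so it overlaps the blacklist interval on 19022-19049 but also extends beyond it, up to 19122. Whether a read that only partly overlaps an excluded interval should be dropped is part of what I am asking.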
Do you know where I've made a mistake, and is there a better way to exclude reads in Blacklist regions?
Thanks a lot for your help!