Pre-processing for the GATK pipeline: is BWA necessary for SNP discovery in genomic data?

Hello,
I am working with DNA capture data (paired-end sequences) that I plan to run through the GATK pipeline for SNP and indel discovery in genomic data.
On the website, GATK suggests starting the pre-processing with bwa mem to map the data to a reference. I did that, but I could never find a correct way (one that works with my data) to extract only the reads that mapped uniquely to my reference, despite searching forums and other resources for days. This is critical for me, since I do not want to keep reads that map to several different places in the reference.
Would you have a suggestion for this matter?
Otherwise, would you recommend that I map my data with Bowtie2 instead? In your opinion, might I regret that later in the GATK analysis workflow?
Any help is more than welcome
Thanks!
Sheila Broad Institute admin
Answers
@mac
Hi,
Is there a reason you want to extract only the reads that map uniquely?
Most GATK tools filter out reads that are not mapped properly. You can also have a look at the Read Filters listed in the GATK tool documentation. You can use PrintReads with some of those filters to remove the non-uniquely mapped reads.
-Sheila
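
For readers who want to see what a "uniquely mapped" filter could look like on BWA-MEM output, here is a minimal Python sketch that works on plain SAM text. It is not a GATK tool and not the method suggested above. It assumes one possible definition of "unique": the record is mapped, is neither secondary nor supplementary, has a MAPQ at or above a chosen threshold, and carries no XA (alternative hits) or SA (chimeric alignment) tag. The function names and the default threshold are hypothetical; BWA-MEM itself does not flag reads as unique, so the criteria should be adapted to the data.

```python
# Sketch only: the "unique" criteria below are assumptions, not a GATK or BWA definition.
UNMAPPED = 0x4         # SAM flag bit: read is unmapped
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def is_unique(sam_line, min_mapq=1):
    """Return True if a SAM alignment line passes the assumed uniqueness criteria."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: bitwise FLAG
    mapq = int(fields[4])   # column 5: mapping quality
    if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
        return False
    if mapq < min_mapq:
        return False
    # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
    tags = {field[:2] for field in fields[11:]}
    return "XA" not in tags and "SA" not in tags


def filter_unique(lines, min_mapq=1):
    """Yield header lines unchanged and only the alignment lines that pass is_unique."""
    for line in lines:
        if line.startswith("@") or is_unique(line, min_mapq):
            yield line


if __name__ == "__main__":
    import sys
    # Example use: samtools view -h in.bam | python this_script.py > unique.sam
    sys.stdout.writelines(filter_unique(sys.stdin))
```

Note that this filters each record independently, so one mate of a pair can be dropped while the other is kept; paired-end data may need an extra step to handle such orphaned mates. If a MAPQ threshold alone is enough, `samtools view -q` applies one directly to a BAM file.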