The frontline support team will be unavailable to answer questions until May27th 2019. We will be back soon after. Thank you for your patience and we apologize for any inconvenience!
Human b37 Indels/dbsnp annotation versions for IndelRealigner & BaseRecalibrator
Dear GATK Team,
I have recently downloaded the GATK Bundle to get the human reference genome and its associated annotations.
After the mapping step on my lane BAM files, I am planning on using IndelRealigner and BaseRecalibrator as it is explained in the "Best Practices v4".
I am always confused about which annotation file I should use for my analysis.
For the Indel realignment, in the command line arguments of RealignerTargetCreator, one have to set the '--known' switch to indicate known indel sites.
But in the annotations folder, you can also find 'dbsnp_135.b37.excluding_sites_after_129.vcf' for dbsnp (version before 1000K genomes). Depending on which one I use the target intervals files are pretty different. So I am really wondering which one should be used in my case ? Or is there any other factor that could drive me to the better choice ?
I have a similar dilemna with base recalibration, "dbsnp_135.b37.vcf" or "dbsnp_135.b37.excluding_sites_after_129.vcf" in the '-knownSites' switch ?
Thanks a lot,