The current GATK version is 3.7-0
Examples: Monday, today, last week, Mar 26, 3/26/04

#### Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

You can opt in to receive email notifications, for example when your questions get answered or when there are new announcements, by following the instructions given here.

#### ☞ Did you remember to?

1. Search using the upper-right search box, e.g. using the error message.
3. Include tool and Java versions.
4. Tell us whether you are following GATK Best Practices.
5. Include relevant details, e.g. platform, DNA- or RNA-Seq, WES (+capture kit) or WGS (PCR-free or PCR+), paired- or single-end, read length, expected average coverage, somatic data, etc.
6. For tool errors, include the error stacktrace as well as the exact command.
7. For format issues, include the result of running ValidateSamFile for BAMs or ValidateVariants for VCFs.
8. For weird results, include an illustrative example, e.g. attach IGV screenshots according to Article#5484.
9. For a seeming variant that is uncalled, include results of following Article#1235.

#### ☞ Formatting tip!

Wrap blocks of code, error messages and BAM/VCF snippets--especially content with hashes (#)--with lines with three backticks ( ` ) each to make a code block as demonstrated here.

GATK 3.7 is here! Be sure to read the Version Highlights and optionally the full Release Notes.

# Human b37 Indels/dbsnp annotation versions for IndelRealigner & BaseRecalibrator

Posts: 4
edited July 2012

Dear GATK Team,

I have recently downloaded the GATK Bundle to get the human reference genome and its associated annotations.

After the mapping step on my lane BAM files, I am planning on using IndelRealigner and BaseRecalibrator as it is explained in the "Best Practices v4".

I am always confused about which annotation file I should use for my analysis.

For the Indel realignment, in the command line arguments of RealignerTargetCreator, one have to set the '--known' switch to indicate known indel sites.

--known:indels,vcf Mills_and_1000G_gold_standard.indels.b37.sites.vcf
--known:dbsnp,vcf dbsnp_135.b37.vcf

But in the annotations folder, you can also find 'dbsnp_135.b37.excluding_sites_after_129.vcf' for dbsnp (version before 1000K genomes). Depending on which one I use the target intervals files are pretty different. So I am really wondering which one should be used in my case ? Or is there any other factor that could drive me to the better choice ?

I have a similar dilemna with base recalibration, "dbsnp_135.b37.vcf" or "dbsnp_135.b37.excluding_sites_after_129.vcf" in the '-knownSites' switch ?

Thanks a lot,
Best,

Anthony

Tagged: