HaplotypeCaller rs annotation

Hi there,
I'm now running the new GATK 2.2-2 release, and I still see an issue with HaplotypeCaller that I had in the previous version I was using.
Although I pass the dbSNP ROD to the walker, the emitted VCF doesn't contain rs IDs in the ID field.
By contrast, UnifiedGenotyper annotates the variants with the appropriate IDs.

In my .scala code I wrote:

 class HaplotypeCallerArguments (t: Target) extends HaplotypeCaller with UNIVERSAL_GATK_ARGS {
   this.reference_sequence = qscript.referenceFile
   this.intervals = if (qscript.intervals == null) Nil else List(qscript.intervals)
   // Set the memory limit to 6 gigabytes on each command.
   this.memoryLimit = 6
   this.input_file :+= qscript.bamFile
   this.D = qscript.dbSNP_b37
 }

and that is correctly reflected when Queue launches the job:

 INFO  16:07:30,655 FunctionEdge - Starting:  'java'  '-Xmx6144m'  '-XX:+UseParallelOldGC'  '-XX:ParallelGCThreads=4'  '-XX:GCTimeLimit=50'  '-XX:GCHeapFreeLimit=10'  '-Djava.io.tmpdir=/SAN/biomed/analysis/tmp'  '-cp' '/share/apps/genomics/Queue-2.2-2-gf44cc4e/Queue.jar'  
 'org.broadinstitute.sting.gatk.CommandLineGATK'  '-T' 'HaplotypeCaller'  '-I' '/SAN/biomed/analysis/recal.list'  '-L' '/SAN/biomed/analysis/.queue/scatterGather/HaplotypeCaller-sg/temp_016_of_300/scatter.intervals'  '-R' '/share/apps/genomics/reference/human_g1k_v37.fasta'  
 '-l' 'INFO'  '-o' '/SAN/biomed/analysis/.queue/scatterGather/HaplotypeCaller-sg/temp_016_of_300/comparisonHC.raw.vcf' '-D' '/share/apps/genomics/reference/gatkresources_hg19_1.5/ftp.broadinstitute.org/bundle/1.5/b37/dbsnp_135.b37.vcf'  

However, the ID column of my VCF is still empty:

grep -v \# HC.raw.vcf | cut -f 1,2,3,4,5 | more
1   762273  .   G   A
1   865738  .   A   G
1   866319  .   G   A
1   866511  .   C   CCCCT
1   871042  .   C   CA
1   874734  .   C   T

Am I doing something wrong?
It would be quite time-consuming to run VariantAnnotator if it isn't necessary, since, as I understand it, the covariates used by VQSR are already emitted by the caller.

thanks,
Francesco

Best Answer

  • rpoplin (Dev)
    Accepted Answer

    Unfortunately the HaplotypeCaller can't annotate the rsIDs yet. We'll work on getting this added for the next release. Thanks for letting us know that you need this functionality.

    Cheers,

Answers

  • treddy (Member)

    Has this been updated? I've been calling HaplotypeCaller with the --dbsnp argument, but none of the SNPs in my output have IDs, even though they overlap and have the same genotypes as ones in my dbSNP file. I'm using GATK 2.5.2.

  • Carneiro (Charlestown, MA; Member)

    Not yet. It is a known limitation of the tool; it is on our to-do list, but we haven't gotten to it.
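
For anyone who needs rs IDs before the caller supports them, here is a rough post-processing sketch in the spirit of the grep/cut check in the question. It is not a GATK feature: the function name `annotate_rsids` and the file names are hypothetical, and it only copies an ID when CHROM, POS, REF and ALT match a dbSNP record exactly as text. Multi-allelic dbSNP records and indels represented differently in the two files will not match, and the whole dbSNP file is held in memory, so it is probably only practical on a subset of dbSNP.

```shell
# Hypothetical helper (not part of GATK): copy rsIDs from a dbSNP VCF into
# records of a call-set VCF whose ID column is ".".
# Usage: annotate_rsids dbsnp.vcf calls.vcf > calls.rsid.vcf
annotate_rsids() {
  awk 'BEGIN { FS = OFS = "\t" }
       # First file, dbSNP: remember the ID for each CHROM/POS/REF/ALT key.
       NR == FNR { if ($0 !~ /^#/) id[$1 FS $2 FS $4 FS $5] = $3; next }
       # Second file, the calls: pass header lines through unchanged.
       /^#/ { print; next }
       # Fill the ID only when it is missing and the exact key is known.
       { key = $1 FS $2 FS $4 FS $5
         if ($3 == "." && (key in id)) $3 = id[key]
         print }' "$1" "$2"
}
```

Records that already carry an ID, or that have no exact match, are written out unchanged, so the output can be compared against the input with the same grep/cut command used in the question.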