VariantAnnotator on tiny VCF file

drmjc, Garvan Institute of Medical Research

Hi,
I have a tiny, 5000-line VCF file that I want to add dbSNP annotations to.
I'm surprised to see VariantAnnotator iterating over the millions of records in the dbSNP file rather than the 5000 variants in the input VCF file. This will take 42 minutes on dbSNP 137...

Am I misunderstanding how this tool works, or just using it wrongly?

thanks, Mark

java -Xmx2g -jar /share/ClusterShare/software/contrib/gi/gatk/2.5/dist/GenomeAnalysisTK.jar \
    -T VariantAnnotator \
    -R /share/ClusterShare/biodata/contrib/gi/gatk-resource-bundle/2.5/hg19/ucsc.hg19.fasta \
    --variant PG0000864-BLD_PGx_cleaned.vcf \
    --dbsnp /share/ClusterShare/biodata/contrib/gi/gatk-resource-bundle/2.5/hg19/dbsnp_137.hg19.vcf \
    --out PG0000864-BLD_PGx,GATK.vcf \
    --validation_strictness SILENT

Answers

  • drmjc, Garvan Institute of Medical Research

    A colleague pointed out the -L flag, which really sped things up. Perhaps I could rephrase the question: if you specify --variant, should -L be implied?

  • ebanks, Broad Institute (Administrator, GATK Dev)

    Not for the Variant Annotator.

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • drmjc, Garvan Institute of Medical Research

    Thanks for the quick response Eric, but could you elaborate? If you only care about the variants within --variant my.vcf, then why look outside of these regions? I'm just trying to get my head around this.
    cheers, Mark

  • drmjc, Garvan Institute of Medical Research

    Thanks Geraldine, that makes perfect sense.
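
For anyone landing on this thread later, here is a rough sketch of the -L workaround mentioned above. It assumes that -L accepts a file of intervals written as chrom:start-stop, one per line. The helper name vcf_to_intervals and the .intervals file name are made up for illustration and are not part of GATK. It emits a single-base interval at each record's POS, so longer indels may need wider intervals.

```shell
# Hypothetical helper (not part of GATK): print one chrom:pos-pos interval
# for every record in a VCF file.
vcf_to_intervals() {
    # Skip header lines (those starting with '#').
    # Columns 1 and 2 of a VCF record are CHROM and POS.
    awk -F'\t' '!/^#/ && NF >= 2 { print $1 ":" $2 "-" $2 }' "$1"
}

# Intended use (file names from the original post; the .intervals name is an assumption):
#   vcf_to_intervals PG0000864-BLD_PGx_cleaned.vcf > PG0000864-BLD_PGx_cleaned.intervals
# then add this to the VariantAnnotator command line from the question:
#   -L PG0000864-BLD_PGx_cleaned.intervals
```

With -L set this way, the traversal should be restricted to the sites in the small VCF rather than walking the whole dbSNP file. Some GATK versions may also accept the VCF itself as the argument to -L, which would make the helper unnecessary; check the documentation for your version.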
