GATK VariantAnnotator error

GenomeHacker (Boston; Posts: 5; Member)

I am getting a strange error when running the VariantAnnotator - does anyone know what this is about?

I am running it as
java -jar GenomeAnalysisTK.jar -T VariantAnnotator -R ucsc.hg19.fasta --variant result.vcf -comp:COSMIC CosmicMutants.vcf -resource CosmicMutants.vcf -E resource.ID -alwaysAppendDbsnpId -o annotated.vcf

and the error is

WARN 17:22:19,963 AbstractVCFCodec - Allele detected with length 1133370 exceeding max size 1048576 at approximately line 78827, likely resulting in degraded VCF processing performance
WARN 17:22:20,037 AbstractVCFCodec - Allele detected with length 1133370 exceeding max size 1048576 at approximately line 78828, likely resulting in degraded VCF processing performance
WARN 17:22:20,111 AbstractVCFCodec - Allele detected with length 1133370 exceeding max size 1048576 at approximately line 78827, likely resulting in degraded VCF processing performance
....

My vcf file is not that long...but I have looked in CosmicMutants.vcf, and it seems normal to me.

Answers

  • Geraldine_VdAuwera (Posts: 7,566; Administrator, GATK Developer)

    Hi there,

    That's just a warning about the length of some of the variants in your file (which are way bigger than what we normally handle) and it's saying that processing this file might go slower than normal because of the extra work. It shouldn't actually harm your run in any way.

    Geraldine Van der Auwera, PhD

  • GenomeHacker (Boston; Posts: 5; Member)

    Thanks, but I don't have any long variant in my file!

  • GenomeHacker (Boston; Posts: 5; Member)

    Plus the run takes a long time, maybe because of this?

  • ebanks (Broad Institute; Posts: 684; Member, Administrator, GATK Developer, Broadie, Moderator, DSDE Member, GP Member)

    If you don't have long variants in your file then it must be malformed somehow. The GATK seems to think that the sizes are enormous...

    Eric Banks, PhD -- Senior Group Leader, MPG Analysis, Broad Institute of Harvard and MIT

  • GenomeHacker (Boston; Posts: 5; Member)

    OK..will look into it more...it actually would be the cosmic file that is malformed because my vcf file does not have line '78827'. Could not find anything from a casual inspection...but will look more.
    Thanks much!
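
For anyone hitting the same warning: a casual inspection is unlikely to find the offending record, so it may be easier to scan the file programmatically. The following is a minimal Python sketch, not part of the GATK. It assumes a tab-delimited VCF with REF in column 4 and ALT in column 5, and it uses the 1048576 threshold quoted in the warning. The function names `open_vcf` and `find_suspect_lines` are made up for illustration. It reports records whose alleles exceed the threshold, and also records with fewer than 8 columns, which is one way a file can be malformed.

```python
import gzip
import sys

# Threshold copied from the AbstractVCFCodec warning text in this thread.
MAX_ALLELE_LENGTH = 1048576


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_suspect_lines(lines, max_len=MAX_ALLELE_LENGTH):
    """Yield (line_number, reason) for records that look too long or malformed."""
    for line_number, line in enumerate(lines, start=1):
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\r\n").split("\t")
        # A VCF record needs at least the 8 fixed columns.
        if len(fields) < 8:
            yield line_number, "only %d tab-separated columns (expected at least 8)" % len(fields)
            continue
        ref, alt = fields[3], fields[4]
        # ALT may hold several comma-separated alleles; check each one.
        for allele in [ref] + alt.split(","):
            if len(allele) > max_len:
                yield line_number, "allele of length %d" % len(allele)
                break


if __name__ == "__main__" and len(sys.argv) > 1:
    with open_vcf(sys.argv[1]) as handle:
        for line_number, reason in find_suspect_lines(handle):
            print("line %d: %s" % (line_number, reason))
```

Saved under a name of your choosing, it can be run on each input in turn, for example on CosmicMutants.vcf and then on result.vcf. The warning only gives an approximate line number, so the reported lines may not match 78827 exactly.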