GATK VariantAnnotator error

GenomeHacker (Boston), Member, Posts: 5

I am getting a strange error when running the VariantAnnotator - does anyone know what this is about?

I am running it as
java -jar GenomeAnalysisTK.jar -T VariantAnnotator -R ucsc.hg19.fasta --variant result.vcf -comp:COSMIC CosmicMutants.vcf -resource CosmicMutants.vcf -E resource.ID -alwaysAppendDbsnpId -o annotated.vcf

and the error is

WARN 17:22:19,963 AbstractVCFCodec - Allele detected with length 1133370 exceeding max size 1048576 at approximately line 78827, likely resulting in degraded VCF processing performance
WARN 17:22:20,037 AbstractVCFCodec - Allele detected with length 1133370 exceeding max size 1048576 at approximately line 78828, likely resulting in degraded VCF processing performance
WARN 17:22:20,111 AbstractVCFCodec - Allele detected with length 1133370 exceeding max size 1048576 at approximately line 78827, likely resulting in degraded VCF processing performance
....

My VCF file is not that long, and I have looked in CosmicMutants.vcf, which seems normal to me.

Answers

  • Geraldine_VdAuwera, Administrator, Dev, Posts: 10,287

    Hi there,

    That's just a warning about the length of some of the variants in your file (which are way bigger than what we normally handle) and it's saying that processing this file might go slower than normal because of the extra work. It shouldn't actually harm your run in any way.

    Geraldine Van der Auwera, PhD

  • GenomeHacker (Boston), Member, Posts: 5

    Thanks, but I don't have any long variant in my file!

  • GenomeHacker (Boston), Member, Posts: 5

    Plus the code takes a lot of time to run, maybe because of this?

  • ebanks (Broad Institute), Member, Administrator, Broadie, Moderator, Dev, Posts: 698

    If you don't have long variants in your file then it must be malformed somehow. The GATK seems to think that the sizes are enormous...

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

  • GenomeHacker (Boston), Member, Posts: 5

    OK..will look into it more...it actually would be the cosmic file that is malformed because my vcf file does not have line '78827'. Could not find anything from a casual inspection...but will look more.
    Thanks much!
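
For anyone hitting the same warning: a casual look at a large VCF can easily miss one bad record, so a small script may help. The following is a minimal sketch, not part of GATK. It assumes the warning refers to the length of a REF or ALT string, that the file is tab-delimited, and that the "approximately line N" count may or may not include header lines. The function name `find_suspect_lines` is made up for illustration. It reports records whose longest allele exceeds the 1048576 threshold quoted in the warning, plus records with too few columns, which could point to a malformed line.

```python
import gzip
import sys

MAX_ALLELE_LENGTH = 1048576  # threshold quoted in the warning message


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF for reading as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_suspect_lines(path, max_len=MAX_ALLELE_LENGTH):
    """Yield (line_number, reason) for oversized or truncated records.

    Line numbers are 1-based and count header lines too (an assumption
    about how the warning counts lines).
    """
    with open_vcf(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # A VCF record needs at least 8 fixed columns.
            if len(fields) < 8:
                yield line_number, "only %d tab-separated columns" % len(fields)
                continue
            ref, alt = fields[3], fields[4]
            # ALT may hold several comma-separated alleles.
            longest = max(len(allele) for allele in [ref] + alt.split(","))
            if longest > max_len:
                yield line_number, "allele of length %d" % longest


if __name__ == "__main__" and len(sys.argv) > 1:
    for line_number, reason in find_suspect_lines(sys.argv[1]):
        print("line %d: %s" % (line_number, reason))
```

Run as `python find_long_alleles.py CosmicMutants.vcf` (the script file name is arbitrary). If nothing is printed, the long allele may be produced some other way, for example by a line that a text editor shows differently from how the parser splits it.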
