VCF INFO annotation for non-variant sites

feilchenfeldtfeilchenfeldt ViennaMember
edited July 2017 in Ask the GATK team

I am calling variants using HaplotypeCaller --> CombineGVCFs --> GenotypeGVCFs.
To determine the accessible genome, I emit all sites in the last step and I want to get several INFO annotations for all sites, such as DP, MQ, MQ0, .... Hence, I use VariantAnnotator with a regional all-individuals .bam file as input to get these annotations. However, this does not work as expected:

(1) Even after running this, the only annotations that non-variant sites get are: AN, DP. They do not get MQ, MQ0 etc. annotations. It should be possible to get these annotations for non-variant sites and I remember that it was possible with UnifiedGenotyper.

(2) Despite adding "--annotation RMSMappingQuality" and "--annotation MappingQualityZero", these annotations are not added, even not for variant sites.

Could you please let me know how to add annotations to non-variant sites using VariantAnnotator?

Here is my command line:

<br /> java-Xmx14000m \<br /> -jar GenomeAnalysisTK-3.7-0/GenomeAnalysisTK.jar \<br /> -R reference.fa \<br /> -nt 8 \<br /> -T VariantAnnotator \<br /> --annotation RMSMappingQuality \<br /> --annotation Coverage \<br /> --annotation MappingQualityZero \<br /> --annotation AlleleBalance \<br /> --annotation SpanningDeletions \<br /> --annotation InbreedingCoeff \<br /> --annotation MappingQualityZero \<br /> -I all_individuals_region_X.bam \<br /> -V all_sites_region_X.vcf.gz \<br /> --out all_sites_region_X_annotated.vcf.gz \<br />

I tried both GATK 3.4-46 and 3.7.
Many thanks for your help.


Post edited by feilchenfeldt on

Issue · Github
by shlee

Issue Number
Last Updated
Closed By


  • shleeshlee CambridgeMember, Broadie ✭✭✭✭✭

    Hi Hannes (@feilchenfeldt),

    I'm not familiar with VariantAnnotator and half the team is away at a workshop. So I'll let them know to check back with you.

    In the meanwhile, two things. First, perhaps you will find the annotation descriptions helpful:

    It appears that due to HC's upfront filtering of MQ0, the MappingQualityZero annotation always returns a value of 0.

    Second, to help those who will answer your question, can you please post some example records for a site, e.g. the before and after VCF records? Thanks.

  • feilchenfeldtfeilchenfeldt ViennaMember

    Hello shlee (@shlee),

    I am currently doing some more debugging and will get back to you with more information.

    Many thanks,

  • feilchenfeldtfeilchenfeldt ViennaMember

    Hello shlee,

    I found the original problem: The samples in the vcf were renamed with respect to the bam. I corrected this by changing the bam header with samtools reheader.

    VariantAnnotator now works as expected except for one problem:
    For all non-variant sites it outputs "MQ=Infinity" instead of the correct root mean square mapping quality.
    UnifiedGenotyper gives the correct MQ also for non-variant sites.

    chrom1 159 . G . . . AN=626;DP=8523;MQ=Infinity;MQ0=7514 GT:DP:RGQ 0/0:11:0
    chrom1 160 . C . . . AN=626;DP=8524;MQ=Infinity;MQ0=7511 GT:DP:RGQ 0/0:11:0
    chrom1 161 . C T 615.91 . <other_annotations...>;MQ=48.71;MQ0=7509;<other_annotations...>
    chrom1 162 . G . . . AN=626;DP=8535;MQ=Infinity;MQ0=7508 GT:DP:RGQ 0/0:11:0

    Is there anything I can do to make VariantAnnotator output RMSMappingQuality for non-variant sites, just as UnifiedGenotyper does?

    (I could probably pass a UnifiedGentyper all-sites VCF to VariantAnnotator to annotate the HaplotypeCaller VCF, but this is quite a waste of computational resources, because I have lots of data.)

    Thanks and best wishes,

  • shleeshlee CambridgeMember, Broadie ✭✭✭✭✭

    Hi Hannes (@feilchenfeldt),

    That the RMS Mapping Quality value is Infinity does not make sense. So we think this is a bug in VariantAnnotator. Would you mind sharing a snippet of your data in a bug report so we can recapitulate this result on our end? Instruction for bug reports are in Article#1894.

Sign In or Register to comment.