Notice:
If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We appreciate your help!

Test-drive the GATK tools and Best Practices pipelines on Terra


Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.

VCF INFO annotation for non-variant sites

feilchenfeldtfeilchenfeldt ViennaMember
edited July 2017 in Ask the GATK team

Hello,
I am calling variants using HaplotypeCaller --> CombineGVCFs --> GenotypeGVCFs.
To determine the accessible genome, I emit all sites in the last step and I want to get several INFO annotations for all sites, such as DP, MQ, MQ0, .... Hence, I use VariantAnnotator with a regional all-individuals .bam file as input to get these annotations. However, this does not work as expected:

(1) Even after running this, the only annotations that non-variant sites get are: AN, DP. They do not get MQ, MQ0 etc. annotations. It should be possible to get these annotations for non-variant sites and I remember that it was possible with UnifiedGenotyper.

(2) Despite adding "--annotation RMSMappingQuality" and "--annotation MappingQualityZero", these annotations are not added, even not for variant sites.

Could you please let me know how to add annotations to non-variant sites using VariantAnnotator?

Here is my command line:


java-Xmx14000m \
-jar GenomeAnalysisTK-3.7-0/GenomeAnalysisTK.jar \
-R reference.fa \
-nt 8 \
-T VariantAnnotator \
--annotation RMSMappingQuality \
--annotation Coverage \
--annotation MappingQualityZero \
--annotation AlleleBalance \
--annotation SpanningDeletions \
--annotation InbreedingCoeff \
--annotation MappingQualityZero \
-I all_individuals_region_X.bam \
-V all_sites_region_X.vcf.gz \
--out all_sites_region_X_annotated.vcf.gz \

I tried both GATK 3.4-46 and 3.7.
Many thanks for your help.

Hannes

Post edited by feilchenfeldt on

Issue · Github
by shlee

Issue Number
2273
State
closed
Last Updated
Assignee
Array
Milestone
Array
Closed By
sooheelee

Answers

  • shleeshlee CambridgeMember, Broadie ✭✭✭✭✭

    Hi Hannes (@feilchenfeldt),

    I'm not familiar with VariantAnnotator and half the team is away at a workshop. So I'll let them know to check back with you.

    In the meanwhile, two things. First, perhaps you will find the annotation descriptions helpful:

    It appears that due to HC's upfront filtering of MQ0, the MappingQualityZero annotation always returns a value of 0.

    Second, to help those who will answer your question, can you please post some example records for a site, e.g. the before and after VCF records? Thanks.

  • feilchenfeldtfeilchenfeldt ViennaMember

    Hello shlee (@shlee),

    I am currently doing some more debugging and will get back to you with more information.

    Many thanks,
    Hannes

  • feilchenfeldtfeilchenfeldt ViennaMember

    Hello shlee,

    I found the original problem: The samples in the vcf were renamed with respect to the bam. I corrected this by changing the bam header with samtools reheader.

    VariantAnnotator now works as expected except for one problem:
    For all non-variant sites it outputs "MQ=Infinity" instead of the correct root mean square mapping quality.
    UnifiedGenotyper gives the correct MQ also for non-variant sites.

    chrom1 159 . G . . . AN=626;DP=8523;MQ=Infinity;MQ0=7514 GT:DP:RGQ 0/0:11:0
    chrom1 160 . C . . . AN=626;DP=8524;MQ=Infinity;MQ0=7511 GT:DP:RGQ 0/0:11:0
    chrom1 161 . C T 615.91 . <other_annotations...>;MQ=48.71;MQ0=7509;<other_annotations...>
    chrom1 162 . G . . . AN=626;DP=8535;MQ=Infinity;MQ0=7508 GT:DP:RGQ 0/0:11:0

    Is there anything I can do to make VariantAnnotator output RMSMappingQuality for non-variant sites, just as UnifiedGenotyper does?

    (I could probably pass a UnifiedGentyper all-sites VCF to VariantAnnotator to annotate the HaplotypeCaller VCF, but this is quite a waste of computational resources, because I have lots of data.)

    Thanks and best wishes,
    Hannes

  • shleeshlee CambridgeMember, Broadie ✭✭✭✭✭

    Hi Hannes (@feilchenfeldt),

    That the RMS Mapping Quality value is Infinity does not make sense. So we think this is a bug in VariantAnnotator. Would you mind sharing a snippet of your data in a bug report so we can recapitulate this result on our end? Instruction for bug reports are in Article#1894.

Sign In or Register to comment.