undefined variable MappingQualityRankSum

Hi all,

I am using GATK version 3.2-2 and called variants with HaplotypeCaller using the following command:

 java -jar GenomeAnalysisTK.jar -R ref.fa -T HaplotypeCaller -I input.vcf -L region.bed -stand_emit_conf 10 -stand_call_conf 30  
 --genotyping_mode DISCOVERY -o var.vcf

I then selected the variants with SelectVariants and filtered them with VariantFiltration, following the steps in the tutorial:
https://www.broadinstitute.org/gatk/guide/topic?name=tutorials . However, I encountered the following errors:

"undefined variable ReadPosRankSum" and "undefined variable MappingQualityRankSum". The same issue has been discussed in the forum, but I could not find a concrete solution. Could someone help?

Answers

  • Kurt, Member ✭✭✭

    It's not really an error per se. It just means that some of your variant sites only contain reads supporting the variant (homozygous non-reference). Those annotations (ReadPosRankSum and MappingQualityRankSum) are only computed at variant sites that have reads carrying both reference and non-reference alleles. It's annoying (you can add --logging_level ERROR to the VariantFiltration command to suppress the warnings), but there is nothing wrong with your data from that perspective.
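
To see how many sites are affected, one option is to count the records whose INFO column lacks each annotation. The following is a minimal sketch, assuming an uncompressed, tab-delimited VCF; the function name `count_missing_annotations` and the toy records are hypothetical and are not part of GATK.

```python
def count_missing_annotations(vcf_lines, keys=("MQRankSum", "ReadPosRankSum", "QD")):
    """Return (number of variant records, {key: records lacking that INFO key})."""
    missing = {key: 0 for key in keys}
    total = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        total += 1
        # INFO is the 8th column: semicolon-separated KEY=VALUE pairs or bare flags
        present = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        for key in keys:
            if key not in present:
                missing[key] += 1
    return total, missing


# Toy records (made up for illustration): the second site has no rank-sum annotations.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=20;QD=12.5;MQRankSum=0.3;ReadPosRankSum=-0.1",
    "chr1\t200\t.\tC\tT\t80\t.\tDP=15;QD=25.0",
]
print(count_missing_annotations(example))
```

For a real file, the same function could be fed an open file handle, for example `count_missing_annotations(open("snp.vcf"))`.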

  • mehar, Member ✭✭

    Dear Kurt,

    Thank you for your answer! I removed those two annotations and ran the command again to check for further warnings. It also warns "undefined variable QD". Can that be ignored as well?

  • Kurt, Member ✭✭✭

    I think you misunderstood. I wouldn't remove those two annotations from VariantFiltration altogether. They are useful when hard filtering sites where there are both alternate and reference reads (primarily heterozygous sites, but more than a few homozygous non-reference sites will have these annotations as well). All the warnings are telling you is that those two annotations are not present for those particular sites (because the annotation can't be calculated unless you have both alternate and reference reads at a site). As for QD, I'm not sure why it would not be present. The only thing I can think of is that there is no confidence in the site, which would have more to do with how you are generating your calls, I guess.

  • mehar, Member ✭✭

    Thanks for your considerate answer. I only removed them to check for further warnings from the other annotations; I did not intend to exclude them completely. I will use those two annotations in VariantFiltration.

    "How you are generating your calls"? I showed the HaplotypeCaller command I used to generate the calls in the initial post. Is that not what you mean? If not, I am happy to share further information.

  • tommycarstensen, United Kingdom, Member ✭✭✭

    I think I know the answer to this one. Can you post your command following HaplotypeCaller?

  • tommycarstensen, United Kingdom, Member ✭✭✭

    Instead of MappingQualityRankSum try MQRankSum. It seems to be a common problem. I ran into it myself some weeks ago.

  • mehar, Member ✭✭

    Hi Tommy,

    Here are the commands following HaplotypeCaller.

     java -jar GenomeAnalysisTK.jar -R ref.fa -T SelectVariants -selectType SNP --variant var.vcf -o snp.vcf
     java -jar GenomeAnalysisTK.jar -R ref.fa -T VariantFiltration --variant snp.vcf --filterExpression 'QD < 2.0' --filterName QD --filterExpression 'FS > 60.0' --filterName FS --filterExpression 'MQ < 40.0' --filterName MQ --filterExpression 'MQRankSum < -12.5' --filterName MQ --filterExpression 'ReadPosRankSum < -8.0' --filterName ReadPos --filterExpression 'DP < 10' --filterName DP -o filt.vcf
    

    As you can see, I have used "MQRankSum" but still get the warning. Any further thoughts?

  • Sheila, Broad Institute, Member, Broadie, Moderator admin

    @mehar

    Hi,

    Can you post your log output?

    Thanks,
    Sheila

  • rbagnall, Member

    Hi Mehar,

    I am looking over some posts about undefined variables. I think in your case, the --filterName QD should be --filterName 'QD'

    Same applies to all filter names (enclose in ' ' ).

    Hope that helps
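
A hedged aside on the quoting suggestion: in a POSIX-style shell, quotes around a single word with no spaces or special characters are stripped before the program receives its arguments, so `--filterName QD` and `--filterName 'QD'` should arrive identically. Quoting does matter for expressions such as 'QD < 2.0', which contain spaces and `<`. The sketch below uses Python's standard `shlex` module, which follows POSIX-like splitting rules, as an approximation of the shell; it says nothing about how GATK itself parses the arguments.

```python
import shlex

# The same VariantFiltration fragment, with and without quotes around the filter name.
unquoted = shlex.split("--filterExpression 'QD < 2.0' --filterName QD")
quoted = shlex.split("--filterExpression 'QD < 2.0' --filterName 'QD'")

print(unquoted)            # the argument list as a program would receive it
print(unquoted == quoted)  # whether the two forms produce the same arguments
```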

  • sirian, US, Member ✭✭

    I was following GATK's hard-filtering recommendations in the Best Practices, using "MappingQualityRankSum < -12.5" as one filter. I found that this expression does not match the "MQRankSum" annotation in the variants, so it didn't filter anything out. After I changed it to "MQRankSum < -12.5", it did the job.
    I spent a whole afternoon figuring this out. I wish GATK would change this parameter name in the Best Practices and the affected documentation.
    Or does GATK already state this somewhere and I just missed it?
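
One way to catch this kind of name mismatch before running the filter is to compare the variable names used in the filter expressions against the INFO IDs declared in the VCF header. This is a minimal sketch under stated assumptions: it only handles simple comparison expressions like the ones in this thread, the function names `declared_info_ids` and `undeclared_variables` are hypothetical, and any name that is not an INFO annotation will be reported as well.

```python
import re

INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")   # matches e.g. ##INFO=<ID=MQRankSum,...
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*")      # bare variable names in an expression


def declared_info_ids(header_lines):
    """Collect the INFO annotation IDs declared in VCF header lines."""
    ids = set()
    for line in header_lines:
        match = INFO_ID.match(line)
        if match:
            ids.add(match.group(1))
    return ids


def undeclared_variables(header_lines, expressions):
    """Map each expression to the names it uses that the header does not declare."""
    declared = declared_info_ids(header_lines)
    problems = {}
    for expression in expressions:
        unknown = [name for name in IDENTIFIER.findall(expression) if name not in declared]
        if unknown:
            problems[expression] = unknown
    return problems


# Toy header lines (descriptions made up for illustration).
header = [
    '##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="toy description">',
    '##INFO=<ID=QD,Number=1,Type=Float,Description="toy description">',
]
filters = ["MappingQualityRankSum < -12.5", "MQRankSum < -12.5", "QD < 2.0"]
print(undeclared_variables(header, filters))
```

Note that an annotation can be declared in the header and still be absent from individual records, so this check complements rather than replaces a per-site count.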
