MappingQualityRankSumTest and ReadPosRankSumTest

tommycarstensen · United Kingdom · Posts: 256 · Member

I read the documentation for MappingQualityRankSumTest and ReadPosRankSumTest:
http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_annotator_MappingQualityRankSumTest.html
http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_annotator_ReadPosRankSumTest.html

Both pages read:
"The ... rank sum test can not be calculated for sites without a mixture of reads showing both the reference and alternate alleles."

I have quite a few sites for which MQRankSum and ReadPosRankSum are missing. How does VariantRecalibrator handle this missing information?

Answers

  • Geraldine_VdAuwera · Posts: 7,408 · Administrator, GATK Developer

    Hi Tommy,

    That's a good question, and I'm not sure of the answer. I expect it just skips that dimension for the variant in question, but I don't know how this affects the variant's overall ranking. Unfortunately, the one person who knows the model inside and out (Ryan Poplin, @rpoplin) is on vacation. I'll see if someone else knows, but it may be a while before I can get you an answer.

    Geraldine Van der Auwera, PhD

  • tommycarstensen · United Kingdom · Posts: 256 · Member

    Thanks @pdexheimer and @Geraldine_VdAuwera. I wonder what exactly "marginalizing over a dimension via sampling" means, but I'm happy to know that @rpoplin has taken care of it. I guess I'll have to look at the code if I want further details. I had thought the missing values might be imputed somehow, or simply set to the mean or median. Thanks.

  • Geraldine_VdAuwera · Posts: 7,408 · Administrator, GATK Developer

    Honestly I'm not sure exactly what it means either, but it sounds like a reassuringly technical version of "chill out, I got this" :)

    Geraldine Van der Auwera, PhD
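
For readers wondering what "marginalizing over a dimension via sampling" usually means in statistics, here is a minimal sketch. It is not taken from the GATK source and does not claim to reproduce what VariantRecalibrator does. It assumes a toy model with two standardized annotations that follow a bivariate normal with correlation `rho`; the function names and parameter values are made up for illustration. Instead of plugging in a single value (such as the mean) for the missing annotation, the joint density is averaged over many randomly drawn values of it, which approximates the density of the observed annotation alone.

```python
import math
import random


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def joint_pdf(x, y, rho):
    """Density of a standard bivariate normal with correlation rho (toy model)."""
    det = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def marginal_by_sampling(x, rho, n_samples=20000, seed=0):
    """Estimate p(x) = integral of p(x, y) dy when y is missing.

    Values of the missing annotation y are drawn from a standard normal
    proposal, and the joint density is averaged with importance weights
    1 / q(y) so that the average converges to the integral over y.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = rng.gauss(0.0, 1.0)                      # sampled stand-in for the missing value
        total += joint_pdf(x, y, rho) / normal_pdf(y)
    return total / n_samples


if __name__ == "__main__":
    x_observed, rho = 1.0, 0.5                       # hypothetical values
    estimate = marginal_by_sampling(x_observed, rho)
    exact = normal_pdf(x_observed)                   # known marginal for this toy model
    print(f"sampled marginal: {estimate:.4f}")
    print(f"exact marginal:   {exact:.4f}")
```

In this toy case the exact marginal is known, so the two printed numbers can be compared directly. For a real mixture model the integral generally has no such shortcut, which is one reason a sampling approach might be used.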
