MappingQualityRankSumTest and ReadPosRankSumTest

tommycarstensen (United Kingdom; Posts: 372; Member)

I read the documentation for MappingQualityRankSumTest and ReadPosRankSumTest. Both pages read:
"The ... rank sum test can not be calculated for sites without a mixture of reads showing both the reference and alternate alleles."

I have quite a few sites for which MQRankSum and ReadPosRankSum are missing. How does VariantRecalibrator handle this missing information?

Best Answer


  • Geraldine_VdAuwera (Posts: 8,454; Administrator, GATK Dev admin)

    Hi Tommy,

    That's a good question, and I'm not sure of the answer. I expect it just skips that dimension for the variant in question, but I don't know how this affects the variant's overall ranking. Unfortunately, the one person who knows the model inside and out (Ryan Poplin, @rpoplin) is on vacation. I'll see if someone else knows, but it may be a while before I can get you an answer.

    Geraldine Van der Auwera, PhD

  • tommycarstensen (United Kingdom; Posts: 372; Member)

    Thanks @pdexheimer and @Geraldine_VdAuwera. I wonder what exactly "marginalizing over a dimension via sampling" means, but I'm happy to know that @rpoplin has attended to it. I guess I'll have to look at the code if I want further details. I thought the missing values might have been imputed somehow, or alternatively just set equal to the mean or median. Thanks.

  • Geraldine_VdAuwera (Posts: 8,454; Administrator, GATK Dev admin)

    Honestly I'm not sure exactly what it means either, but it sounds like a reassuringly technical version of "chill out, I got this" :)


    Geraldine Van der Auwera, PhD
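
For readers wondering what "marginalizing over a dimension via sampling" could look like in general, here is a minimal sketch. It is not taken from the VariantRecalibrator source, and it may differ from what GATK actually does. It assumes a toy model: two standardized annotations following a bivariate normal with a made-up correlation `rho`, where the second annotation is missing. The function names are hypothetical. The missing value is drawn repeatedly from a standard normal, and the joint density is averaged with importance weights, which gives a Monte Carlo estimate of the density of the observed annotation alone. For comparison, the last function shows the simpler "set it to the mean" alternative mentioned in the thread.

```python
import math
import random


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bivariate_normal_pdf(x, y, rho):
    """Density of a standardized bivariate normal with correlation rho."""
    one_minus_r2 = 1.0 - rho * rho
    q = (x * x - 2.0 * rho * x * y + y * y) / one_minus_r2
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(one_minus_r2))


def marginal_by_sampling(x, rho, n_draws=20000, seed=0):
    """Estimate p(x) when the second annotation is missing.

    Draws the missing value y from a standard normal proposal and
    averages p(x, y) / q(y), an importance-sampling estimate of the
    integral of p(x, y) over y.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        y = rng.gauss(0.0, 1.0)  # stand-in for the missing annotation
        total += bivariate_normal_pdf(x, y, rho) / normal_pdf(y)
    return total / n_draws


def plug_in_mean(x, rho):
    """Alternative: replace the missing value with its mean (0 here)."""
    return bivariate_normal_pdf(x, 0.0, rho)


if __name__ == "__main__":
    x_observed, rho = 1.2, 0.6  # illustrative values only
    print("sampled marginal:", marginal_by_sampling(x_observed, rho))
    print("exact marginal:  ", normal_pdf(x_observed))
    print("mean plug-in:    ", plug_in_mean(x_observed, rho))
```

In this toy case the exact marginal is known (a standard normal density), so the sampled estimate can be checked against it. The mean plug-in evaluates the joint density at a single point instead, so it is not on the same scale as the marginal and is not directly comparable.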
