MQRankSum and ReadPosRankSum for SNPs in a haploid organism?

rorycraig (Edinburgh, Member)

Hi,

Apologies if this has been addressed previously. I'm working with genomic resequencing data for a haploid organism, and I have created a VCF file using GenotypeGVCFs from 33 gVCFs created using HaplotypeCaller (using best practices). I did not set the ploidy option when using GenotypeGVCFs as directed. My final aim is to filter for a subset of high-quality SNPs for a downstream analysis.

As I understand it, the parameters MQRankSum and ReadPosRankSum can only be calculated if there is an individual with a heterozygous genotype (ref and alt alleles) at that position. Around 15% of my SNPs have been scored for these parameters. Can anyone explain what this means for a haploid? Are these sites good candidates to filter outright?

An example SNP is below:

chromosome_1 3316 . G A 492.42 . AC=3;AF=0.136;AN=22;BaseQRankSum=0.731;ClippingRankSum=1.70;DP=1488;FS=0.000;MLEAC=3;MLEAF=0.136;MQ=31.59;MQRankSum=-5.660e-01;QD=16.98;ReadPosRankSum=0.731;SOR=1.308 GT:AD:DP:GQ:PL 0:86,0:86:99:0,1800 0:89,0:89:99:0,1800 0:147,0:147:99:0,1800 0:51,5:56:99:0,1800 1:1,4:5:80:80,0 0:271,0:271:99:0,1800 1:0,8:8:99:247,0 0:21,1:22:99:0,814 .:0,0 1:5,11:16:99:211,0 0:140,72:212:99:0,1800 0:242,13:255:99:0,1800 0:252,0:252:99:0,1800 .:0,0 .:0,0 .:0,0 0:1,0:1:44:0,44 .:0,0 .:0,0 0:3,0:3:99:0,112 0:1,0:1:39:0,39 .:0,0 .:0,0 .:0,0 .:0,0 0:17,0:17:99:0,360 0:1,0:1:37:0,37 0:4,0:4:99:0,135 0:10,0:10:99:0,270 0:3,0:3:99:0,119 0:3,0:3:99:0,111 0:3,0:3:99:0,119 .:0,0

Cheers,
Rory
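
One way to look into why a haploid site might still carry rank sum values is to list the samples in the example record whose AD field shows reads for both alleles. The following is only a sketch: the helper name `mixed_support_samples` is hypothetical, the sample strings are the first twelve from the record above, and nothing in this thread confirms that these reads are what GATK feeds into the rank sum tests for haploid calls.

```python
# First twelve per-sample fields copied from the example record (FORMAT GT:AD:DP:GQ:PL).
SAMPLES = [
    "0:86,0:86:99:0,1800", "0:89,0:89:99:0,1800", "0:147,0:147:99:0,1800",
    "0:51,5:56:99:0,1800", "1:1,4:5:80:80,0", "0:271,0:271:99:0,1800",
    "1:0,8:8:99:247,0", "0:21,1:22:99:0,814", ".:0,0",
    "1:5,11:16:99:211,0", "0:140,72:212:99:0,1800", "0:242,13:255:99:0,1800",
]


def mixed_support_samples(sample_fields, fmt="GT:AD:DP:GQ:PL"):
    """Return (index, GT, ref_reads, alt_reads) for samples with reads on both alleles.

    Hypothetical helper for illustration; indices are 0-based positions in sample_fields.
    """
    keys = fmt.split(":")
    hits = []
    for i, field in enumerate(sample_fields):
        values = dict(zip(keys, field.split(":")))
        parts = values.get("AD", "").split(",")
        # Skip samples with a missing or non-numeric AD entry.
        if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
            continue
        ref_reads, alt_reads = int(parts[0]), int(parts[1])
        if ref_reads > 0 and alt_reads > 0:
            hits.append((i, values["GT"], ref_reads, alt_reads))
    return hits


for idx, gt, ref_reads, alt_reads in mixed_support_samples(SAMPLES):
    print(f"sample {idx}: GT={gt} ref={ref_reads} alt={alt_reads}")
```

Several haploid calls in this record (for example `0:140,72`) have substantial read support for both alleles, which may be worth checking when deciding how to treat sites that have rank sum annotations.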

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @rorycraig
    Hi Rory,

    Can you confirm that the RankSum annotations do not appear in the final VCF if you do set ploidy in GenotypeGVCFs? I don't think you should use the RankSum annotations for haploid samples, as the annotation is meant for diploid samples.

    -Sheila

  • rorycraig (Edinburgh, Member)

    Hi Sheila, sorry for the slow reply. I can confirm that these annotations do still appear if ploidy is set to 1 in the GenotypeGVCFs command. Do you have any insight on whether it's best to ignore these annotations, or actively filter them? Thanks!

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @rorycraig
    Hi,

    We don't have any recommendations for using or not using rank sum annotations in haploid (or non-diploid) samples. I think the best thing to do is try both ways (filtering with and without the rank sum annotations) and see which works best for your dataset.

    -Sheila
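
The "try both ways" suggestion could be sketched roughly as below: apply a set of hard-filter cutoffs to a record's INFO field once with the rank sum annotations and once without, then compare which filters trip. The cutoffs are assumptions taken from GATK's generic hard-filtering suggestions for SNPs, which were tuned for diploid data and may not suit a haploid dataset. The function names and the second INFO string are hypothetical and exist only for illustration.

```python
import operator

# Assumed cutoffs (generic GATK SNP hard-filter suggestions, tuned for diploids).
# A record fails a filter when "<annotation> <op> <cutoff>" is true.
THRESHOLDS = {
    "QD": ("<", 2.0),
    "FS": (">", 60.0),
    "MQ": ("<", 40.0),
    "SOR": (">", 3.0),
    "MQRankSum": ("<", -12.5),
    "ReadPosRankSum": ("<", -8.0),
}
OPS = {"<": operator.lt, ">": operator.gt}


def parse_info(info):
    """Parse a VCF INFO string into {key: float}, ignoring flags and non-numeric values."""
    values = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        if sep:
            try:
                values[key] = float(value)
            except ValueError:
                pass
    return values


def failed_filters(info, use_ranksum=True):
    """List the annotations that fail their cutoff; absent annotations never fail."""
    values = parse_info(info)
    failed = []
    for name, (op, cutoff) in THRESHOLDS.items():
        if not use_ranksum and name.endswith("RankSum"):
            continue
        if name in values and OPS[op](values[name], cutoff):
            failed.append(name)
    return failed


# INFO field of the example SNP posted in the question.
EXAMPLE_INFO = (
    "AC=3;AF=0.136;AN=22;BaseQRankSum=0.731;ClippingRankSum=1.70;DP=1488;"
    "FS=0.000;MLEAC=3;MLEAF=0.136;MQ=31.59;MQRankSum=-5.660e-01;QD=16.98;"
    "ReadPosRankSum=0.731;SOR=1.308"
)
# Hypothetical INFO field, made up to show a case where the two runs differ.
HYPOTHETICAL_INFO = "QD=20.0;FS=1.0;MQ=60.0;SOR=1.0;MQRankSum=0.1;ReadPosRankSum=-9.1"

for label, info in [("example", EXAMPLE_INFO), ("hypothetical", HYPOTHETICAL_INFO)]:
    print(label, failed_filters(info, use_ranksum=True), failed_filters(info, use_ranksum=False))
```

Under these assumed cutoffs the example SNP fails only on MQ, so including or excluding the rank sum annotations makes no difference for that record. Running the same comparison across the whole VCF would show how many sites the rank sum filters actually remove.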
