How is the HW proportion test calculated (chi-squared, exact, etc.)?
bredeson
Member ✭✭
How is the HW-proportion goodness-of-fit test calculated? Is the GATK using a chi-square test (corrected or not) or an exact method? Does the calculation correctly take multiple alleles into account (where applicable)?
ebanks Broad Institute ✭✭✭✭
The HW code is not written in the GATK directly, but you can access it through Tribble:
http://gatkforums.broadinstitute.org/discussion/1349/tribble
Have a look at the genotyper documentation:
http://gatkforums.broadinstitute.org/discussion/1237/usingtheunifiedgenotyper
That article contains a link to the EXACT mathematical model and to slides covering multiallelic calls.
I have looked through the documentation provided. My interest is in how the VCF annotation tag 'HW' is calculated, which is not documented in any of the material above (nor on the following page: http://gatkforums.broadinstitute.org/discussion/1268/howshouldiinterpretvcffilesproducedbythegatk). After reading a few articles online, it is apparent to me that there are different ways of testing for Hardy-Weinberg proportions, and that they differ in how accurately they estimate the P-values for departures from HW equilibrium.
The code calculates an exact two-sided Hardy-Weinberg p-value.
Awesome, thank you. Is the P-value estimated using a probability or likelihood-ratio test statistic?
Probability.
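For readers who want to see what an exact two-sided test ordered by probability looks like, here is a minimal sketch in Python. It is not the Tribble/GATK code. It assumes a biallelic site and takes the three genotype counts as input; the function names `het_log_prob` and `hwe_exact_pvalue` are made up for this illustration. The P-value is the summed probability of every heterozygote count that is no more likely than the observed one, conditional on the observed allele counts. Multiallelic sites are not handled here.

```python
from math import exp, lgamma, log


def _log_fact(n):
    """Natural log of n! via the log-gamma function."""
    return lgamma(n + 1)


def het_log_prob(n_het, n_a, n):
    """log P(n_het heterozygotes | n genotypes, n_a copies of allele A) under HWE."""
    n_aa = (n_a - n_het) // 2      # homozygotes for allele A
    n_bb = n - n_het - n_aa        # homozygotes for allele B
    n_b = 2 * n - n_a              # copies of allele B
    return (n_het * log(2)
            + _log_fact(n) - _log_fact(n_aa) - _log_fact(n_het) - _log_fact(n_bb)
            + _log_fact(n_a) + _log_fact(n_b) - _log_fact(2 * n))


def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact two-sided HWE P-value for a biallelic site (illustrative sketch only)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n - n_a
    # Possible heterozygote counts share the parity of n_a and cannot
    # exceed the count of the rarer allele.
    probs = {h: exp(het_log_prob(h, n_a, n))
             for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    p_obs = probs[n_ab]
    # "Probability" ordering: sum every outcome that is no more likely than
    # the observed one (small tolerance guards against rounding on ties).
    p_value = sum(q for q in probs.values() if q <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)
```

As a small check, `hwe_exact_pvalue(1, 0, 1)` gives 1/3: with two copies of each allele in two samples, the only possible outcomes are zero heterozygotes (probability 1/3) and two heterozygotes (probability 2/3).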
Thank you, you have been very helpful :)
One last question: are the values of the 'HW' tag in percent or Phred-scaled?
That's in the docs (see the VCF header).
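To check the header programmatically, a short sketch like the one below pulls the `Description` of an INFO tag out of the `##INFO=<ID=...>` meta-lines of a VCF. The helper names `info_description` and `phred_to_p` are hypothetical, and the sketch assumes the description is a plain double-quoted string without embedded quotes. The second helper applies the standard Phred relation Q = -10 * log10(p), which is only relevant if the header does describe the value as Phred-scaled.

```python
import re


def info_description(header_lines, tag):
    """Return the Description of an INFO tag from VCF header lines, or None."""
    pattern = re.compile(
        r'^##INFO=<ID=' + re.escape(tag) + r',.*Description="([^"]*)"'
    )
    for line in header_lines:
        if not line.startswith("##"):
            break  # meta-information lines end at the #CHROM line
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def phred_to_p(q):
    """Convert a Phred-scaled value Q back to a probability: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)
```

Usage would be along the lines of `with open("calls.vcf") as fh: print(info_description(fh, "HW"))`, where `calls.vcf` is a placeholder file name.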