How to evaluate the accuracy of a variant calling pipeline
I'd like to evaluate the accuracy of my variant calling pipeline, so I assembled a cohort of 46 Caucasian samples from the 1000 Genomes Project (1KG) whole-exome sequencing data. Comparing my calls against the official 1KG calls, the results for one sample are as follows.
region  Non-Reference.Sensitivity  Non-Reference.Discrepancy  Overall_Genotype_Concordance
chr1    0.714                      0.034                      0.988
chr22   0.731                      0.014                      0.986
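For reference, here is a minimal sketch of how these three metrics are commonly computed from a truth-versus-call genotype table. It assumes GATK-style definitions (sensitivity over truth non-reference genotypes, discrepancy excluding concordant hom-ref calls, concordance over all compared genotypes) and ignores no-calls. The function name `concordance_metrics` and the counts are hypothetical, made up for illustration, and not taken from the results above.

```python
GENOTYPES = ("HOM_REF", "HET", "HOM_VAR")
NON_REF = ("HET", "HOM_VAR")


def concordance_metrics(counts):
    """Compute NRS, NRD and OGC from counts[truth_genotype][called_genotype].

    Assumed (GATK-style) definitions, with no-calls ignored:
      NRS = truth non-ref genotypes also called non-ref / truth non-ref genotypes
      NRD = 1 - matching non-ref genotypes / (all pairs - matching hom-ref pairs)
      OGC = matching genotypes / all compared genotypes
    """
    total = sum(counts[t][c] for t in GENOTYPES for c in GENOTYPES)
    matches = sum(counts[g][g] for g in GENOTYPES)
    nonref_matches = sum(counts[g][g] for g in NON_REF)
    truth_nonref = sum(counts[t][c] for t in NON_REF for c in GENOTYPES)
    found_nonref = sum(counts[t][c] for t in NON_REF for c in NON_REF)
    return {
        "NRS": found_nonref / truth_nonref,
        "NRD": 1 - nonref_matches / (total - counts["HOM_REF"]["HOM_REF"]),
        "OGC": matches / total,
    }


# Hypothetical counts: outer key = truth genotype, inner key = called genotype.
example_counts = {
    "HOM_REF": {"HOM_REF": 900, "HET": 5, "HOM_VAR": 0},
    "HET": {"HOM_REF": 20, "HET": 50, "HOM_VAR": 2},
    "HOM_VAR": {"HOM_REF": 5, "HET": 3, "HOM_VAR": 15},
}

metrics = concordance_metrics(example_counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, concordant hom-ref genotypes count toward overall concordance but not toward the other two metrics, so overall concordance can be high even when non-reference sensitivity is modest.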
My questions are the following:
What do those numbers say about my pipeline? Are they good or bad?
Are there better ways to evaluate a variant calling pipeline? In my test, there are a few sources of error: a) the official calls come from whole-genome sequencing, whereas mine come from whole-exome sequencing, so although I restricted the comparison to the capture regions of the exome kit, the coverage depths may differ; b) the official calls may have been produced by a different calling pipeline; c) the cohort composition is different.
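To make the restriction to capture regions concrete, here is a minimal sketch of filtering variant positions against target intervals. It assumes the targets are BED-style (0-based, half-open) and the variant positions are VCF-style (1-based). The helper names `build_index` and `in_targets` and the example intervals are hypothetical.

```python
from bisect import bisect_right


def build_index(intervals):
    """Group (chrom, start, end) intervals by chromosome, sort and merge overlaps."""
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        merged = []
        for start, end in ivs:
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent interval: extend the previous one.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], merged)
    return index


def in_targets(index, chrom, pos):
    """Return True if the 1-based position falls inside any target interval."""
    if chrom not in index:
        return False
    starts, merged = index[chrom]
    zero_based = pos - 1
    # Find the last interval whose start is <= the position.
    i = bisect_right(starts, zero_based) - 1
    return i >= 0 and zero_based < merged[i][1]


# Hypothetical capture targets (0-based, half-open).
targets = build_index([("chr1", 100, 200), ("chr1", 150, 250), ("chr22", 10, 20)])

# Hypothetical variant sites (1-based), kept only if they fall on target.
sites = [("chr1", 100), ("chr1", 101), ("chr1", 250), ("chr1", 251), ("chr22", 11)]
on_target = [site for site in sites if in_targets(targets, *site)]
print(on_target)
```

Applying the same filter to both call sets before comparing them keeps off-target calls from counting for or against either pipeline.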