Genotype concordance output

MUHAMMADSOHAILRAZA, Beijing Institute of Genomics, CAS (Member)

Hi everyone,

I ran the GATK "GenotypeConcordance" walker on my HC and UG VCF files. First I compared HC against UG (taking HC.vcf as 'comp' and UG.vcf as 'eval'), and I also performed HC-Hapmap and UG-Hapmap comparisons. I am confused about how to interpret the output files, specifically the NRS, NRD, and overall genotype concordance (OGC) values.

My output is as follows:

HC-Hapmap comparison
GATKTable:4:1:%s:%.3f:%.3f:%.3f:;
GATKTable:GenotypeConcordance_Summary:Per-sample summary statistics: NRS, NRD, and OGC
Sample   Non-Reference Sensitivity  Non-Reference Discrepancy  Overall_Genotype_Concordance
ALL      0.000                      1.000                      1.000

GATKTable:6:1:%d:%d:%d:%d:%d:%d:;
GATKTable:SiteConcordance_Summary:Site-level summary statistics
ALLELES_MATCH  EVAL_SUPERSET_TRUTH  EVAL_SUBSET_TRUTH  ALLELES_DO_NOT_MATCH  EVAL_ONLY  TRUTH_ONLY
5254569        655                  416                2279                  16675      2728180

UG-Hapmap comparison
GATKTable:4:1:%s:%.3f:%.3f:%.3f:;
GATKTable:GenotypeConcordance_Summary:Per-sample summary statistics: NRS, NRD, and OGC
Sample   Non-Reference Sensitivity  Non-Reference Discrepancy  Overall_Genotype_Concordance
ALL      0.000                      1.000                      1.000

GATKTable:6:1:%d:%d:%d:%d:%d:%d:;
GATKTable:SiteConcordance_Summary:Site-level summary statistics
ALLELES_MATCH  EVAL_SUPERSET_TRUTH  EVAL_SUBSET_TRUTH  ALLELES_DO_NOT_MATCH  EVAL_ONLY  TRUTH_ONLY
5107946        496                  418                1985                  16077      3190536

HC-UG comparison
GATKTable:4:4:%s:%.3f:%.3f:%.3f:;
GATKTable:GenotypeConcordance_Summary:Per-sample summary statistics: NRS, NRD, and OGC
Sample   Non-Reference Sensitivity  Non-Reference Discrepancy  Overall_Genotype_Concordance
ALL      0.942                      0.017                      0.987
HMN15-1  0.943                      0.017                      0.987
HMN15-2  0.942                      0.017                      0.987
HMN15-3  0.942                      0.017                      0.986

GATKTable:6:1:%d:%d:%d:%d:%d:%d:;
GATKTable:SiteConcordance_Summary:Site-level summary statistics
ALLELES_MATCH  EVAL_SUPERSET_TRUTH  EVAL_SUBSET_TRUTH  ALLELES_DO_NOT_MATCH  EVAL_ONLY  TRUTH_ONLY
4875872        10127                14832              22193                 203900     1222730
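
To put the three site-level tables on a common scale, the counts can be expressed as fractions of all sites counted in each comparison. The following is a minimal Python sketch that only does this arithmetic on the numbers posted above; the names `CATEGORIES`, `SITE_COUNTS`, and `site_fractions` are made up for illustration and are not part of GATK.

```python
# Column names and counts copied from the SiteConcordance_Summary tables above.
CATEGORIES = [
    "ALLELES_MATCH",
    "EVAL_SUPERSET_TRUTH",
    "EVAL_SUBSET_TRUTH",
    "ALLELES_DO_NOT_MATCH",
    "EVAL_ONLY",
    "TRUTH_ONLY",
]

SITE_COUNTS = {
    "HC-Hapmap": [5254569, 655, 416, 2279, 16675, 2728180],
    "UG-Hapmap": [5107946, 496, 418, 1985, 16077, 3190536],
    "HC-UG": [4875872, 10127, 14832, 22193, 203900, 1222730],
}


def site_fractions(counts):
    """Return each category's share of all sites counted in one comparison."""
    total = sum(counts)
    return {name: n / total for name, n in zip(CATEGORIES, counts)}


for label, counts in SITE_COUNTS.items():
    fractions = site_fractions(counts)
    print(label)
    for name in CATEGORIES:
        print(f"  {name:<22}{fractions[name]:.4f}")
```

The fractions make it easier to compare, for example, how large the TRUTH_ONLY share is in each comparison, but they say nothing about genotype-level agreement, which is what the NRS, NRD, and OGC columns report.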

My questions are:
1. Why are NRS and NRD 0.000 and 1.000 when I compare the HC and UG result files with the hapmap.vcf dataset, and what does that mean?
2. How can I use these values to judge how accurate the experimental dataset is and how similar the call sets are? For example, what does a higher NRS value mean, and what does a higher or lower NRD value mean?
3. As I am new to the NGS analysis field, I am having trouble interpreting the GATK output reports. Is there any documentation for the PDF output of the AnalyzeCovariates tool (before/after recalibration) and for the VQSR tranches output file? (I have attached these outputs.)
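
For readers following question 2, here is a rough Python sketch of how the three summary metrics are commonly described: NRS as the fraction of non-reference truth genotypes that are also non-reference in eval, NRD as one minus the genotype concordance once matching hom-ref calls are excluded, and OGC as the fraction of all compared genotypes that match. These definitions are an assumption that should be checked against the GenotypeConcordance documentation, the sketch ignores no-calls and filtered sites, and the counts in `TOY_COUNTS` are invented, not taken from the tables above.

```python
from itertools import product

GENOTYPES = ("HOM_REF", "HET", "HOM_VAR")
NON_REF = ("HET", "HOM_VAR")

# Hypothetical genotype counts keyed by (eval genotype, truth genotype).
TOY_COUNTS = {
    ("HOM_REF", "HOM_REF"): 90, ("HOM_REF", "HET"): 2,  ("HOM_REF", "HOM_VAR"): 0,
    ("HET", "HOM_REF"): 1,      ("HET", "HET"): 40,     ("HET", "HOM_VAR"): 1,
    ("HOM_VAR", "HOM_REF"): 0,  ("HOM_VAR", "HET"): 1,  ("HOM_VAR", "HOM_VAR"): 15,
}


def ogc(counts):
    """Overall genotype concordance: matching genotypes / all compared genotypes."""
    matching = sum(counts[(g, g)] for g in GENOTYPES)
    return matching / sum(counts.values())


def nrd(counts):
    """Non-reference discrepancy: 1 - concordance, ignoring matching hom-ref calls."""
    matching_non_ref = sum(counts[(g, g)] for g in NON_REF)
    total = sum(counts.values()) - counts[("HOM_REF", "HOM_REF")]
    return 1.0 - matching_non_ref / total


def nrs(counts):
    """Non-reference sensitivity: truth non-ref genotypes also called non-ref in eval."""
    truth_non_ref = sum(counts[(e, t)] for e, t in product(GENOTYPES, NON_REF))
    found = sum(counts[(e, t)] for e, t in product(NON_REF, NON_REF))
    return found / truth_non_ref


print(f"NRS={nrs(TOY_COUNTS):.3f} NRD={nrd(TOY_COUNTS):.3f} OGC={ogc(TOY_COUNTS):.3f}")
```

Under these assumed definitions, a higher NRS would mean that eval recovers more of the truth set's variant genotypes, and a lower NRD would mean fewer disagreements among genotypes where at least one call set is non-reference.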

I am really sorry for troubling you all, but I have no other way to get answers to my questions.

Thank you very much for your continued support.
