What do the empty counts in the ContingencyMetrics output file from GenotypeConcordance signify?

nravooru, Pittsburgh, Member
edited October 2019 in Ask the GATK team
I have two VCF files produced by two different protocols that I want to compare. I applied a set of hard filters to both files and treat one as the truth set, comparing the other VCF against it. Both VCF files now carry a filter status (PASS or a filter name). When I compare the two filtered VCF files, I get TP, TN and empty counts, but no FP or FN.
When I compare the two VCFs before filtering using GenotypeConcordance, I get only TP and TN, with no empty counts.
With ignore_filter_status set to FALSE, does GenotypeConcordance compare the passed variants from the call set against all the variants in the truth set, or does it compare the passed variants from the call set against the passed variants from the truth set? According to the documentation, empty counts mean that there was no contingency information for that number of variants, which is very vague. If the latter is true, shouldn't a passed variant found in the call set but not present in the truth set count as a false positive, and a passed variant in the truth set not found in the call set count as a false negative?

I am using SelectVariants and VariantFiltration to apply the hard filters to both VCF files. The hard filters are the same for the two files, but they differ between SNPs and INDELs. I am comparing SNPs and INDELs separately in GenotypeConcordance.
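
To narrow down where the empty counts might come from, one option is to cross-tabulate the FILTER status of each site across the two filtered files. Below is a minimal sketch, assuming uncompressed, tab-delimited VCFs, sites keyed by (CHROM, POS, REF, ALT), and a FILTER value of "." treated the same as PASS. The function names are hypothetical, and this is only a diagnostic tally, not a reimplementation of how GenotypeConcordance assigns contingency states.

```python
from collections import Counter


def filter_status_by_site(vcf_lines):
    """Map (CHROM, POS, REF, ALT) to 'PASS' or 'FILTERED' for each record."""
    status = {}
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt, _qual, flt = fields[:7]
        # Assumption: "." (no filtering applied) is counted as PASS.
        passed = flt in ("PASS", ".")
        status[(chrom, int(pos), ref, alt)] = "PASS" if passed else "FILTERED"
    return status


def cross_tabulate(truth_lines, call_lines):
    """Count sites by (truth status, call status); missing sites are 'ABSENT'."""
    truth = filter_status_by_site(truth_lines)
    call = filter_status_by_site(call_lines)
    counts = Counter()
    for site in truth.keys() | call.keys():
        counts[(truth.get(site, "ABSENT"), call.get(site, "ABSENT"))] += 1
    return counts


# Example usage with hypothetical file names:
# with open("truth.filtered.vcf") as t, open("call.filtered.vcf") as c:
#     for (truth_status, call_status), n in sorted(cross_tabulate(t, c).items()):
#         print(truth_status, call_status, n)
```

If the number of sites that are PASS in one file but FILTERED in the other is close to the empty counts, that would suggest the empty counts are tied to filter status rather than to missing sites.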


