GenotypeConcordance

Could anyone help me with two questions about comparing my VCF file to a gold standard?
1. Are sites that are present in my VCF but absent from the gold standard ignored?
2. Are sites that are absent from my VCF but present in the gold standard assumed to be 0/0?

Thanks,

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @blueskypy

    Hello,

    1) They are ignored.

    2) They are ignored.

    We do not compare records that are missing in either set.

    -Sheila

  • blueskypy (Member)

    Thanks Sheila! But I'm confused. My VCF file does not contain non-variant sites. If those non-variant sites are ignored, will the value of HOM_REF_HOM_REF be 0?

  • blueskypy (Member)
    edited October 2014

    If I use -comp v1.vcf -eval v2.vcf -L v3.bed, will ONLY sites present in ALL three files be used in the comparison?

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hi @blueskypy,

    That's right: the first two because you can only compare sites for which all information is available, and the third because -L sets hard limits on the scope of the analysis.

    Craig/Appistry will happily take any further questions you may have about this and other topics (please see my private message from earlier). Thanks!
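
As a rough illustration of the answer above, the site selection can be pictured as a set intersection restricted to the -L intervals. This is a minimal sketch, not GATK's implementation: the function names, the in-memory dictionaries, and the 1-based inclusive interval tuples are all assumptions made for the example (real BED files use 0-based, half-open coordinates).

```python
from typing import Dict, List, Set, Tuple

Site = Tuple[str, int]            # (chromosome, 1-based position)
Interval = Tuple[str, int, int]   # (chromosome, start, end), inclusive (assumed)


def in_intervals(site: Site, intervals: List[Interval]) -> bool:
    """Return True if the site falls inside any of the given intervals."""
    chrom, pos = site
    return any(c == chrom and start <= pos <= end for c, start, end in intervals)


def comparable_sites(comp: Dict[Site, str],
                     eval_calls: Dict[Site, str],
                     intervals: List[Interval]) -> Set[Site]:
    """Sites with a record in BOTH call sets that also lie within the intervals."""
    shared = set(comp) & set(eval_calls)
    return {site for site in shared if in_intervals(site, intervals)}


# Hypothetical toy data: genotypes keyed by site.
comp = {("1", 100): "0/1", ("1", 200): "1/1", ("1", 300): "0/1"}
eval_calls = {("1", 100): "0/1", ("1", 300): "0/0", ("1", 400): "0/1"}
intervals = [("1", 50, 250)]

print(sorted(comparable_sites(comp, eval_calls, intervals)))  # [('1', 100)]
```

In this toy data, position 200 is dropped because it is missing from the eval set, 400 because it is missing from the comp set, and 300 because it lies outside the interval.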

  • blueskypy (Member)

    Hi Geraldine,
    Sorry, I didn't notice the message! I'll direct future questions to Craig. However, may I raise the following thought, since I think it would benefit other users too?

    GenotypeGVCFs outputs only variant sites by default, and it may not actually work properly with -allSites. As a result: 1) the sensitivity computed from such a VCF file will be falsely high, since HOM_REF_HET and HOM_REF_HOM_VAR are 0; and 2) even with the same gold standard, the sensitivities from different input files are not comparable, because the denominators are different.

    Is my understanding correct?
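
The concern in the post above can be illustrated with a toy calculation. This sketch assumes, as stated earlier in the thread, that records missing from either file are skipped. It is not GATK's formula: the `sensitivity` function is hypothetical, the positions and genotypes are made up, and any non-0/0 eval call at a gold-standard variant site is counted as a hit, ignoring genotype discordance.

```python
def sensitivity(truth, eval_calls):
    """Fraction of truth variant sites that eval also calls variant.

    Only sites with a record in BOTH sets are counted (assumption).
    """
    tp = fn = 0
    for site, truth_gt in truth.items():
        if truth_gt == "0/0" or site not in eval_calls:
            continue  # skipped: not a truth variant, or no eval record
        if eval_calls[site] == "0/0":
            fn += 1   # eval says hom-ref where truth has a variant
        else:
            tp += 1
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Hypothetical gold standard with four variant sites.
truth = {100: "0/1", 200: "1/1", 300: "0/1", 400: "0/1"}

# Eval call set that includes hom-ref records (as with an all-sites output).
all_sites = {100: "0/1", 200: "1/1", 300: "0/0", 400: "0/0"}

# Same calls with the hom-ref records dropped (variants-only output).
variants_only = {s: gt for s, gt in all_sites.items() if gt != "0/0"}

print(sensitivity(truth, all_sites))      # 0.5
print(sensitivity(truth, variants_only))  # 1.0
```

Under these assumptions the two missed variants vanish from the denominator once the hom-ref records are dropped, which is why the two numbers differ even though the underlying calls are identical.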

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Thanks :)

    Current issues notwithstanding, GenotypeGVCFs should/will work properly with -allSites. Remind me, what problem did you encounter with -allSites?

  • blueskypy (Member)

    Hi,
    Sorry to come back to this question! But I wonder whether the sites in case 2 of my original question are actually NOT ignored, and are instead counted as UNAVAILABLE and included in the denominator when computing sensitivity?
