How to distinguish between missing and REF in gVCF

Dear all,

I have a naive query that might have been discussed earlier. I searched the forum but could not find an answer.

Consider gVCF files produced for 3 different samples (single-sample variant calling); genotyping the gVCFs into a VCF yields a list of variant sites only. When I need to find the variants shared between the 3 samples and one of the samples has no variant at a particular site in the VCF file, how should that be interpreted? Is it missing due to a lack of reads, or is it REF?


Best Answer


  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    In the VCF file this will be distinguished by either a hom-ref call (0/0) or a no-call (./.) if there was no data there.

  • mehar (Member)

    I understand that (0/0) is a hom-ref call, made when the data supports the REF allele, and that (./.) is for missing data. My question is: when a gVCF is converted to a VCF, are these missing calls converted as well, or only the variant calls? And if a site is not in the VCF, it could be either missing or the REF allele, so how do we know which one it is?

  • Sheila (Broad Institute; Member, Broadie, Moderator)


    When you get a ./. for a genotype, it means we do not have enough information at the site to make a confident call. It could be due to missing data or to insufficient confidence in any one genotype (e.g. all PLs are 0). The ./. in the GVCF does not change in the VCF because we simply do not have enough information to make an informed call at the site.

    I hope this helps.
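
To make the 0/0 versus ./. distinction from the answers above concrete, here is a minimal Python sketch that labels each sample's genotype at one site of a multi-sample VCF. It assumes a tab-separated VCF data line with a FORMAT column containing GT. The function names (`classify_genotype`, `classify_record`), the sample names, and the example record are all made up for illustration and are not part of GATK. Treating a partially missing genotype such as ./1 as a no-call is also an assumption of this sketch.

```python
def classify_genotype(gt):
    """Label a VCF GT string as 'no-call', 'hom-ref', or 'variant'."""
    # Phased (|) and unphased (/) genotypes are treated the same way here.
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        # Assumption: any missing allele makes the whole genotype a no-call.
        return "no-call"
    if all(a == "0" for a in alleles):
        return "hom-ref"
    return "variant"


def classify_record(line, sample_names):
    """Map each sample name to its genotype label for one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    # Column 9 (index 8) is FORMAT; sample columns start at index 9.
    gt_index = fields[8].split(":").index("GT")
    labels = {}
    for name, sample_field in zip(sample_names, fields[9:]):
        gt = sample_field.split(":")[gt_index]
        labels[name] = classify_genotype(gt)
    return labels


# Made-up record: S1 carries the variant, S2 is hom-ref, S3 has no data.
record = "chr1\t1000\t.\tA\tG\t50\t.\t.\tGT:DP\t0/1:30\t0/0:25\t./.:0"
calls = classify_record(record, ["S1", "S2", "S3"])
for sample, label in calls.items():
    print(sample, label)
```

With labels like these, a sample can be counted as sharing a variant only when its label is "variant", and a "no-call" sample can be reported separately rather than assumed to be REF.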

