# Include no-calls in vcf with only variant sites

Is there a way to include only variant sites and no-calls in the final VCF? I know that during SNP calling you can emit only variants, only confident sites, or all sites. However, is there a way to reduce the VCF at the end to only variant sites (VQSR passed) and positions where no call could be made? The final VCFs would then contain only variant sites and missing data, and everything not listed in the VCF file would be reference. I need such a file for merging with other VCF files, so that every position absent from the VCFs can be called ref during merging.

So far I have called SNPs with emit-all and run VQSR. I now want to reduce the VCFs in size by excluding NO_VARIATION sites, while keeping the information on "missing" sites.

I am also interested in obtaining a VCF that includes only confident variants and sites with missing data. I could not find the recommended workflow for this; could you please direct me to it? I'm working with a haploid genome, and have therefore been using the UnifiedGenotyper followed by VariantFiltration. To obtain data for the missing regions I have so far relied on grepping a VCF that emits all sites.
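For what it's worth, the kind of post-hoc reduction described above can be sketched in a few lines of Python. This is only a rough sketch and not a GATK tool or a recommended workflow: the function names (`is_no_call`, `keep_record`, `filter_vcf`) are made up for illustration, and it assumes a tab-delimited emit-all VCF in which no-calls have every GT allele set to `.` and passing variants have `PASS` in the FILTER column.

```python
def is_no_call(gt):
    """True if every allele in the GT string is '.', e.g. './.', '.|.' or haploid '.'."""
    alleles = gt.replace("|", "/").split("/")
    return all(allele == "." for allele in alleles)


def keep_record(line):
    """Keep a record if any sample is a no-call, or if it is a PASS variant site."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT/sample columns, so there is no genotype to inspect
    alt, filt, fmt = fields[4], fields[6], fields[8]
    keys = fmt.split(":")
    if "GT" not in keys:
        return False
    gt_index = keys.index("GT")
    for sample in fields[9:]:
        values = sample.split(":")
        # Trailing sample fields may be dropped; treat a missing GT as a no-call.
        gt = values[gt_index] if gt_index < len(values) else "."
        if is_no_call(gt):
            return True
    # Assumption: ALT is '.' at non-variant sites and FILTER is exactly 'PASS'
    # for variants that passed filtering.
    return alt != "." and filt == "PASS"


def filter_vcf(lines):
    """Yield header lines unchanged, plus the records selected by keep_record."""
    for line in lines:
        if line.startswith("#") or keep_record(line):
            yield line
```

With a file handle, something like `for line in filter_vcf(open("all_sites.vcf")): ...` would stream the reduced VCF (the file name is a placeholder). Note that variants failing the filter are dropped entirely, so they would be read as reference after merging; if that is not what you want, the last line of `keep_record` is the place to change it.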

I would recommend using DiagnoseTargets to identify sites that cannot be called.

Thank you for your recommendation. However, since my goal is to create a consensus FASTA file from the VCF, I am concerned about how well the missing data reported by DiagnoseTargets corresponds to the sites with missing data in the UnifiedGenotyper VCF. (Am I correct to assume that sites with missing data are marked by ./. in the last (bar) column of the VCF?)
I also tried running the DiagnoseTargets tool on my dataset, but got the following error:

