Is it inaccurate to use only one HaplotypeCaller gVCF as the input to GenotypeGVCFs?
If so, how can I prepare more gVCFs for my sample? I only care about this one sample's gVCF.
We have noticed some differences between the genotypes produced by running GenotypeGVCFs on a single individual's gVCF and the genotypes of the same individual called through CombineGVCFs + GenotypeGVCFs on a cohort. Even if we look only at the sites common to the two VCF files, the genotypes are sometimes different for the same variant (especially when DP is low). Is this intended? Which call set is more reliable?
Thank you for your answer
I need to look at the sites unique to the individual, not only the common sites.
If you have only one sample and you are only interested in that sample's variants, you can run HaplotypeCaller in either normal mode or GVCF mode. The GVCF workflow helps when you have many samples to analyze together, because it saves compute, or when you expect to add more samples to your analysis later on. Have a look at this article.
P.S. If you would like to add more samples to your analysis, you can get data from the 1000 Genomes Project.
Edit: Also, have a look at this thread.
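As a rough illustration of the two single-sample options described above, here is a minimal Python sketch that only assembles the command lines (it does not run GATK). It assumes GATK 3 style syntax (`-T`, `-R`, `-I`, `-o`, `--emitRefConfidence GVCF`, `--variant`); GATK 4 uses a different invocation. The jar name and all file names are hypothetical placeholders.

```python
import shlex

# Assumption: GATK 3 jar name; adjust to your installation.
GATK_JAR = "GenomeAnalysisTK.jar"


def haplotypecaller_cmd(ref, bam, out, gvcf=False):
    """Build a HaplotypeCaller command; gvcf=True switches to GVCF mode."""
    cmd = ["java", "-jar", GATK_JAR, "-T", "HaplotypeCaller",
           "-R", ref, "-I", bam, "-o", out]
    if gvcf:
        cmd += ["--emitRefConfidence", "GVCF"]
    return cmd


def genotype_gvcfs_cmd(ref, gvcfs, out):
    """Build a GenotypeGVCFs command for one or more gVCFs."""
    cmd = ["java", "-jar", GATK_JAR, "-T", "GenotypeGVCFs", "-R", ref]
    for gvcf in gvcfs:
        cmd += ["--variant", gvcf]
    cmd += ["-o", out]
    return cmd


def as_shell(cmd):
    """Render an argument list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(part) for part in cmd)


# Option 1: normal mode, straight to a VCF (hypothetical file names).
normal = haplotypecaller_cmd("ref.fasta", "sample.bam", "sample.vcf")

# Option 2: GVCF mode, then GenotypeGVCFs on the single gVCF.
gvcf_step = haplotypecaller_cmd("ref.fasta", "sample.bam",
                                "sample.g.vcf", gvcf=True)
genotype_step = genotype_gvcfs_cmd("ref.fasta", ["sample.g.vcf"],
                                   "sample.genotyped.vcf")

for step in (normal, gvcf_step, genotype_step):
    print(as_shell(step))
```

If more samples arrive later, the same `genotype_gvcfs_cmd` helper can be given several gVCFs, which is the situation where the GVCF workflow pays off.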
Have a look at this article, specifically after "Why "almost always"?".
It is true that at low-DP sites the tool will not be able to emit a high-confidence genotype. That would explain the differences you are seeing. The rest of the article I linked to above explains some other reasons. You may also be interested in the new QUAL calculation, which helps with the missing singletons. You can invoke it with --useNewAFCalculator.
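To check whether the discordant genotypes really concentrate at low-DP sites, and to list the sites unique to one call set, something like the following plain-Python sketch could be used. It is only an illustration: the function names, the sample names `S1`/`S2`, and the toy records are made up, and it assumes uncompressed VCF text with `GT` and `DP` in the FORMAT column. It keeps only records where the sample carries a non-reference allele, and it matches sites on (CHROM, POS, REF, ALT), so a site that becomes multi-allelic in the cohort VCF will show up as "unique" rather than "discordant".

```python
def parse_vcf(lines, sample):
    """Return {(chrom, pos, ref, alt): (genotype, depth)} for one sample.

    Only records where the sample carries a non-reference allele are kept.
    """
    calls = {}
    col = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            col = fields.index(sample)  # column holding this sample
            continue
        if col is None:
            raise ValueError("missing #CHROM header line")
        data = dict(zip(fields[8].split(":"), fields[col].split(":")))
        alleles = sorted(data.get("GT", "./.").replace("|", "/").split("/"))
        if not any(a not in ("0", ".") for a in alleles):
            continue  # hom-ref or no-call for this sample
        dp = data.get("DP", ".")
        key = (fields[0], int(fields[1]), fields[3], fields[4])
        calls[key] = ("/".join(alleles), int(dp) if dp.isdigit() else None)
    return calls


def compare(single, cohort):
    """Split sites into discordant, single-only, and cohort-only lists."""
    shared = single.keys() & cohort.keys()
    discordant = sorted(k for k in shared if single[k][0] != cohort[k][0])
    only_single = sorted(single.keys() - cohort.keys())
    only_cohort = sorted(cohort.keys() - single.keys())
    return discordant, only_single, only_cohort


def rec(*fields):
    """Join fields with tabs to form one VCF line."""
    return "\t".join(str(f) for f in fields)


HEAD = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]

# Toy data: single-sample call set for S1.
SINGLE = [
    "##fileformat=VCFv4.2",
    rec(*HEAD, "S1"),
    rec("chr1", 100, ".", "A", "G", 50, ".", ".", "GT:DP", "0/1:30"),
    rec("chr1", 200, ".", "C", "T", 20, ".", ".", "GT:DP", "1/1:3"),
    rec("chr1", 300, ".", "G", "A", 40, ".", ".", "GT:DP", "0/1:25"),
]

# Toy data: cohort call set containing S1 and S2.
COHORT = [
    "##fileformat=VCFv4.2",
    rec(*HEAD, "S1", "S2"),
    rec("chr1", 100, ".", "A", "G", 90, ".", ".", "GT:DP", "0/1:30", "0/0:28"),
    rec("chr1", 200, ".", "C", "T", 35, ".", ".", "GT:DP", "0/1:3", "0/1:20"),
    rec("chr1", 400, ".", "T", "C", 60, ".", ".", "GT:DP", "0/0:22", "0/1:31"),
]

single_calls = parse_vcf(SINGLE, "S1")
cohort_calls = parse_vcf(COHORT, "S1")
discordant, only_single, only_cohort = compare(single_calls, cohort_calls)

for key in discordant:
    print("discordant", key, single_calls[key], cohort_calls[key])
for key in only_single:
    print("single-only", key, single_calls[key])
```

For real data, pass an open file handle instead of the toy lists, then look at the DP values attached to the discordant sites to see how many fall below your depth threshold.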