Difference between vcf directly generated by HC and vcf generated from GenotypeGVCFs
I have three general questions about using HaplotypeCaller. (I know I could test these myself, but I figured it would be more reliable to get answers from the people who develop the tool.)
- For single sample analysis, is the vcf generated directly from HC the same as the vcf generated using GenotypeGVCFs on the gvcf generated from HC?
- For multi-sample analysis, how does the speed of running GenotypeGVCFs on each gvcf separately compare with combining all gvcfs for joint calling, assuming all the gvcfs can be generated in parallel (say, for 500 samples)?
- It seems the gvcf can be generated in two modes, -ERC GVCF and -ERC BP_RESOLUTION. How different is the one generated using -ERC BP_RESOLUTION from a vcf with all variant calls, reference calls and missing calls? And in terms of file size, say for NA12878 whole genome, how different is the gvcf from -ERC GVCF compared with the one from -ERC BP_RESOLUTION?
Thank you very much for your attention; any information from you will be highly appreciated.