Difference in GenotypeGVCFs-generated VCF after consolidation with GenomicsDBImport vs. CombineGVCFs
I had a set of 81 GVCFs that I consolidated in two ways, first with GenomicsDBImport and then with CombineGVCFs, and I ran GenotypeGVCFs in both cases. For GenomicsDBImport, I ran the command per contig and then ran GenotypeGVCFs on each database to get a per-contig VCF. I then used Picard GatherVcfs to combine these into the final VCF. The commands I used are written below:
java -Xmx90g -jar gatk-package-22.214.171.124-local.jar GenomicsDBImport -R water_buffalo_re_arranged_chrom_ref_genome.fa --TMP_DIR ./tmp --sample-name-map sample_names_map_new.txt --reader-threads 2 --genomicsdb-workspace-path "$contig" -L "$contig"
java -Xmx8G -XX:ConcGCThreads=1 -jar gatk-package-126.96.36.199-local.jar GenotypeGVCFs -R /water_buffalo_re_arranged_chrom_ref_genome.fa -new-qual -V gendb://"$contig" -O "$contig"_variants.vcf.gz
java -jar picard.jar GatherVcfs INPUT=list.txt OUTPUT=Final_med_buffalo_variants_81_samples.vcf.gz
java -Xmx200g -XX:ConcGCThreads=1 -jar gatk-package-188.8.131.52-local.jar CombineGVCFs -R water_buffalo_re_arranged_chrom_ref_genome.fa --variant All_gvcf_gz.list -O combined_81.g.vcf.gz
java -Xmx8G -XX:ConcGCThreads=1 -jar gatk-package-184.108.40.206-local.jar GenotypeGVCFs -R water_buffalo_re_arranged_chrom_ref_genome.fa -new-qual -V combined_81.g.vcf.gz -O Final_variants_81_samples_using_CombineGVCF.vcf.gz
The final VCF should be the same in both cases. Unfortunately, it was not. On running bcftools isec, I found that some variants were present in only one VCF and some only in the other. What could be the reason for this discrepancy?
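In case it helps to reproduce the comparison without bcftools, here is a rough sketch that only uses standard Unix tools. It is not the bcftools isec command I ran. The helper names site_keys and compare_sites and the output file names are made up for illustration. It assumes gzip-readable VCFs (bgzip output qualifies) and compares the exact CHROM, POS, REF and ALT strings, so the same variant written with a different allele representation would be counted as different.

```shell
# site_keys (hypothetical helper): print one sorted, unique
# CHROM:POS:REF:ALT key per non-header record of a gzip-compressed VCF.
site_keys() {
    gzip -dc "$1" \
        | awk -F'\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' \
        | LC_ALL=C sort -u
}

# compare_sites (hypothetical helper): write the keys private to each
# VCF, and the keys shared by both, into the directory given as $3.
compare_sites() {
    cs_out="$3"
    mkdir -p "$cs_out"
    site_keys "$1" > "$cs_out/a.keys"
    site_keys "$2" > "$cs_out/b.keys"
    # comm needs sorted input; the same C locale is used for sort and comm.
    LC_ALL=C comm -23 "$cs_out/a.keys" "$cs_out/b.keys" > "$cs_out/only_a.txt"
    LC_ALL=C comm -13 "$cs_out/a.keys" "$cs_out/b.keys" > "$cs_out/only_b.txt"
    LC_ALL=C comm -12 "$cs_out/a.keys" "$cs_out/b.keys" > "$cs_out/shared.txt"
    # Report how many keys fall into each category.
    wc -l "$cs_out/only_a.txt" "$cs_out/only_b.txt" "$cs_out/shared.txt"
}

# Example call (the output directory name is arbitrary):
# compare_sites Final_med_buffalo_variants_81_samples.vcf.gz \
#     Final_variants_81_samples_using_CombineGVCF.vcf.gz site_comparison
```

The files only_a.txt and only_b.txt then list the sites private to each VCF, which makes it easy to check whether the differences cluster on particular contigs.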
Kindly let me know if you need more information.