
GenotypeGVCFs tool gives different output depending on the order of input GVCFs?

serhat_t (Turkey, Member)
edited August 2017 in Ask the GATK team

I have been using the GATK GenotypeGVCFs tool (versions 3.5, 3.7 and 4.0). I have noticed that the output changes slightly depending on the order of the input GVCFs: the total number of variants in the output VCF differs. For example, with everything else kept constant, the following two command lines produce slightly different VCFs.

java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta --variant sample1.g.vcf --variant sample2.g.vcf -o output.vcf
java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta --variant sample2.g.vcf --variant sample1.g.vcf -o output.vcf
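To pin down which records differ rather than only comparing totals, the two outputs can be compared site by site. The following is a minimal sketch, not part of GATK: it assumes the two runs were written to separate, uncompressed files (the names `output_a.vcf` and `output_b.vcf` are hypothetical), and it treats two records as the same variant when CHROM, POS, REF and ALT match exactly as written.

```python
import sys


def variant_keys(vcf_lines):
    """Collect (CHROM, POS, REF, ALT) for every non-header VCF record."""
    keys = set()
    for line in vcf_lines:
        # Skip blank lines and header lines (## meta lines and the #CHROM line).
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # ALT is compared as written; no allele reordering or normalization.
        keys.add((chrom, int(pos), ref, alt))
    return keys


def compare_vcfs(lines_a, lines_b):
    """Return (records only in A, records only in B), each sorted."""
    keys_a = variant_keys(lines_a)
    keys_b = variant_keys(lines_b)
    return sorted(keys_a - keys_b), sorted(keys_b - keys_a)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_vcfs.py output_a.vcf output_b.vcf
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        only_a, only_b = compare_vcfs(fa, fb)
    for label, records in (("only in first", only_a), ("only in second", only_b)):
        for chrom, pos, ref, alt in records:
            print(f"{label}\t{chrom}\t{pos}\t{ref}\t{alt}")
```

Because only the first five columns are compared, differences in QUAL, INFO annotations or genotypes at shared sites are not reported; the sketch only lists sites present in one output and absent from the other.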

I have observed this with GATK versions 3.5 and 3.7. GATK 4, for some reason, does not work with multiple GVCFs, which I discuss in a separate question. No parallelization is applied at all. Does anyone have any idea what is going on?

Thanks a lot.

