GenotypeGVCFs tool gives different output depending on the order of input GVCFs?
I have been using the GATK GenotypeGVCFs tool (versions 3.5, 3.7 and 4.0). I have noticed that the output changes slightly depending on the order of the input GVCFs: the total number of variants in the output VCF differs. For example, with everything else kept constant, the following two commands produce slightly different VCFs:
java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta --variant sample1.g.vcf --variant sample2.g.vcf -o output.vcf
java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta --variant sample2.g.vcf --variant sample1.g.vcf -o output.vcf
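To see exactly which records differ between the two runs, the outputs could be compared site by site. The following is only a rough sketch: it assumes each run was written to its own file (the names order1.vcf and order2.vcf are hypothetical, since both commands above write to output.vcf), and the helper name vcf_site_diff is made up for illustration. It compares only CHROM, POS, REF and ALT, so differences confined to QUAL, INFO or genotype fields would not show up.

```shell
# Hypothetical helper: report sites (CHROM, POS, REF, ALT) that appear in
# only one of two uncompressed VCF files.
vcf_site_diff() {
    first_sites=$(mktemp) || return 1
    second_sites=$(mktemp) || return 1

    # Skip header lines, keep columns 1, 2, 4 and 5, then sort so that
    # comm can compare the two lists. sort -u collapses duplicate sites.
    awk -F'\t' '!/^#/ { print $1 "\t" $2 "\t" $4 "\t" $5 }' "$1" \
        | LC_ALL=C sort -u > "$first_sites"
    awk -F'\t' '!/^#/ { print $1 "\t" $2 "\t" $4 "\t" $5 }' "$2" \
        | LC_ALL=C sort -u > "$second_sites"

    # comm -23 keeps lines unique to the first list, comm -13 those unique
    # to the second; each line is labelled with the file it came from.
    LC_ALL=C comm -23 "$first_sites" "$second_sites" \
        | awk '{ print "only_in_first\t" $0 }'
    LC_ALL=C comm -13 "$first_sites" "$second_sites" \
        | awk '{ print "only_in_second\t" $0 }'

    rm -f "$first_sites" "$second_sites"
}
```

It would be called as vcf_site_diff order1.vcf order2.vcf, and an empty result would mean both runs emitted the same set of sites.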
I have observed this in GATK 3.5 and 3.7. GATK 4 for some reason does not work with multiple GVCFs, which I describe in a separate question. No parallelization is applied whatsoever. Does anyone have any idea what's going on?
Thanks a lot.