Hi GATK Users,

Happy Thanksgiving!
Our staff will be observing the holiday and will be unavailable from 22nd to 25th November. This will cause a delay in reaching out to you and answering your questions immediately. Rest assured we will get back to it on Monday November 26th. We are grateful for your support and patience.
Have a great holiday everyone!!!

GATK Staff

GenotypeGVCFs tool gives different output depending on the order of input GVCFs?

serhat_tserhat_t TurkeyMember
edited August 2017 in Ask the GATK team

I have been using GATK GenotypeGVCFs tool (versions 3.5, 3.7 and 4.0). It has come to my attention that depending on the order of input GVCFs, the output slightly changes, i.e. the total number of variants in the output VCF changes. For example, everything else kept constant, the following two command line arguments output slightly different VCFs.

java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta --variant sample1.g.vcf --variant sample2.g.vcf -o output.vcf
java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta --variant sample2.g.vcf --variant sample1.g.vcf -o output.vcf

I have observed this in GATK 3.5 and 3.7 versions. GATK 4 for some reason does not work with multiple GVCFs, which I talk about in a different question. There is no parallelization applied whatsoever. Does anyone have any idea what's going on?

Thanks a lot.


Sign In or Register to comment.