The technical documentation for UnifiedGenotyper states:
The GATK Unified Genotyper is a multiple-sample, technology-aware SNP and indel caller. It uses a Bayesian genotype likelihood model to estimate simultaneously the most likely genotypes and allele frequency in a population of N samples, emitting an accurate posterior probability of there being a segregating variant allele at each locus as well as for the genotype of each sample.
When run with multiple -I inputs, how does the UG algorithm use the multiple samples? Does it merely assess each locus in each sample in isolation, based on the pileup there (and possibly a window around it), or does it take the other samples into account? Does running it with multiple samples enable it to find variants in some samples that it would not find if run on each sample individually?
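To make the question concrete, here is a toy Python sketch of what I imagine "taking the other samples into account" could mean. It is not the UnifiedGenotyper implementation. The function names, the Hardy-Weinberg genotype prior, the simple EM loop for the allele frequency, the 1% error rate, and the 0.001 single-sample prior are all assumptions I made up for illustration.

```python
# Toy model only: NOT the actual UnifiedGenotyper algorithm.
# All names and parameter values here are hypothetical.

def genotype_likelihoods(ref_reads, alt_reads, error=0.01):
    """P(reads | genotype) for genotypes carrying 0, 1, or 2 alt alleles."""
    liks = []
    for g in (0, 1, 2):
        # Chance that a single read shows the alt allele under genotype g.
        p_alt = (g / 2) * (1 - error) + (1 - g / 2) * error
        liks.append((p_alt ** alt_reads) * ((1 - p_alt) ** ref_reads))
    return liks

def hwe_prior(f):
    """Genotype prior under Hardy-Weinberg for alt allele frequency f."""
    return [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]

def posteriors(liks, f):
    """Posterior over genotypes for one sample, given allele frequency f."""
    joint = [l * p for l, p in zip(liks, hwe_prior(f))]
    total = sum(joint)
    return [j / total for j in joint]

def estimate_allele_frequency(all_liks, f_init=0.001, iterations=50):
    """EM estimate of the alt allele frequency shared by all samples."""
    f = f_init
    for _ in range(iterations):
        # E-step: expected number of alt alleles across the samples.
        expected_alt = sum(
            sum(g * p for g, p in enumerate(posteriors(liks, f)))
            for liks in all_liks
        )
        # M-step: frequency over 2N chromosomes, kept away from 0 and 1.
        f = min(max(expected_alt / (2 * len(all_liks)), 1e-6), 1 - 1e-6)
    return f

def prob_non_ref(liks, f):
    """Posterior probability that the sample carries at least one alt allele."""
    return 1 - posteriors(liks, f)[0]

# One sample with weak evidence: 4 ref reads, 1 alt read.
weak = genotype_likelihoods(ref_reads=4, alt_reads=1)
# Five other samples with clear heterozygous evidence at the same locus.
strong = [genotype_likelihoods(ref_reads=10, alt_reads=10) for _ in range(5)]

# Called in isolation, with a fixed low prior on the variant.
alone = prob_non_ref(weak, f=0.001)
# Called jointly, with the allele frequency estimated from all six samples.
joint = prob_non_ref(weak, f=estimate_allele_frequency([weak] + strong))

print(f"P(non-ref) alone: {alone:.4f}")
print(f"P(non-ref) joint: {joint:.4f}")
```

In this toy setup the weakly supported sample gets a much higher non-reference posterior when the other samples inform the allele frequency than when it is evaluated alone. Whether UG does anything comparable is exactly what I am asking.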
I realize that running in multi-sample mode produces a convenient multi-column VCF with the sample information integrated. But is this anything more than a formatting convenience? Is the result the same as what you would get if you did the following: