# VariantEval on MultiSample calling VCF

I want to know what's the best way to use VariantEval to get statistics for each sample in a multisample VCF file. If I call it like this:

    java -jar GenomeAnalysisTK.jar \
        -R ucsc.hg19.fasta \
        -T VariantEval \
        -o multisample.eval.gatkreport \
        --eval annotated.combined.vcf.gz \
        --dbsnp dbsnp_137.hg19.vcf

where annotated.combined.vcf.gz is a VCF file that contains ~1 million variants for ~800 samples, I get statistics for all samples combined, e.g.:

    #:GATKReport.v1.1:8
    #:GATKTable:11:3:%s:%s:%s:%s:%s:%d:%d:%d:%.2f:%d:%.2f:;
    #:GATKTable:CompOverlap:The overlap between eval and comp sites
    CompOverlap  CompRod  EvalRod  JexlExpression  Novelty  nEvalVariants  ...
    CompOverlap  dbsnp    eval     none            all      471704         191147
    CompOverlap  dbsnp    eval     none            known    280557         0
    CompOverlap  dbsnp    eval     none            novel    191147         191147

But I would like to get one such entry per sample. Is there an easy way to do this?
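
One approach that may work, assuming GATK 3.x: VariantEval can stratify its tables by sample through the `Sample` stratification module (`-ST Sample`), and the module selection can be narrowed with `-noST`, `-noEV` and `-EV` to keep the run time manageable on ~800 samples. The sketch below is only an illustration and not necessarily the exact combination discussed in this thread. The wrapper name `per_sample_eval` is hypothetical, and the file names are the ones from the command above.

```shell
# Sketch only: assumes GATK 3.x, where VariantEval accepts
# -ST/-noST (stratification modules) and -EV/-noEV (evaluation modules).
# per_sample_eval is a hypothetical helper name, not part of GATK.
per_sample_eval() {
    local vcf="$1"   # multisample VCF to evaluate
    local out="$2"   # path of the GATKReport to write

    # -noST        : skip the default (standard) stratifications
    # -ST Sample   : add one stratum per sample in the VCF
    # -noEV        : skip the default evaluation modules
    # -EV CompOverlap : only compute the CompOverlap table
    java -jar GenomeAnalysisTK.jar \
        -R ucsc.hg19.fasta \
        -T VariantEval \
        -o "$out" \
        --eval "$vcf" \
        --dbsnp dbsnp_137.hg19.vcf \
        -noST -ST Sample \
        -noEV -EV CompOverlap
}

# Usage (not run here):
# per_sample_eval annotated.combined.vcf.gz multisample.eval.gatkreport
```

As far as I recall, Novelty is one of the standard stratifications, so `-noST` would also drop the all/known/novel split; adding `-ST Novelty` or leaving out `-noST` should bring it back at the cost of a larger report and a longer run.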

Thanks, I'll give it a try! I already tried that one yesterday, but in combination with some other modules, and the estimated running time was something like 6 days. With your combination, the running time seems reasonable.

However, I would like to get some more information on the individual modules, but the links on the [manual page](gatkforums.broadinstitute.org/discussion/48/using-varianteval) don't work.


Yeah, I don't think there's ever been really comprehensive documentation on them - they kind of fall into that low-priority class with ROD Codecs and VariantAnnotator annotations. I've had pretty good success figuring things out through a combination of source diving and experimentation, though that obviously takes some time and effort.

That's right, unfortunately they've just not been a priority -- those links are placeholders for when we eventually get around to documenting them.

