
VariantEval on MultiSample calling VCF

thomas_w Posts: 13 Member

Hi!

I want to know the best way to use VariantEval to get statistics for each sample in a multisample VCF file. If I call it like this:

    java -jar GenomeAnalysisTK.jar \
        -R ucsc.hg19.fasta \
        -T VariantEval \
        -o multisample.eval.gatkreport \
        --eval annotated.combined.vcf.gz \
        --dbsnp dbsnp_137.hg19.vcf

where annotated.combined.vcf.gz is a VCF file that contains ~1 million variants for ~800 samples, I get statistics for all samples combined, e.g.

    #:GATKReport.v1.1:8
    #:GATKTable:11:3:%s:%s:%s:%s:%s:%d:%d:%d:%.2f:%d:%.2f:;
    #:GATKTable:CompOverlap:The overlap between eval and comp sites
    CompOverlap  CompRod  EvalRod  JexlExpression  Novelty  nEvalVariants  ...
    CompOverlap  dbsnp    eval     none            all      471704         191147
    CompOverlap  dbsnp    eval     none            known    280557         0
    CompOverlap  dbsnp    eval     none            novel    191147         191147

But I would like to get one such entry per sample. Is there an easy way to do this?

Thanks, Thomas
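
One possible approach, given here as a sketch under stated assumptions and not as the thread's confirmed answer: in GATK 3.x, VariantEval takes stratification modules via -ST, and the Sample stratifier produces one set of rows per sample. The script below reuses the file names from the command in the question. The output name multisample.persample.eval.gatkreport and the GATK_JAR and RUN shell variables are illustrative, not part of GATK. Restricting the run to the CompOverlap evaluator (-noEV -EV CompOverlap) and dropping the default stratifiers (-noST) is an assumption aimed at keeping the run time manageable with ~800 samples. By default the script only prints the command; set RUN=1 to execute it.

```shell
#!/bin/sh
# Sketch only: per-sample VariantEval run, assuming GATK 3.x and its Sample stratifier.
# GATK_JAR and RUN are hypothetical helper variables, not GATK options.
GATK_JAR="${GATK_JAR:-GenomeAnalysisTK.jar}"

# Store the command in the positional parameters so each argument stays one word.
#   -noST -ST Sample       : drop the default stratifiers, stratify by sample
#   -noEV -EV CompOverlap  : drop the default evaluators, keep CompOverlap only
set -- java -jar "$GATK_JAR" \
  -R ucsc.hg19.fasta \
  -T VariantEval \
  -o multisample.persample.eval.gatkreport \
  --eval annotated.combined.vcf.gz \
  --dbsnp dbsnp_137.hg19.vcf \
  -noST -ST Sample \
  -noEV -EV CompOverlap

if [ "${RUN:-0}" = "1" ]; then
  "$@"                     # actually run GATK
else
  printf '%s\n' "$*"       # dry run: show the command that would be executed
fi
```

With this stratification the CompOverlap table should gain a Sample column, so the all/known/novel rows repeat once per sample. Other evaluators can be added with further -EV arguments, at the cost of a longer run.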

Best Answer

Answers

  • thomas_w Posts: 13 Member

    Thanks, I'll give it a try! I already tried that one yesterday, but in combination with some other modules, and the estimated run time was something like 6 days. With your combination the running time seems reasonable.

    However, I would like more information on the individual modules, but the links on the [manual page](gatkforums.broadinstitute.org/discussion/48/using-varianteval) don't work.

  • pdexheimer Posts: 297 Member ✭✭✭

    Yeah, I don't think there's ever been really comprehensive documentation on them. They kind of fall into that low-priority class with ROD Codecs and VariantAnnotator annotations. I've had pretty good success figuring things out through a combination of source diving and experimentation, though that obviously takes some time and effort.

  • Geraldine_VdAuwera Posts: 5,230 Administrator, GSA Member

    That's right, unfortunately they've just not been a priority -- those links are placeholders for when we eventually get around to documenting them.

    Geraldine Van der Auwera, PhD
