Multi-sample SNP calling with UnifiedGenotyper

I've been scouring the forums, but I fear my question is so basic that nobody else has needed to ask it:
I have whole-genome sequences for 6 samples (so 6 .bam files) of a non-model organism, and I am trying to compare SNPs for downstream population genetics analyses. I attempted this with UnifiedGenotyper. (I realize that HaplotypeCaller is better, but UG finished first, while HC has been running for days as of this writing.) Here is what I entered:

java -jar GenomeAnalysisTK.jar -R reference.fasta -T UnifiedGenotyper -I sample01.bam -I sample02.bam -I sample03.bam -I sample04.bam -I sample05.bam -I sample06.bam -o output.raw.snps.indels.vcf

Unless I am misreading the output VCF file (the first few lines are pasted below), it seems to contain only a single sample (based on the fact that there is only a single column for ALT, rather than one per sample). I tried to use this file in SNPHYLO, but it fails with the error "There are no SNPs", which seems to confirm this.

What am I doing wrong? Thanks, and apologies for any redundancies.


223232 3 . T C 450.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=-2.751;DP=245;Dels=0.00;ExcessHet=3.0103;FS=0.966;HaplotypeScore=4.7345;MLEAC=1;MLEAF=0.500;MQ=18.74;MQ0=51;MQRankSum=-1.838;QD=2.12;ReadPosRankSum=-1.234;SOR=0.965 GT:AD:DP:GQ:PL 0/1:163,50:245:99:479,0,1492
223232 19 . C T 529.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.273;DP=250;Dels=0.00;ExcessHet=3.0103;FS=5.648;HaplotypeScore=10.3851;MLEAC=1;MLEAF=0.500;MQ=21.77;MQ0=32;MQRankSum=0.356;QD=2.12;ReadPosRankSum=1.028;SOR=1.757 GT:AD:DP:GQ:PL 0/1:198,52:250:99:558,0,2443
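
For anyone checking a file like this, here is a minimal Python sketch of one way to count the samples in a VCF. It relies only on the VCF layout: the first nine columns (CHROM through FORMAT) are fixed, and each column after FORMAT holds one sample's genotype fields. The function names and the sample name `sample01` in the header line are hypothetical, and the record is the first line pasted above with tabs restored (the paste shows spaces).

```python
# Minimal sketch: count per-sample columns in VCF lines.
# Assumption: lines are tab-delimited, as the VCF format requires.

FIXED_COLUMNS = 9  # CHROM POS ID REF ALT QUAL FILTER INFO FORMAT


def count_samples(vcf_line):
    """Return how many sample columns follow FORMAT in one VCF line."""
    fields = vcf_line.rstrip("\n").split("\t")
    return max(len(fields) - FIXED_COLUMNS, 0)


def sample_names(header_line):
    """Return the sample names listed on the '#CHROM' header line."""
    fields = header_line.rstrip("\n").split("\t")
    if fields[0] != "#CHROM":
        raise ValueError("expected the '#CHROM' column header line")
    return fields[FIXED_COLUMNS:]


# First record from the post, rebuilt with tab separators.
record = "\t".join([
    "223232", "3", ".", "T", "C", "450.77", ".",
    "AC=1;AF=0.500;AN=2;BaseQRankSum=-2.751;DP=245;Dels=0.00;"
    "ExcessHet=3.0103;FS=0.966;HaplotypeScore=4.7345;MLEAC=1;"
    "MLEAF=0.500;MQ=18.74;MQ0=51;MQRankSum=-1.838;QD=2.12;"
    "ReadPosRankSum=-1.234;SOR=0.965",
    "GT:AD:DP:GQ:PL",
    "0/1:163,50:245:99:479,0,1492",
])

# Hypothetical header line; the real one was not included in the post.
header = "\t".join([
    "#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
    "FORMAT", "sample01",
])

print(count_samples(record))
print(sample_names(header))
```

A six-sample call set would show six genotype columns after FORMAT on every record. ALT is always a single column, however many samples the file holds.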
