Multi-sample SNP calling with UnifiedGenotyper
I've been scouring the forums, but I fear that my question is so basic that I am alone:
I have whole genome sequences of 6 samples (and so 6 .bam files) of a non-model organism and I am trying to compare SNPs for downstream population genetics analyses. I attempted this using UnifiedGenotyper (I realize that HaplotypeCaller is better, but UG finished first, while HC has been running for days at the time of this writing.) Here is what I entered:
java -jar GenomeAnalysisTK.jar -R reference.fasta -T UnifiedGenotyper -I sample01.bam -I sample02.bam -I sample03.bam -I sample04.bam -I sample05.bam -I sample06.bam -o output.raw.snps.indels.vcf
Unless I am misreading the output VCF file (the first few lines are pasted below), it seems to contain only a single sample (based on the fact that there is only a single column for ALT, rather than one per sample). I tried to use this file in SNPHYLO, but it fails with the error "There are no SNPs", which seems to confirm this.
What am I doing wrong? Thanks, and apologies for any redundancies.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TU114
223232 3 . T C 450.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=-2.751;DP=245;Dels=0.00;ExcessHet=3.0103;FS=0.966;HaplotypeScore=4.7345;MLEAC=1;MLEAF=0.500;MQ=18.74;MQ0=51;MQRankSum=-1.838;QD=2.12;ReadPosRankSum=-1.234;SOR=0.965 GT:AD:DP:GQ:PL 0/1:163,50:245:99:479,0,1492
223232 19 . C T 529.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.273;DP=250;Dels=0.00;ExcessHet=3.0103;FS=5.648;HaplotypeScore=10.3851;MLEAC=1;MLEAF=0.500;MQ=21.77;MQ0=32;MQRankSum=0.356;QD=2.12;ReadPosRankSum=1.028;SOR=1.757 GT:AD:DP:GQ:PL 0/1:198,52:250:99:558,0,2443
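For anyone wanting to check how many samples a VCF like this actually holds: a VCF has a single ALT column no matter how many samples were called, and the per-sample genotypes live in the columns after FORMAT. Counting those columns in the `#CHROM` header line is a quick test. Below is a minimal Python sketch of that check, not a definitive tool. The function name `vcf_sample_names` is made up for illustration, and it assumes an uncompressed, tab-delimited VCF.

```python
from typing import Iterable, List

# A VCF with genotypes has 9 fixed columns: CHROM, POS, ID, REF, ALT,
# QUAL, FILTER, INFO, FORMAT. Every column after those is one sample.
FIXED_COLUMNS = 9


def vcf_sample_names(lines: Iterable[str]) -> List[str]:
    """Return the sample column names from the #CHROM header line.

    `lines` can be an open file handle or any iterable of VCF lines.
    (Hypothetical helper for illustration; assumes tab-delimited input.)
    """
    for line in lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\r\n").split("\t")
            return fields[FIXED_COLUMNS:]
        if not line.startswith("#"):
            # Reached the variant records without seeing a header line.
            break
    raise ValueError("no #CHROM header line found")


# Small in-memory demo using the header shape from the excerpt above.
demo_header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tTU114\n"
names = vcf_sample_names(["##fileformat=VCFv4.2\n", demo_header])
print(len(names), "sample column(s):", ", ".join(names))
```

To run it on a real file, pass an open handle, e.g. `vcf_sample_names(open("output.raw.snps.indels.vcf"))`. If six BAM files produce a single sample column, one thing worth checking is whether their `@RG` header lines all carry the same `SM` value, since GATK identifies samples by that tag rather than by file name.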