Multi-sample SNP calling with UnifiedGenotyper
I've been scouring the forums, but I fear that my question is so basic that I am alone:
I have whole-genome sequences of 6 samples (and so 6 .bam files) of a non-model organism, and I am trying to compare SNPs for downstream population genetics analyses. I attempted this using UnifiedGenotyper (I realize that HaplotypeCaller is better, but UG finished first, while HC has been running for days at the time of this writing). Here is what I entered:
java -jar GenomeAnalysisTK.jar -R reference.fasta -T UnifiedGenotyper -I sample01.bam -I sample02.bam -I sample03.bam -I sample04.bam -I sample05.bam -I sample06.bam -o output.raw.snps.indels.vcf
Unless I am misreading the output VCF file (first few lines are pasted below), it seems to contain only a single sample (based on the fact that there is only a single column for ALT, rather than one per sample). I tried to use this file in SNPHYLO, but it fails with the error "There are no SNPs", which seems to confirm this.
What am I doing wrong? Thanks, and apologies for any redundancies.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TU114
223232 3 . T C 450.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=-2.751;DP=245;Dels=0.00;ExcessHet=3.0103;FS=0.966;HaplotypeScore=4.7345;MLEAC=1;MLEAF=0.500;MQ=18.74;MQ0=51;MQRankSum=-1.838;QD=2.12;ReadPosRankSum=-1.234;SOR=0.965 GT:AD:DP:GQ:PL 0/1:163,50:245:99:479,0,1492
223232 19 . C T 529.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.273;DP=250;Dels=0.00;ExcessHet=3.0103;FS=5.648;HaplotypeScore=10.3851;MLEAC=1;MLEAF=0.500;MQ=21.77;MQ0=32;MQRankSum=0.356;QD=2.12;ReadPosRankSum=1.028;SOR=1.757 GT:AD:DP:GQ:PL 0/1:198,52:250:99:558,0,2443
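For anyone wanting to check the sample count directly: in the VCF format, ALT is always a single column, and each sample gets its own column after FORMAT on the column-header line. Below is a minimal Python sketch that lists those sample columns. The function names `sample_names` and `samples_in_vcf` are made up for illustration and are not part of GATK, and the example header is just the one pasted above.

```python
def sample_names(header_line):
    """Return the sample column names from a VCF column-header line."""
    line = header_line.strip().lstrip("#")
    # Real VCF files are tab-delimited; text pasted into a forum often ends up
    # space-separated, so fall back to splitting on any whitespace.
    fields = line.split("\t") if "\t" in line else line.split()
    if "FORMAT" not in fields:
        return []  # a sites-only VCF has no FORMAT or sample columns
    # Every column after FORMAT corresponds to one sample.
    return fields[fields.index("FORMAT") + 1:]


def samples_in_vcf(path):
    """Find the column-header line in a VCF file and return its sample names."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                return sample_names(line)
    return []


header = "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TU114"
print(sample_names(header))
```

Calling `samples_in_vcf("output.raw.snps.indels.vcf")` on the real file should give the same kind of list. A six-sample call set would show six names there; a single name means the VCF really does hold just one sample.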