Multi-sample SNP calling with UnifiedGenotyper
I've been scouring the forums, but I fear that my question is so basic that I am alone:
I have whole-genome sequences of 6 samples (and so 6 .bam files) of a non-model organism, and I am trying to compare SNPs for downstream population genetics analyses. I attempted this using UnifiedGenotyper (I realize that HaplotypeCaller is better, but UG finished first, while HC has been running for days at the time of this writing). Here is what I entered:
java -jar GenomeAnalysisTK.jar -R reference.fasta -T UnifiedGenotyper -I sample01.bam -I sample02.bam -I sample03.bam -I sample04.bam -I sample05.bam -I sample06.bam -o output.raw.snps.indels.vcf
Unless I am misreading the output VCF file (the first few lines are pasted below), it seems to contain only a single sample (based on the fact that there is only a single column for ALT, rather than one per sample). I tried to use this file in SNPHYLO, but it fails with the error "There are no SNPs", which seems to confirm this.
What am I doing wrong? Thanks, and apologies for any redundancies.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TU114
223232 3 . T C 450.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=-2.751;DP=245;Dels=0.00;ExcessHet=3.0103;FS=0.966;HaplotypeScore=4.7345;MLEAC=1;MLEAF=0.500;MQ=18.74;MQ0=51;MQRankSum=-1.838;QD=2.12;ReadPosRankSum=-1.234;SOR=0.965 GT:AD:DP:GQ:PL 0/1:163,50:245:99:479,0,1492
223232 19 . C T 529.77 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.273;DP=250;Dels=0.00;ExcessHet=3.0103;FS=5.648;HaplotypeScore=10.3851;MLEAC=1;MLEAF=0.500;MQ=21.77;MQ0=32;MQRankSum=0.356;QD=2.12;ReadPosRankSum=1.028;SOR=1.757 GT:AD:DP:GQ:PL 0/1:198,52:250:99:558,0,2443
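For anyone checking the same thing: the number of samples in a VCF shows up on the `#CHROM` header line, not in the ALT column. ALT is a single column per site, and each sample gets its own column after FORMAT (here only `TU114`). GATK takes sample names from the `SM` field of each BAM's `@RG` header line, so BAMs that share one `SM` value are reported as one sample. The following is a rough sketch for inspecting both, not a definitive diagnosis. It assumes the real VCF is tab-delimited (the excerpt above is shown with spaces) and that `samtools` is installed; the helper names `list_vcf_samples` and `list_rg_samples` are made up for illustration.

```shell
# Print the sample names of a VCF: on the #CHROM header line, every
# tab-separated column from the 10th onward (after FORMAT) is one sample.
list_vcf_samples() {
    grep -m1 '^#CHROM' "$1" | cut -f10- | tr '\t' '\n'
}

# Print the distinct SM (sample) values found on the @RG lines of a
# SAM/BAM header read from stdin.
list_rg_samples() {
    grep '^@RG' | tr '\t' '\n' | grep '^SM:' | cut -c4- | sort -u
}

# Usage with the file names from the command above; each step is skipped
# if the file or samtools is not present.
vcf=output.raw.snps.indels.vcf
if [ -f "$vcf" ]; then
    list_vcf_samples "$vcf"
fi
if command -v samtools >/dev/null 2>&1; then
    for bam in sample0[1-6].bam; do
        [ -f "$bam" ] || continue
        printf '%s\t' "$bam"
        samtools view -H "$bam" | list_rg_samples | paste -sd, -
    done
fi
```

If every BAM prints the same `SM` value, that would be consistent with the single sample column in the VCF.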