Hi,
I have a BAM file with multiple read groups for the same sample. Will the variant calling algorithm (UnifiedGenotyper) treat the BAM file as multi-sample data or as single-sample data (irrespective of read groups) when calling variants?
E.g.
Read Groups in BAM file:
@RG ID:41852 PL:illumina PU:41852 LB:nolib SM:41852p
@RG ID:41852.1 PL:illumina PU:41852 LB:nolib SM:41852s
@RG ID:41853 PL:illumina PU:41853 LB:nolib SM:41853s
@RG ID:41854 PL:illumina PU:41854 LB:nolib SM:41854p
@RG ID:41854.4 PL:illumina PU:41854 LB:nolib SM:41854s
@RG ID:41855 PL:illumina PU:41855 LB:nolib SM:41855p
@RG ID:41855.6 PL:illumina PU:41855 LB:nolib SM:41855s
Variant call in the VCF file:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 41852p 41852s 41853s 41854p 41854s 41855p 41855s
chrM 150 . T C 41679.01 PASS ABHom=0.999;AC=14;AF=1.00;AN=14;BaseQRankSum=1.362;DP=1255;DS;Dels=0.00;FS=0.000;HaplotypeScore=4.6324;MLEAC=14;MLEAF=1.00;MQ=40.73;MQ0=1;MQRankSum=-0.678;OND=3.149e-03;QD=33.21;ReadPosRankSum=1.479;SB=-2.052e+04 GT:AD:DP:GQ:PL 1/1:0,200:200:99:6075,544,0 1/1:0,200:200:99:6315,568,0 1/1:0,200:200:99:7094,599,0 1/1:1,197:198:99:6820,547,0 1/1:0,113:113:99:3624,322,0 1/1:0,200:200:99:7111,599,0 1/1:0,141:141:99:4640,403,0
Regards
Gaurav
Comments
Hi there,
The caller will consider reads that have the same sample name (SM) together, even if they belong to different read groups. But the sample name must be the same. In your example file it seems your sample names are all different, with "p" or "s" differentiating samples that have the same number. If you want those to be taken together, you will need to modify the sample names to lose the p or s.
Geraldine Van der Auwera, PhD
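To check how read groups map onto samples before calling, something like the following Python sketch may help. It is not part of GATK: the function names `read_groups_by_sample` and `strip_p_s_suffix` are made up for illustration, and the header lines are split on whitespace because that is how they appear in the post (real SAM headers are tab-delimited, which `split()` also handles). It only inspects header text and does not modify the BAM file.

```python
import re
from collections import defaultdict


def read_groups_by_sample(header_lines, rename=None):
    """Map each sample name (SM) to the read-group IDs that carry it.

    `rename` is an optional function applied to each SM value first,
    to preview the effect of renaming samples (hypothetical helper).
    """
    groups = defaultdict(list)
    for line in header_lines:
        fields = line.split()
        if not fields or fields[0] != "@RG":
            continue  # skip anything that is not a read-group line
        # Each remaining field has the form TAG:value
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if "ID" in tags and "SM" in tags:
            sample = rename(tags["SM"]) if rename else tags["SM"]
            groups[sample].append(tags["ID"])
    return dict(groups)


def strip_p_s_suffix(sample_name):
    """Drop a trailing 'p' or 's' (assumption based on the names in the post)."""
    return re.sub(r"[ps]$", "", sample_name)


header = [
    "@RG ID:41852 PL:illumina PU:41852 LB:nolib SM:41852p",
    "@RG ID:41852.1 PL:illumina PU:41852 LB:nolib SM:41852s",
    "@RG ID:41853 PL:illumina PU:41853 LB:nolib SM:41853s",
    "@RG ID:41854 PL:illumina PU:41854 LB:nolib SM:41854p",
    "@RG ID:41854.4 PL:illumina PU:41854 LB:nolib SM:41854s",
    "@RG ID:41855 PL:illumina PU:41855 LB:nolib SM:41855p",
    "@RG ID:41855.6 PL:illumina PU:41855 LB:nolib SM:41855s",
]

as_is = read_groups_by_sample(header)
merged = read_groups_by_sample(header, rename=strip_p_s_suffix)

print(len(as_is), "samples with the current SM values")
print(len(merged), "samples if the p/s suffix were dropped")
for sample, ids in merged.items():
    print(sample, ids)
```

With the header from the question, the current SM values give seven distinct samples, which matches the seven genotype columns in the VCF line; dropping the p/s suffix would leave four.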