Cohort calling: multiple samples as input

Lisa0508, Ann Arbor, MI (Member)

Hi GATK team,
I have been calling variants individually to generate a gVCF file for each sample. Then I thought I should also try using multiple samples as input, starting from the 'RealignerTargetCreator' step, to generate a joint BAM file containing alignments from different samples (I used a different SM tag for each sample). My command is pasted below. In the 'HaplotypeCaller' step, I got the message 'emitRefConfidence has a bad value' because my BAM file contains reads with different SM tags. I suppose I shouldn't change the SM tags to the same value, or I will not be able to tell which variants come from which individual. I think I misunderstood the meaning of cohort calling from the beginning, so I want to clarify two points to check whether I understand it correctly now.
1. 'HaplotypeCaller' uses only the SM tag to identify different individuals; other tags such as ID, PL, and LB are not considered. If samples from different individuals all have the same SM tag, HC will treat them as one individual. Is that correct?
2. My goal is to find variants that differ between individuals, rather than variants shared by all individuals. So I should call variants one sample at a time and then run 'GenotypeGVCFs' to join all samples together before hard filtering. Am I still misunderstanding something?
Thanks.
The following is the command line I used.
java -Xmx32g -jar $GATK_JARS/GenomeAnalysisTK.jar \
-T RealignerTargetCreator \
-R ucsc.hg19.fasta \
-I aligned_TKSAHB.dedup.sorted.bam \
-I aligned_TKSAHV.dedup.sorted.bam \
-I aligned_TKSASA.dedup.sorted.bam \
-known Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
-known 1000G_phase1.indels.hg19.sites.vcf \
-o target_interval_TKSA.list \
&& java -Xmx32g -jar $GATK_JARS/GenomeAnalysisTK.jar \
-T IndelRealigner \
-R ucsc.hg19.fasta \
-I aligned_TKSAHB.dedup.sorted.bam \
-I aligned_TKSAHV.dedup.sorted.bam \
-I aligned_TKSASA.dedup.sorted.bam \
-targetIntervals target_interval_TKSA.list \
-known Mills_and_1000G_gold_standard.indels.hg19.sites.vcf \
-known 1000G_phase1.indels.hg19.sites.vcf \
-o realigned_TKSA.dedup.sorted.bam
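
For reference, this is roughly how I picture the per-sample workflow from point 2. It is only a dry-run sketch that prints the commands rather than running them. The sample IDs are taken from my BAM file names, and the per-sample realigned BAM names and all output names (`*.g.vcf`, `joint_TKSA.vcf`) are placeholders I made up for illustration. The flags are the GATK 3 ones as I understand them, so please correct me if they are wrong for my version.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the GATK 3 commands instead of executing them.
# Per-sample BAM names and all output names below are hypothetical.
set -euo pipefail

GATK="java -Xmx32g -jar \$GATK_JARS/GenomeAnalysisTK.jar"
REF="ucsc.hg19.fasta"
SAMPLES=(TKSAHB TKSAHV TKSASA)   # IDs taken from the BAM file names

# Step 1: one HaplotypeCaller run per sample, each emitting a gVCF.
per_sample_cmds() {
  local s
  for s in "${SAMPLES[@]}"; do
    echo "$GATK -T HaplotypeCaller -R $REF -I realigned_${s}.dedup.sorted.bam --emitRefConfidence GVCF -o ${s}.g.vcf"
  done
}

# Step 2: a single GenotypeGVCFs run over all per-sample gVCFs.
joint_cmd() {
  local s
  local cmd="$GATK -T GenotypeGVCFs -R $REF"
  for s in "${SAMPLES[@]}"; do
    cmd+=" --variant ${s}.g.vcf"
  done
  echo "$cmd -o joint_TKSA.vcf"
}

per_sample_cmds
joint_cmd
```

The idea is that each BAM passed to 'HaplotypeCaller' in gVCF mode carries a single SM tag, and the samples are only combined at the 'GenotypeGVCFs' step.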
