WES analysis approach
I am analysing WES data from 52 samples of rare Mendelian disorders (mostly trios and small families). For the paired-end sequencing, each trio was in most cases run on a single lane, so the father, mother and proband share the same flowcell ID and lane (files C5JYMACXX_2.1_1 and C5JYMACXX_2.1_2 for each of father, mother and proband, since the data are paired-end). In other words, there are multiple samples on one lane, and the SM tag is different for each of the three samples.

Right now I am running the pre-processing steps (BWA, Picard) individually on each sample. I have read a couple of discussions about read groups and base recalibration, but I am still confused about the following steps.

First, if multiple samples are on a single lane, they will have the same RGID but a different SM. So should I process each family separately, i.e. run the GATK RealignerTargetCreator and IndelRealigner on each trio together, followed by BaseRecalibrator, where I use the -I option to pass in all the BAM files of the trio (giving a single BAM file per trio, i.e. 17 families = 17 BAM files)? Finally, these multi-sample BAM files would be passed on to HaplotypeCaller and VQSR.

Is this approach correct, or am I missing something?
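To make the read group setup concrete, here is a minimal Python sketch of the @RG header lines for one trio sharing a lane. It rests on several assumptions that are not confirmed above: that the lane is 2 (read off the file name C5JYMACXX_2.1_1), that the ID is made unique per sample by appending the sample name (whether the ID may be shared across a lane is part of the question), and that the sample names, the LB values and the helper `read_group` are purely hypothetical placeholders.

```python
def read_group(flowcell, lane, sample, library=None, platform="ILLUMINA"):
    """Build a SAM @RG header line for one sample on one flowcell lane.

    Assumption: ID is flowcell.lane.sample so it is unique per sample,
    while PU (flowcell.lane) is shared by all samples on the lane.
    """
    fields = [
        ("ID", f"{flowcell}.{lane}.{sample}"),
        ("PU", f"{flowcell}.{lane}"),
        ("SM", sample),
        ("LB", library or f"lib_{sample}"),  # placeholder library name
        ("PL", platform),
    ]
    return "@RG\t" + "\t".join(f"{key}:{value}" for key, value in fields)


# Hypothetical sample names for one trio on flowcell C5JYMACXX, lane 2
trio = ["father", "mother", "proband"]
headers = [read_group("C5JYMACXX", 2, sample) for sample in trio]

for header in headers:
    print(header)
```

Each resulting string could then be supplied when the reads are aligned or when read groups are added afterwards, one string per sample.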