WES analysis approach
I am analysing WES data from 52 samples of rare Mendelian disorders (mostly trios and small families). During paired-end sequencing, each trio was mostly processed on a single lane, so the father, mother and proband share the same flowcell and lane (C5JYMACXX_2.1_1 and C5JYMACXX_2.1_2 for the father, mother and proband, as it is paired-end). In other words, there are multiple samples on one lane, and the SM is different for all three samples.

Right now I am performing the pre-processing steps (BWA, Picard) individually on each sample. I have gone through a couple of discussions about read groups and base recalibration, but I am pretty confused about the following steps. Please advise.

First of all, if you have multiple samples on a single lane, then they will have the same RGID but a different sample name (SM). So should I process each family individually, i.e. run the GATK RealignerTargetCreator and IndelRealigner on the trio together, followed by BaseRecalibrator, where I would use the -I option to input all the BAM files for each trio (a single BAM file for each trio, i.e. 17 families equals 17 BAM files)? Finally, I would have a multi-sample BAM file, which would be passed on to HaplotypeCaller and VQSR.

Is my approach correct, or am I missing something?
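
To make the read group question concrete, here is a minimal Python sketch of how per-sample @RG strings might be built for three samples that share one lane. It is an illustration under stated assumptions, not a recommended GATK procedure. The SAM specification requires each @RG ID within a header to be unique, so the sketch appends the sample name to the flowcell and lane rather than reusing one RGID for the whole trio. The helper `read_group_line`, the lane value "2", the sample labels and the `lib_<sample>` library names are all hypothetical.

```python
from typing import Optional


def read_group_line(flowcell: str, lane: str, sample: str,
                    library: Optional[str] = None,
                    platform: str = "ILLUMINA") -> str:
    """Build an @RG string with literal backslash-t separators,
    the form usually passed to an aligner's read group option.

    Assumption: ID and PU are both flowcell.lane.sample, which keeps
    them unique when several samples are multiplexed on one lane.
    """
    rg_id = f"{flowcell}.{lane}.{sample}"
    fields = {
        "ID": rg_id,                          # unique per sample and lane
        "SM": sample,                         # sample name, differs per family member
        "LB": library or f"lib_{sample}",     # hypothetical library name
        "PL": platform,
        "PU": rg_id,                          # platform unit (assumed same as ID here)
    }
    return "@RG" + "".join(f"\\t{key}:{value}" for key, value in fields.items())


# Hypothetical trio sharing flowcell C5JYMACXX, lane assumed to be "2".
trio = ["father", "mother", "proband"]
lines = [read_group_line("C5JYMACXX", "2", member) for member in trio]

for line in lines:
    print(line)
```

With distinct ID and SM values per sample, tools that operate per read group can tell the three family members apart even though they came off the same lane.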