A mismatch between the BAM file and the final SNP genotype call
We performed whole-exome sequencing on 7 individuals from the same family (both parents and 5 children). The company that performed the sequencing used GATK for the analysis.
In the analysis we looked for risk-factor variants unique to the affected children. We found some SNPs that were present in some of the children but missing in the parents. In other words, the parents were homozygous for a certain SNP (AA) and some of the children were heterozygous for the same SNP (AC). We thought this might be caused by a de novo change.
To check this, we examined the BAM files at the SNP location for the parents and the children and found:
Mother: number of reads A-73 (76.8%); C-22 (23.2%). The genotype called by the software was AA.
Father: number of reads A-76 (81.7%); C-17 (18.3%). The genotype called by the software was AA.
Child 1: number of reads A-56 (76.7%); C-17 (23.3%). The genotype called by the software was AC.
Child 2: number of reads A-45 (68.2%); C-21 (31.8%). The genotype called by the software was AA.
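For reference, the percentages above can be recomputed from the raw read counts with a short script. This is only a sketch: the dictionary layout and the function name `allele_percentages` are my own choices for illustration, not anything produced by GATK, and the script only reproduces the arithmetic, not the genotype call itself.

```python
# Read counts at the SNP position, copied from the BAM inspection above.
read_counts = {
    "Mother": {"A": 73, "C": 22},
    "Father": {"A": 76, "C": 17},
    "Child 1": {"A": 56, "C": 17},
    "Child 2": {"A": 45, "C": 21},
}


def allele_percentages(counts):
    """Return each allele's share of the total reads, in percent (1 decimal)."""
    total = sum(counts.values())
    return {allele: round(100 * n / total, 1) for allele, n in counts.items()}


for sample, counts in read_counts.items():
    # Print the sample name, total depth, and per-allele percentages.
    print(sample, sum(counts.values()), allele_percentages(counts))
```

Run this way, the mother and child 1 come out with almost identical allele fractions, which is exactly what puzzles me about their different calls.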
I would appreciate it if someone could explain:
What parameters affect and determine the final genotype call for the SNP?
Why is the final genotype different when the distribution of reads is nearly the same for all members of the family (especially the mother and child 1)?
What causes the discrepancy?
Thank you, from a beginner in this interesting area.