I have 50 exome samples belonging to 25 families. Do I run GenotypeGVCFs family-wise or on all 50 together?
We have exome sequencing data for 50 samples in total, collected for a cardiac disease. The samples were sequenced in different batches, and some of the batches are 2 years old. Relationship information is available for all 50 samples, which group into 25 families of 2 samples each. The relationship within a family is one of the following: siblings, sisters, brothers, father & son, or mother & daughter. **Currently, I have GVCFs available for all 50 samples.**
As per the article "GATK Tutorial: Variant Callset Evaluation & Filtering", there are two requirements for Variant Quality Score Recalibration (VQSR):
1) At least 30 exome samples, or 1 whole genome sample
2) Known variant databases
Case 1: If I run GenotypeGVCFs family-wise, I won't be able to filter using VQSR and will have to fall back on hard filtering, because each family has only 2 exome samples.
Case 2: If I run GenotypeGVCFs on all 50 samples together, I can filter using VQSR.
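
To make the sample-count arithmetic behind the two cases concrete, here is a minimal Python sketch. It only checks group sizes against the 30-exome figure quoted above; the sample IDs, family IDs, and the `vqsr_feasible` helper are hypothetical placeholders, not part of GATK.

```python
from collections import defaultdict

# Threshold quoted from the tutorial above (exome samples needed for VQSR).
MIN_EXOMES_FOR_VQSR = 30


def vqsr_feasible(n_exomes, min_exomes=MIN_EXOMES_FOR_VQSR):
    """Return True if a call set has enough exome samples for VQSR."""
    return n_exomes >= min_exomes


# Hypothetical sample sheet: 50 samples, 2 per family (S01/S02 -> FAM01, ...).
samples = [(f"S{i:02d}", f"FAM{(i - 1) // 2 + 1:02d}") for i in range(1, 51)]

# Group sample IDs by family.
families = defaultdict(list)
for sample_id, family_id in samples:
    families[family_id].append(sample_id)

# Case 1: one call set per family. Case 2: one call set for the whole cohort.
per_family_ok = {fam: vqsr_feasible(len(members)) for fam, members in families.items()}
cohort_ok = vqsr_feasible(len(samples))

print(f"families: {len(families)}")
print(f"any family large enough for VQSR: {any(per_family_ok.values())}")
print(f"cohort large enough for VQSR: {cohort_ok}")
```

With 2 samples per family, no per-family call set reaches the threshold, while the 50-sample cohort does.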
Should I run GenotypeGVCFs (joint calling) on each family individually or on all 50 samples together?
If I opt for case 2, won't I miss family-specific mutations?