Standard practice for VCF filtering for the purpose of fingerprinting via proportion IBS?
I was wondering whether I could get some insight into what the standard procedure is for "fingerprinting" sequencing data by proportion identity-by-state (IBS).
To be more precise: the proportion IBS computed from the variants called in two samples depends on the relationship between the individuals from whom the samples were taken. For example, the proportion IBS between two samples taken from the same person should be higher than that between siblings, which in turn should be higher than that between two unrelated people.
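To make the quantity concrete, here is a minimal sketch of one common way to compute proportion IBS for a pair of samples. It assumes biallelic sites with genotypes encoded as alternate-allele dosages (0, 1, 2) and `None` for no-calls; the function name `proportion_ibs` and the encoding are illustrative assumptions, and other definitions of proportion IBS exist.

```python
from typing import Optional, Sequence


def proportion_ibs(a: Sequence[Optional[int]], b: Sequence[Optional[int]]) -> float:
    """Proportion of alleles shared identical-by-state between two samples.

    a, b: per-site alternate-allele dosages (0, 1 or 2), None for a no-call.
    Assumes biallelic sites listed in the same order for both samples.
    """
    shared = 0    # alleles shared IBS, summed over sites
    compared = 0  # alleles compared (2 per site called in both samples)
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue  # skip sites missing in either sample
        shared += 2 - abs(ga - gb)  # 2, 1 or 0 alleles shared at this site
        compared += 2
    if compared == 0:
        raise ValueError("no sites called in both samples")
    return shared / compared
```

Under this definition, two identical genotype vectors give 1.0, and opposite homozygotes at every site give 0.0. Sites where either sample is a no-call do not count toward the denominator.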
It should therefore be possible to classify a pair of samples whose relationship is unknown into one of the three aforementioned bins (self-self, self-sibling, self-unrelated). My attempt to do this with targeted amplicon sequencing data from about 400 samples yields fairly good separation with nothing more than a MAF filter at 0.05, but there is still a lot of overlap (see attached image):
My question is whether there is a battery of filters to be applied to the VCFs that is generally accepted to be good practice for such a use.
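For concreteness, the MAF filter mentioned above amounts to roughly the following sketch. It assumes biallelic sites, genotypes held as alternate-allele dosages (0, 1, 2, or `None` for no-calls) keyed by site, and that sites at or above the threshold are the ones kept; the names `minor_allele_frequency` and `filter_by_maf` are hypothetical and not part of GATK or any other tool.

```python
from typing import Dict, List, Optional


def minor_allele_frequency(dosages: List[Optional[int]]) -> float:
    """MAF at one biallelic site from alt-allele dosages (None = no-call)."""
    called = [g for g in dosages if g is not None]
    if not called:
        return 0.0  # no calls at this site: treat as uninformative
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def filter_by_maf(
    site_dosages: Dict[str, List[Optional[int]]], threshold: float = 0.05
) -> Dict[str, List[Optional[int]]]:
    """Keep only sites whose MAF is at least `threshold` (assumed direction)."""
    return {
        site: dosages
        for site, dosages in site_dosages.items()
        if minor_allele_frequency(dosages) >= threshold
    }
```

Note that the MAF here is estimated from the cohort itself, so with about 400 samples a 0.05 threshold corresponds to roughly 40 minor alleles observed across the cohort.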