Exome sequencing - additional public data for variant calling
I'm new to exome sequencing, sorry if the questions have really obvious answers.
My data set contains only 3 samples: mother, father and daughter.
So far I'm following the standard workflow: IndelRealigner -> HaplotypeCaller -> VariantRecalibrator.
Question 1: HaplotypeCaller is the recommended caller. I also tried UnifiedGenotyper, which outputs about 30% more raw variants on my data. Is that expected?
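For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how a "percent more raw variants" figure can be derived by counting data records in two VCFs. The function names and the toy records are made up for illustration; this is not necessarily how the number above was obtained, and it uses no GATK tooling.

```python
def count_variant_records(vcf_lines):
    """Count data records: every non-empty line that is not a '#' header line."""
    return sum(1 for line in vcf_lines if line.strip() and not line.startswith("#"))


def percent_more(count_a, count_b):
    """Return by how many percent count_a exceeds count_b."""
    if count_b == 0:
        raise ValueError("baseline count must be non-zero")
    return 100.0 * (count_a - count_b) / count_b


# Minimal header shared by the toy VCFs (hypothetical, not real data).
HEADER = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]


def make_toy_vcf(n_records):
    """Build an in-memory toy VCF with n_records made-up SNP records."""
    records = [f"1\t{100 + i}\t.\tA\tG\t50\t.\t." for i in range(n_records)]
    return HEADER + records


# Stand-ins for the raw call sets of the two callers.
ug_count = count_variant_records(make_toy_vcf(13))
hc_count = count_variant_records(make_toy_vcf(10))
print(ug_count, hc_count, percent_more(ug_count, hc_count))
```

For real uncompressed files, an open file handle can be passed in place of the list, e.g. `count_variant_records(open(path))`. Note that this counts records rather than alleles, so multi-allelic sites count once.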
Question 2: This thread recommends adding public data from 1000 Genomes if there are fewer than 30 samples. The available 1000GP data sets weren't captured with the Illumina Nextera technology we used. Is that a problem? Should I look for public data generated with exactly the same approach as ours?
Thanks for your help, I appreciate it! :-)