Exome sequencing - additional public data for variant calling
I'm new to exome sequencing, so sorry if these questions have really obvious answers.
My data set contains only 3 different samples from mother, father and daughter.
So far I'm following the standard workflow: IndelRealigner -> HaplotypeCaller -> VariantRecalibrator.
Question 1: HaplotypeCaller is the recommended caller. I also tried UnifiedGenotyper, which outputs about 30% more raw variants. Is that expected?
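In case it helps to pin down what "30% more" means, here is a minimal sketch of one way to compare raw record counts between two call sets. It assumes plain-text (uncompressed) VCF input; the function names and the tiny inline records are made up for illustration and are not real calls. It counts VCF records, so a multi-allelic site counts once.

```python
import io


def count_variant_records(handle):
    """Count data lines in a VCF stream, skipping header lines (those starting with '#')."""
    return sum(1 for line in handle if line.strip() and not line.startswith("#"))


def percent_more(count_a, count_b):
    """Return by what percentage count_a exceeds count_b."""
    if count_b == 0:
        raise ValueError("reference count must be non-zero")
    return 100.0 * (count_a - count_b) / count_b


if __name__ == "__main__":
    # Hypothetical stand-ins for the HaplotypeCaller and UnifiedGenotyper outputs.
    hc_vcf = io.StringIO("##fileformat=VCFv4.1\n#CHROM\tPOS\n1\t100\n1\t200\n")
    ug_vcf = io.StringIO("##fileformat=VCFv4.1\n#CHROM\tPOS\n1\t100\n1\t200\n1\t300\n")
    hc_count = count_variant_records(hc_vcf)
    ug_count = count_variant_records(ug_vcf)
    print(f"HC: {hc_count}, UG: {ug_count}, UG excess: {percent_more(ug_count, hc_count):.1f}%")
```

With real data, the `io.StringIO` objects would be replaced by `open("some_file.vcf")` handles for the two raw call sets.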
Question 2: This thread recommends using public data from 1000 Genomes if the sample size is smaller than 30. The available 1000 Genomes Project data sets don't use the Illumina Nextera technology for capture. Is that a problem? Should I look for public data that uses exactly the same approach as ours?
Thanks for your help, I appreciate it! :-)