HaplotypeCaller with a very low/inconsistent coverage BAM file
I recently had to download some SOLiD datasets (whole genome, non-human) from a paper published in 2010 and perform SNP calling on those reads. I roughly followed the Best Practices recommendations and ran HaplotypeCaller to call variants after BQSR. The problem is that the number of variants called was far lower than the number reported in the original 2010 paper, and this is before the VariantFiltration step. The original paper used a very relaxed method to detect variants: a position was called polymorphic if 3 independent reads all had the same non-reference nucleotide at that position. Even so, the authors stated that over 95% of the SNPs they reported were true positives. So my question is: which thresholds can I lower so that HaplotypeCaller calls more variants? I have already lowered both -stand_emit_conf and -stand_call_conf to 10; would you recommend going even lower? Thanks!
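For reference, here is a minimal sketch of the paper's calling rule as I understand it, in case it helps to compare it against what HaplotypeCaller does. This is not the paper's actual code. The function name `naive_snp_call`, the input layout, and especially the reading of "independent" as "distinct alignment start and strand" are my own assumptions.

```python
from collections import defaultdict

VALID_BASES = {"A", "C", "G", "T"}


def naive_snp_call(ref_base, observations, min_reads=3):
    """Return the non-reference bases supported by at least `min_reads` independent reads.

    `observations` is an iterable of (read_start, strand, base) tuples, one per
    read covering the position. ASSUMPTION: two reads count as independent only
    if they differ in alignment start or strand; the paper may define it differently.
    """
    ref = ref_base.upper()
    support = defaultdict(set)  # alt base -> set of distinct (start, strand) pairs
    for read_start, strand, base in observations:
        base = base.upper()
        if base == ref or base not in VALID_BASES:
            continue  # skip reference matches and ambiguous calls such as N
        support[base].add((read_start, strand))
    return sorted(b for b, reads in support.items() if len(reads) >= min_reads)


# Hypothetical pileup at one position with reference base A
pileup = [(100, "+", "G"), (105, "+", "G"), (110, "-", "G"), (120, "+", "A")]
print(naive_snp_call("A", pileup))
```

Note that this rule ignores base and mapping qualities entirely, which is presumably part of why it yields so many more calls than a genotype-likelihood-based caller.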