If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our fourm members to be more involved, jump in and help out your fellow researchers with their questions. GATK forum is a community forum and helping each other with using GATK tools and research is the cornerstone of our success as a genomics research community.We appreciate your help!
Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
"GVCFs produced by HaplotypeCaller" vs "VCFs produced by UnifiedGenotyper (EMIT_ALL_SITES option)"
Hi GATK team, i would like to seek opinion from your team to find the best workflow that best fit my data.
Previously i've been exploring both variant calling algorithms UnifiedGenotyper and HaplotypeCaller, and i would love to go for UnifiedGenotyper considering of the sensitivity and the analysis runtime.
Due to my experimental cohort samples grows from time to time, so i've opt for single sample calling follow by joint-analysis using combineVariants instead of doing multiple-samples variant calling. However by doing so, i've experience few drawbacks from it (this issue was discussed at few forums). For a particular SNP loci, we wouldn't know whether the "./." reported for some of the samples are due to no reads covering that particular loci, or it doesn't pass certain criteria during variant calling performed previously, or it is a homo-reference base (which i concern this most and can't cope to lost this information).
Then, i found this "gvcf", and it is potentially to solve my problem (Thanks GATK team for always understand our researcher's need)!!
Again, i'm insist of opt for unifiedGenotyper instead of haplotypeCaller to generate the gvcf, and reading from the forum at https://www.broadinstitute.org/gatk/guide/tagged?tag=gvcf, i would assume that as VCFs produced by "UnifiedGenotyper with --output_mode EMIT_ALL_SITES" to be something alike with the gvcf file produced by HaplotyperCaller. However i couldn't joint them up using either "CombineVariants" or "CombineGVCFs", most probably i think "UnifiedGenotyper with --output_mode EMIT_ALL_SITES" doesn't generate gvcf format.
Can you please give me some advice to BEST fit my need and with minimum runtime (UnifiedGenotyper would be my best choice), is there any method to joint the ALL_SITES vcf file produced by UnifiedGenotyper which i might probably missed out from the GATK page?